NumPy: numpy.unique() function
numpy.unique() function
The numpy.unique() function finds the unique elements of an array and returns them in sorted order. In addition to the unique elements, it offers three optional outputs:
- the indices of the input array that give the unique values
- the indices of the unique array that reconstruct the input array
- the number of times each unique value comes up in the input array
This function is useful when we need to identify the unique values of an array and use them for further processing, such as counting occurrences or filtering out duplicates. It is commonly used in data analysis and machine learning applications, as well as in general programming tasks.
Syntax:
numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)
Parameters:
Name | Description | Required / Optional |
---|---|---|
ar | The input array. If axis is not specified, this array will be flattened. | Required |
return_index | If True, also return the indices of ar that result in the unique array. | Optional |
return_inverse | If True, also return the indices to reconstruct ar from the unique array. | Optional |
return_counts | If True, also return the number of times each unique item appears in ar. | Optional |
axis | The axis to operate on. If None, ar will be flattened. For integer axis, the subarrays indexed by the given axis will be treated as elements. | Optional |
Return value:
- unique: The sorted unique elements of the array.
- unique_indices (Optional): Indices of the first occurrences of the unique values in the original array. Returned if return_index is True.
- unique_inverse (Optional): Indices to reconstruct the original array from the unique array. Returned if return_inverse is True.
- unique_counts (Optional): The count of each unique value in the original array. Returned if return_counts is True.
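As a small illustration of the return values listed above, the sketch below requests all three optional outputs in one call. The array `ar` and the variable names are made up for this example.

```python
import numpy as np

ar = np.array([3, 1, 2, 3, 1, 3])

# Request every optional output in a single call. The outputs come back in
# this order: unique values, first-occurrence indices, inverse, counts.
values, first_idx, inverse, counts = np.unique(
    ar, return_index=True, return_inverse=True, return_counts=True
)

print(values)     # [1 2 3]
print(first_idx)  # [1 2 0]
print(inverse)    # [2 0 1 2 0 2]
print(counts)     # [2 1 3]
```

Only the outputs whose flag is set to True are returned, so a call with a single flag yields a two-element tuple.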
Use Cases:
- Data Analysis: Identifying unique values in datasets.
- Preprocessing: Removing duplicates before further data processing.
- Statistics: Counting occurrences of each unique element.
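The counting use case can be sketched as follows. The `labels` array is invented sample data, and turning the result into a dictionary is just one convenient way to present it.

```python
import numpy as np

labels = np.array(["cat", "dog", "cat", "bird", "dog", "cat"])

values, counts = np.unique(labels, return_counts=True)

# Pair each unique value with its count to get a simple frequency table.
frequency = {str(v): int(c) for v, c in zip(values, counts)}
print(frequency)  # {'bird': 1, 'cat': 3, 'dog': 2}

# The most frequent value sits at the position of the largest count.
most_common = str(values[np.argmax(counts)])
print(most_common)  # cat
```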
Example: Finding unique elements in a numpy array
>>> import numpy as np
>>> np.unique([0,1,2,0,2,3,4,3,0,4])
array([0, 1, 2, 3, 4])
In the above code, the input array is [0,1,2,0,2,3,4,3,0,4]. The numpy.unique() function returns a sorted array of unique elements present in the input array, in ascending order.
Example: Finding unique elements in a 2D array using numpy.unique()
>>> import numpy as np
>>> x = np.array([[1, 1], [2,3], [3,4]])
>>> np.unique(x)
array([1, 2, 3, 4])
In the above code we first create a 2D array 'x' using the np.array() method with the values [[1, 1], [2, 3], [3, 4]]. Then, the np.unique() function is used to find the unique elements in the array 'x'.
Because no axis is specified, np.unique() flattens the 2D array 'x' before looking for unique values. Therefore, the output is a 1D array of the unique elements [1, 2, 3, 4].
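To contrast the default flattening with the axis parameter from the table above, here is a minimal sketch. The array below is a variation of the example data with two repeated rows added for illustration.

```python
import numpy as np

x = np.array([[1, 1], [2, 3], [1, 1], [2, 3], [3, 4]])

flat_unique = np.unique(x)         # default: flatten, then deduplicate
row_unique = np.unique(x, axis=0)  # treat each row as one element

print(flat_unique)  # [1 2 3 4]
print(row_unique)
# [[1 1]
#  [2 3]
#  [3 4]]
```

With axis=0 the result keeps the 2D shape and contains each distinct row once.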
Example: Finding unique elements and their indices in a numpy array
>>> import numpy as np
>>> x = np.array(['o', 'p', 'y', 't', 'h', 'o', 'p'])
>>> u, indices = np.unique(x, return_index=True)
>>> u
array(['h', 'o', 'p', 't', 'y'], dtype='<U1')
>>> indices
array([4, 0, 1, 3, 2])
>>> x[indices]
array(['h', 'o', 'p', 't', 'y'], dtype='<U1')
The above code defines a numpy array x consisting of the string values ['o', 'p', 'y', 't', 'h', 'o', 'p']. The np.unique() function is called on the array x with the parameter return_index=True to get unique elements and their indices. The returned values are assigned to u and indices respectively. The u array consists of the unique elements of the x array in sorted order, i.e. ['h', 'o', 'p', 't', 'y'].
The indices array consists of the indices of the first occurrences of these unique elements in the x array, i.e. [4, 0, 1, 3, 2]. Finally, the code uses the indices array to retrieve the original elements in the same order as in the u array, i.e. ['h', 'o', 'p', 't', 'y'], by accessing the corresponding elements in the x array.
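One possible use of the first-occurrence indices is removing duplicates while keeping the order in which values first appear, rather than sorted order. The sketch below reuses the same array; the name `in_order` is chosen only for this example.

```python
import numpy as np

x = np.array(['o', 'p', 'y', 't', 'h', 'o', 'p'])

_, first_idx = np.unique(x, return_index=True)

# Sorting the first-occurrence indices restores the order in which
# the values first appear in x, instead of alphabetical order.
in_order = x[np.sort(first_idx)]
print(in_order)  # ['o' 'p' 'y' 't' 'h']
```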
Example: Finding unique values and their indices using numpy.unique() function with return_inverse parameter
>>> import numpy as np
>>> x = np.array([0, 1, 2, 5, 2, 6, 5, 2, 3, 1])
>>> u, indices = np.unique(x, return_inverse=True)
>>> u
array([0, 1, 2, 3, 5, 6])
In the above code the np.unique() function takes an array as input and returns an array of unique elements in that array. In this case, the function takes the x array as input and returns an array u that contains the unique elements in x.
Because return_inverse is set to True, the function also returns a second array, stored here in indices. It holds, for each element of x, the position of that element's value in u, so it can be used to reconstruct the original array from the unique values.
Example: Mapping an array using numpy.unique()
>>> import numpy as np
>>> x = np.array([0, 1, 2, 5, 2, 6, 5, 2, 3, 1])
>>> u, indices = np.unique(x, return_inverse=True)
>>> indices
array([0, 1, 2, 4, 2, 5, 4, 2, 3, 1])
>>> u[indices]
array([0, 1, 2, 5, 2, 6, 5, 2, 3, 1])
In the above code:
- The array 'x' has 10 elements. Calling numpy.unique() on it with 'return_inverse=True' returns two outputs: the sorted unique values 'u' and an array 'indices'.
- The 'indices' array has the same length as 'x'. Each entry is the position in 'u' of the value found at that position in 'x'. For example, x[3] is 5, which sits at position 4 in 'u', so indices[3] is 4.
- The expression 'u[indices]' looks up each of those positions in 'u', which rebuilds the original array 'x'.
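The same mapping idea is often applied to turn text labels into integer codes. The following is a minimal sketch with invented sample data; the names `classes`, `codes`, and `decoded` are illustrative only.

```python
import numpy as np

colors = np.array(["red", "green", "blue", "green", "red"])

# classes holds the sorted unique labels; codes holds, for every element
# of colors, the position of its label inside classes.
classes, codes = np.unique(colors, return_inverse=True)

print(classes)  # ['blue' 'green' 'red']
print(codes)    # [2 1 0 1 2]

# Indexing classes with codes decodes back to the original labels.
decoded = classes[codes]
print(np.array_equal(decoded, colors))  # True
```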
Example: Finding unique values in an array using numpy.unique() with return_inverse=True
>>> import numpy as np
>>> x = np.array([1,2,5,3,4,2,3,2,5,4])
>>> u, indices = np.unique(x, return_inverse=True)
>>> u
array([1, 2, 3, 4, 5])
The above code uses the NumPy function np.unique() to find the unique values in a NumPy array 'x'. The function is called with the optional argument 'return_inverse=True', so it also returns an array giving, for each element of 'x', the index of its value in the unique array.
The unique values are returned as an array 'u', which is sorted in ascending order. In this case, the unique values in 'x' are [1, 2, 3, 4, 5]. The array 'indices' is also returned, which represents the original array 'x' in terms of the unique values in 'u'.
Example: Mapping original values using unique and inverse index in numpy
>>> import numpy as np
>>> x = np.array([1,2,5,3,4,2,3,2,5,4])
>>> u, indices = np.unique(x, return_inverse=True)
>>> indices
array([0, 1, 4, 2, 3, 1, 2, 1, 4, 3])
>>> u[indices]
array([1, 2, 5, 3, 4, 2, 3, 2, 5, 4])
In the above code, first a NumPy array x is created with some repeated values. Then, the np.unique function is used to find the unique elements in the array x and the indices that reconstruct the original array from unique values. The returned unique values are stored in the u variable and the indices that reconstruct the original array are stored in the indices variable.
The indices variable has one entry per element of x: each entry is the position in u of the value found at that position in x. The expression u[indices] therefore looks up every element's unique value in turn.
The resulting array is identical to the original array x, rebuilt entirely from the unique values and the inverse indices.
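Beyond reconstruction, the inverse indices can act as group labels. As a hedged sketch, the code below pairs the same keys with a made-up `amounts` array and uses np.bincount to total the amounts per unique key; this is one common pattern, not the only way to aggregate.

```python
import numpy as np

keys = np.array([1, 2, 5, 3, 4, 2, 3, 2, 5, 4])
# Hypothetical values, one per key, invented for this example.
amounts = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0])

u, inverse = np.unique(keys, return_inverse=True)

# np.bincount adds up the weights that share the same inverse index,
# giving one total per unique key, in the same order as u.
totals = np.bincount(inverse, weights=amounts)

print(u)       # [1 2 3 4 5]
print(totals)  # [ 10. 160. 110. 150. 120.]
```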
Frequently Asked Questions (FAQ) - numpy.unique() Function
1. What does the numpy.unique() function do?
The numpy.unique() function returns the unique elements of an array, sorted in ascending order.
2. Can numpy.unique() return additional information besides unique elements?
Yes, besides unique elements, numpy.unique() can also return indices of the input array corresponding to the unique values, indices to reconstruct the original array from the unique array, and the count of each unique element.
3. Is the output of numpy.unique() always sorted?
Yes, the output of numpy.unique() is always sorted in ascending order.
4. How can numpy.unique() be helpful in data analysis?
In data analysis, numpy.unique() is useful for identifying unique values within datasets, removing duplicates, and performing various statistical analyses.
5. Can numpy.unique() handle multi-dimensional arrays?
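Yes, numpy.unique() can handle multi-dimensional arrays. By default it flattens them before determining unique elements. If the axis parameter is given, it instead treats the subarrays along that axis (for example, whole rows) as the elements to compare.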
6. Are there any limitations to the data types that numpy.unique() can process?
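numpy.unique() can process arrays whose elements can be compared and sorted, including integers, floats, and strings. Because the function relies on sorting, object arrays that mix values with no defined ordering (for example, numbers and strings) can raise a TypeError.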
7. How does numpy.unique() handle NaN (Not a Number) values?
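numpy.unique() keeps NaN values in its output. The exact result depends on the installed NumPy version: older releases returned every NaN as a separate element, while recent releases collapse repeated NaNs into a single entry by default (controlled by the equal_nan parameter, where available).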
8. Can numpy.unique() be used to remove duplicates from an array?
Yes, numpy.unique() can be used to remove duplicates from an array by extracting only the unique elements.
9. In what scenarios is numpy.unique() commonly used?
numpy.unique() is commonly used in data preprocessing, statistical analysis, machine learning, and general programming tasks where identifying unique elements is required.
10. Does numpy.unique() modify the original array?
No, numpy.unique() does not modify the original array. It returns a new array containing the unique elements.
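The answers to questions 7 and 10 can be checked with a short sketch. Since NaN handling differs between NumPy versions, this example filters NaNs out before calling np.unique(), which is an approach chosen here for predictability rather than something the function requires. The `data` array is invented.

```python
import numpy as np

data = np.array([2.0, np.nan, 1.0, np.nan, 2.0])
original = data.copy()

# Drop NaNs first so the result does not depend on how the installed
# NumPy version treats repeated NaN values.
clean_unique = np.unique(data[~np.isnan(data)])
print(clean_unique)  # [1. 2.]

# np.unique returns a new array; the input is left as it was.
unchanged = np.array_equal(data, original, equal_nan=True)
print(unchanged)  # True
```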