Introduction
- NumPy stands for Numerical Python. It represents data as arrays of numbers that the computer can analyze to find patterns
- We use NumPy because it's fast: its core operations are implemented in optimized C in the background
Data Types and Attributes
- ndarray: n-dimensional array
- .ndim: number of dimensions
- .dtype: data type of the elements
- .shape: the size of the array along each dimension, e.g. (rows, columns) for a 2-D array
- .size: total number of elements in the array
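As a quick illustration of the attributes above, here is a minimal sketch. The array name a2 and its values are made up for the example.

```python
import numpy as np

# A small 2-D array with 2 rows and 3 columns (values are arbitrary).
a2 = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]])

print(type(a2))   # <class 'numpy.ndarray'>
print(a2.ndim)    # 2
print(a2.dtype)   # float64
print(a2.shape)   # (2, 3)
print(a2.size)    # 6
```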
Creating arrays
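- Tip: pressing Shift + Tab in a Jupyter notebook shows the documentation for the function under the cursor
- np.ones(shape): array in the shape specified filled with ones
- np.zeros(shape): array in the shape specified filled with zeros
- np.arange(start, stop, step): evenly spaced values from start up to (but not including) stop
- np.random.randint(low, high, shape): random integers from low (inclusive) to high (exclusive) in the shape specified
- np.random.random(shape): random floats between 0 (inclusive) and 1 (exclusive) in the shape specified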
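The creation functions above might be used like this. The shapes and ranges are arbitrary choices for the sketch, and the random arrays will differ from run to run.

```python
import numpy as np

ones = np.ones((2, 3))        # 2x3 array filled with 1.0
zeros = np.zeros((2, 3))      # 2x3 array filled with 0.0
evens = np.arange(0, 10, 2)   # array([0, 2, 4, 6, 8])

# Random integers in [0, 10) and random floats in [0.0, 1.0), both 3x5.
rand_ints = np.random.randint(0, 10, size=(3, 5))
rand_floats = np.random.random((3, 5))

print(ones.shape, zeros.shape)  # (2, 3) (2, 3)
print(evens)                    # [0 2 4 6 8]
```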
Random Seed
- Note: random numbers are pseudo-random
- np.random.seed(seed=num): seeds the random number generator so the same "random" values are generated on every run, which makes results reproducible
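A small sketch of what reproducibility means here. The seed value 42 is an arbitrary choice.

```python
import numpy as np

np.random.seed(seed=42)
first = np.random.randint(0, 10, size=5)

np.random.seed(seed=42)   # reset to the same seed
second = np.random.randint(0, 10, size=5)

# Same seed, same sequence of pseudo-random numbers.
print(np.array_equal(first, second))  # True
```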
Viewing arrays and matrices
- np.unique(array): returns the sorted unique values of the array
- You can index and slice arrays much like you would Python lists
- Tip: the last number in the shape is the length of the inner-most arrays
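The points above can be sketched as follows. The arrays a and b are made up for the example.

```python
import numpy as np

a = np.array([[1, 2, 2],
              [3, 3, 4]])

uniq = np.unique(a)    # array([1, 2, 3, 4])
first_row = a[0]       # array([1, 2, 2]), indexed like a list
item = a[1, 2]         # 4 (row 1, column 2)
sub = a[:, :2]         # first two columns of every row

# The last number in the shape is the length of the inner-most arrays.
b = np.zeros((2, 3, 4))
inner_len = len(b[0][0])
print(b.shape[-1], inner_len)  # 4 4
```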
Manipulating & comparing arrays
- arr1 + arr2 or np.add(arr1, arr2): adds the two arrays together element by element
- Most standard arithmetic is compatible with arrays as well
- Important note: Not all shapes are compatible for arithmetic though so keep that in mind!
- Broadcasting is how NumPy handles arithmetic between arrays of different shapes
- Shapes are compared starting from the last dimension. Two dimensions are compatible if:
  - they are equal to each other
  - one of the numbers is 1
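To make the compatibility rules concrete, here is a minimal sketch. The arrays and the flag name incompatible are invented for the example.

```python
import numpy as np

a = np.ones((2, 3))
row = np.array([10, 20, 30])   # shape (3,)
col = np.array([[1], [2]])     # shape (2, 1)

s1 = a + row   # (2, 3) with (3,): last dimensions are equal
s2 = a + col   # (2, 3) with (2, 1): the 1 is stretched to 3

# (2, 3) with (2, 2): last dimensions are 3 and 2, neither is 1.
incompatible = False
try:
    a + np.ones((2, 2))
except ValueError:
    incompatible = True

print(s1.shape, s2.shape, incompatible)  # (2, 3) (2, 3) True
```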
Aggregation
- Aggregation is performing the same operation on a number of objects to produce a summary value
- sum(arr1) vs np.sum(arr1): both work on a 1-D array, but np.sum is typically much faster on large NumPy arrays
- Use the Python functions on Python datatypes and the NumPy functions on NumPy arrays
- np.mean(): mean (average)
- np.max(): maximum value
- np.min(): minimum value
- np.std(): standard deviation
- Standard deviation: measure of how spread out a group of numbers is from the mean (square root of the variance)
- np.var(): variance
- Variance: the average of the squared differences from the mean (i.e. higher variance = wider range of numbers)
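A short sketch of the aggregation functions. The values in arr are chosen only so the results come out as round numbers.

```python
import numpy as np

arr = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

py_total = sum([1, 2, 3])   # Python sum on a Python list
np_total = np.sum(arr)      # NumPy sum on a NumPy array

print(np.mean(arr))  # 5.0
print(np.max(arr))   # 9.0
print(np.min(arr))   # 2.0
print(np.var(arr))   # 4.0, the mean of the squared differences from 5.0
print(np.std(arr))   # 2.0, the square root of the variance
```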
Reshaping & transposing arrays
- .reshape(shape): reshapes the array to the desired shape (the total number of elements must stay the same)
- .T: transposes the array, meaning that it swaps the axes
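For instance, a minimal sketch with a made-up 6-element array:

```python
import numpy as np

a = np.arange(6)        # [0 1 2 3 4 5], shape (6,)
m = a.reshape((2, 3))   # [[0 1 2], [3 4 5]], shape (2, 3)
t = m.T                 # [[0 3], [1 4], [2 5]], shape (3, 2)

print(m.shape, t.shape)  # (2, 3) (3, 2)
```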
Dot product
- np.dot(arr1, arr2): dot product; for 2-D arrays this is matrix multiplication
- Requirement: the inside numbers of the two shapes have to be equal, e.g. (2, 3) and (3, 2)
- Result: produces a matrix that's the shape of the outside numbers, e.g. (2, 2)
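The shape rule can be checked with a small sketch. The matrices a and b are arbitrary.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])    # shape (2, 3)
b = np.array([[1, 2],
              [3, 4],
              [5, 6]])       # shape (3, 2)

# Inside numbers match (3 and 3), so the result has the outside shape (2, 2).
c = np.dot(a, b)
print(c)
# [[22 28]
#  [49 64]]
```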
Sorting arrays
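- np.sort(arr): sorts the values along the last axis by default (for a 2-D array, each row is sorted)
- np.argsort(arr): returns the indices that would sort the array
- np.argmin(arr, axis), np.argmax(arr, axis): return the index of the minimum or maximum value along the specified axis
- Axes can feel “flipped”:
  - axis=0 moves down the rows, so it gives one result per column
  - axis=1 moves across the columns, so it gives one result per row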
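A minimal sketch of these functions on a made-up 2x3 array:

```python
import numpy as np

a = np.array([[3, 7, 2],
              [9, 1, 8]])

print(np.sort(a))       # [[2 3 7], [1 8 9]], each row sorted
print(np.argsort(a))    # [[2 0 1], [1 2 0]], indices that sort each row

# axis=0 compares values down each column: one index per column.
col_max = np.argmax(a, axis=0)   # [1 0 1]
# axis=1 compares values across each row: one index per row.
row_max = np.argmax(a, axis=1)   # [1 0]
row_min = np.argmin(a, axis=1)   # [2 1]
```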
Practical example: turn an image into a NumPy array
Other methods & functions
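- np.linspace(start, stop, num): returns num evenly spaced numbers over the specified interval (stop is included by default)
- np.random.randn(d0, d1, ...): creates an array of the given dimensions with values drawn from the standard normal distribution (mean 0, standard deviation 1)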
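A short sketch of both functions. The interval, the sample count, and the seed are arbitrary choices for the example.

```python
import numpy as np

# Five evenly spaced points from 0 to 1, endpoints included.
pts = np.linspace(0, 1, 5)
print(pts)  # [0.   0.25 0.5  0.75 1.  ]

# 1000 samples from the standard normal distribution.
# Note that randn takes the dimensions as separate arguments, not a tuple.
np.random.seed(0)
samples = np.random.randn(1000)
print(samples.shape)  # (1000,)
```

With this many samples, samples.mean() should land close to 0 and samples.std() close to 1, though the exact values depend on the seed.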