Creating the covariance matrix of the dataset

To calculate the covariance matrix of the iris dataset, we will first calculate the feature-wise mean vector (which we will need later) and then calculate the covariance matrix using NumPy.

The covariance matrix is a d x d matrix, where d is the number of features (a square matrix with one row and one column per feature). Each entry measures how a pair of features varies together. It is quite similar to a correlation matrix:

# Calculate a PCA manually

# import numpy
import numpy as np

# calculate the mean vector
mean_vector = iris_X.mean(axis=0)
print(mean_vector)
[ 5.84333333  3.054       3.75866667  1.19866667]

# calculate the covariance matrix
cov_mat = np.cov((iris_X-mean_vector).T)
print(cov_mat.shape)
(4, 4)
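To make the definition concrete, here is a minimal sketch that builds the covariance matrix by hand and compares it with np.cov. It does not use the real iris data: X is a randomly generated stand-in with the same shape as iris_X, and manual_covariance is a hypothetical helper introduced only for illustration. The last step shows why the covariance matrix resembles a correlation matrix: dividing each entry by the product of the two features' standard deviations yields the correlation.

```python
import numpy as np

# Hypothetical stand-in for iris_X: 150 samples, 4 features (NOT the real iris data)
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))


def manual_covariance(X):
    """Sample covariance matrix of X (rows are samples, columns are features)."""
    n_samples = X.shape[0]
    centered = X - X.mean(axis=0)  # subtract the feature-wise mean vector
    return centered.T @ centered / (n_samples - 1)


cov_manual = manual_covariance(X)

# np.cov expects one variable per row, hence the transpose
cov_numpy = np.cov(X.T)

# Rescaling by the standard deviations turns covariance into correlation
std = np.sqrt(np.diag(cov_manual))
corr_from_cov = cov_manual / np.outer(std, std)
```

Note that np.cov centers the data internally, so subtracting mean_vector before calling it, as in the listing above, does not change the result.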

The variable cov_mat stores ...
