Clustering and PCA Session

Principal Component Analysis

 * Applications of PCA:

1. Dimensionality reduction - Explain with one example

2. Data Visualization - We can't visualize beyond 3D, so if we want to visualize multidimensional data, say 10-dimensional data, we can use PCA to reduce it to 2 or 3 dimensions first.

3. Data Anonymization - If you don't want to share confidential information, you can share the PCA components instead of the raw features. Give the example of credit card fraud detection: https://www.kaggle.com/mlg-ulb/creditcardfraud

4. Factor Analysis - Each PCA component is a linear combination of multiple features, so you can use it for factor analysis.


How PCA performs dimensionality reduction.


If I take any point, I need two values (x1 and x2) to represent it. That means the data is 2-dimensional.
Now, how can I represent the same data in only 1 dimension without losing any information?



We can do this by rotating the axes. If I rotate the axes like this, I can see that all our points lie on a single axis, so we have preserved all the information on just 1 axis.
So we have converted 2D into 1D.
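To make the rotation idea concrete, here is a minimal sketch. The points are made up for illustration and are assumed to lie exactly on the line x2 = x1, so the rotation angle is known in advance (PCA would find it from the data).

```python
import math

# Hypothetical points that all lie on the line x2 = x1 (illustrative data).
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (-1.0, -1.0)]

# Angle between the line and the x1 axis: 45 degrees for x2 = x1.
theta = math.atan2(1.0, 1.0)

def rotate_axes(x1, x2, angle):
    """Return the coordinates of (x1, x2) in axes rotated by `angle`."""
    z1 = x1 * math.cos(angle) + x2 * math.sin(angle)
    z2 = -x1 * math.sin(angle) + x2 * math.cos(angle)
    return z1, z2

rotated = [rotate_axes(x1, x2, theta) for x1, x2 in points]

# After rotation the second coordinate is (numerically) zero for every point,
# so z1 alone is enough to describe the data.
for (x1, x2), (z1, z2) in zip(points, rotated):
    print(f"({x1}, {x2}) -> z1 = {z1:.3f}, z2 = {z2:.3f}")
```

Since z2 carries nothing, we can drop it and keep only z1 without losing any information.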


Suppose this time the data points do not all lie on one line, as in the image below.


Then I can still convert it into one dimension, but here I'll lose some information (the blue line, which is the distance between a point and the red line).
How can I put the points on x1? I'll take a projection: suppose light is coming from one direction, then wherever a point's shadow falls on axis x1 is where I'll place that point.

If I want to take it on axis x2 instead, I'll shine the light from the other side and take the projection on x2.
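The shadow idea can be sketched in a few lines of NumPy. The data and the axis direction below are made up for illustration; in real PCA the axis is computed from the data rather than chosen by hand.

```python
import numpy as np

# Hypothetical 2D points scattered around a line (illustrative data).
X = np.array([[1.0, 1.2], [2.0, 1.8], [3.0, 3.3], [4.0, 3.9]])

# Unit vector along the new axis (the "red line"), chosen by hand here.
axis = np.array([1.0, 1.0]) / np.sqrt(2.0)

# Position of each point's shadow along the new axis: the 1D representation.
coords_1d = X @ axis

# The same shadows written back in the original 2D coordinates.
shadows = np.outer(coords_1d, axis)

# Length of the "blue line": distance from each point to its shadow.
# This is the information we lose by keeping only coords_1d.
lost = np.linalg.norm(X - shadows, axis=1)

print("1D coordinates:", coords_1d)
print("lost distance per point:", lost)
```

The smaller these lost distances are overall, the better the chosen axis represents the data.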



Each principal component is described by two things: 1. Direction 2. Variance
Variance is how much the data is spread out along the axis.

So first we'll create the covariance matrix. Then, if we apply eigen decomposition to the covariance matrix, it will give us two things: first the eigenvectors and second the eigenvalues.
The eigenvectors give us the directions and the eigenvalues give us the explained variance along each direction.
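These steps can be sketched with NumPy as follows. The data is randomly generated for illustration, and the variable names are my own, not from the session.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated 2D data (illustrative only).
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + 0.3 * rng.normal(size=200)
X = np.column_stack([x1, x2])

# Step 1: center the data and build the covariance matrix (2 x 2 here).
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# Step 2: eigen decomposition. eigh is meant for symmetric matrices and
# returns eigenvalues in ascending order, so we re-sort to descending.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]  # column i is the direction of PC i

# Eigenvalues are the variances along each direction.
explained_variance_ratio = eigenvalues / eigenvalues.sum()

# Step 3: project onto the first direction to go from 2D to 1D.
pc1_scores = X_centered @ eigenvectors[:, 0]

print("directions (columns):\n", eigenvectors)
print("explained variance ratio:", explained_variance_ratio)
```

The variance of pc1_scores equals the first eigenvalue, which is why the eigenvalues can be read as explained variance.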


Here the data is non-linear. Its dimension can be reduced by PCA, but the loss will be high because the distance between the points and the axis is large. So in the case of non-linear data, we use t-SNE.
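A small sketch of why a straight axis struggles with non-linear data. As an assumed example, take points on a circle: the shape is really a one-dimensional curve, yet no single straight direction captures it.

```python
import numpy as np

# Hypothetical non-linear data: 200 evenly spaced points on a unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X = np.column_stack([np.cos(angles), np.sin(angles)])

# Same PCA steps as for linear data: center, covariance, eigenvalues.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)[::-1]  # descending order

# Share of the total variance kept by the best single axis.
ratio = eigenvalues[0] / eigenvalues.sum()
print(f"variance kept by one PCA axis: {ratio:.2f}")
```

For the circle, the best single axis keeps only about half of the variance, so a 1D PCA projection throws away a lot of the structure.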

* Show a real-world case of PCA - https://covidscholar.org/word-embeddings
Here we have 10,000 data points, i.e. 10k words, and each word is represented by 100 dimensions.
Using PCA, the researchers have transformed them into 3 dimensions.
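A rough sketch of the same kind of reduction, from 100 dimensions to 3. The embeddings below are random numbers standing in for real word vectors (the actual data is not part of these notes), and the sketch uses SVD of the centered data, which gives the same directions as the eigen decomposition of the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for real word embeddings: 10,000 "words" x 100 dimensions.
# Random values, used only to show the shapes involved.
embeddings = rng.normal(size=(10_000, 100))

# Center each dimension, then take the SVD of the centered matrix.
centered = embeddings - embeddings.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Rows of Vt are the principal directions, ordered by decreasing variance.
components = Vt[:3]

# Project every word onto the top 3 directions: 100D -> 3D.
reduced = centered @ components.T

print("original shape:", embeddings.shape)
print("reduced shape:", reduced.shape)
```

Each row of reduced is a point that can be drawn in a 3D scatter plot.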

Here you can see the words similar to any given word.

The same can be done with t-SNE, but t-SNE will find non-linear patterns in the data.

Here Andrej Karpathy has taken high-dimensional images and converted them to a lower dimension.

Here you can see that images of the same person come together.



In the same blog, he has created a 50-dimensional word embedding.
Here you can see that similar words are grouped together.

Case Study - 

Try to explain the code line by line.


This is a correlation heat map.
It shows how strongly each feature is correlated with every other feature.
In PCA the axes are orthogonal, i.e. the principal components are uncorrelated with each other.

Features that are strongly correlated with each other tend to be captured together by PC1, so they all end up in one dimension (component).
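Both points can be checked numerically. This is a minimal sketch on made-up features (f1 and f2 are built to be strongly correlated, f3 is independent), not the case study's dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
base = rng.normal(size=500)

# Hypothetical features: f1 and f2 both follow `base`, f3 is unrelated noise.
f1 = base + 0.1 * rng.normal(size=500)
f2 = 2.0 * base + 0.1 * rng.normal(size=500)
f3 = rng.normal(size=500)
X = np.column_stack([f1, f2, f3])

# The numbers behind a correlation heat map of the original features.
corr_features = np.corrcoef(X, rowvar=False)

# PCA on standardized features: covariance, eigenvectors, projection.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvectors = eigenvectors[:, order]
scores = Z @ eigenvectors

# Correlation between the principal components themselves.
corr_scores = np.corrcoef(scores, rowvar=False)

print("feature correlations:\n", corr_features.round(2))
print("component correlations:\n", corr_scores.round(2))
print("PC1 weights on f1, f2, f3:", eigenvectors[:, 0].round(2))
```

The feature correlation matrix has a large off-diagonal entry for f1 and f2, while the component correlation matrix is close to the identity. The PC1 weights are large for f1 and f2 and small for f3, which is the sense in which correlated features land in one component.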

45 min 
