Network and Graphical Model

 Hello Everybody,


Today we'll learn about network and graphical model.

so let's start.

Graph is basically study of the relationships. modern graph networks are loosely inspired by human brain. [draw diagram] in the human brain also we have Neurons and one neuron is connected to other neuron through dendrite. similarly in graphs also we have similar network but here we call it as node and multiple nodes connected to each other through edge.

[show in existing diagram] if we are creating the graph for social networks then each node represents person and edge represents friendship or connection. similarly if we are creating for maps then node represents city or place and edge represents connecting road.

But why do we study the Graph and networks?

whole facebook recommendation system depends on graph. If I want to advertise some product or If I want to find an influencer who can advertise my product facebook can easily find node who is the most influential by using some graph techniques.

Google maps uses graph network to find minimum distance between cities and suggest us the best route accordingly.

[show in existing diagram]

When Google stared it's search engine at that time they were using graphical algorithm called Page Rank. In this individual node is the website. suppose on wikipedia, I am reading about recommendation system then wikipedia referring me to read also about suprise library and sklearn website also referring to  the same surprise library and many other websites also referring to this then as per page rank algo surprise will get highest priority if someone search for recommendation system this page will be on the top. because not only many websites are referring to it but also important one are referring.

here wikipedia is referring to surprise so it is directed graph but in our facebook graph if both nodes are friends then it is undirected or bidirected graph because both are following each other.


------------------------------------------------------------------------------------------

Page 6

Now If I want to do some mathematics on the graph then it is not possible to do on this diagram. to do some mathematics, we have to convert it into numbers and that's why we convert graph into one matrix which called as adjacency matrix.

now lets understand how to create it. 

suppose I want to convert this graph. then I'll take all the nodes on the rows and the column.

then I'll search is node 1 connected with node 1 if no then 0. is node 1 connected with node 2? if yes then 1 otherwise 0. similarly on row side is node 3 connected with node 1, if yes then 1 else 0.

so this is how, we can construct the adjacency matrix.


--------------------------------------------------------------------------------------------

Page 7

Degree of a node refers to the number of edges of the node. in facebook, it can be your number of friends. in simple words it is popularity measure. higher the degree, popular the person is.

if you want to know the average degree of the network then we can do it by using formula 2m/n where m is number edges and n is number of nodes.

[draw diagram] suppose we have network like this (square of nodes) then here average degree of the node is 2m/n i.e. 2 into number of edges which are 4 and divided by number of nodes which are also 4 so answer is 2. that means in my network on an average one person is connect with two people.

------------------------------------------------------------------------------------

Page 8-9:

Suppose I have very large network where I have millions of node and finding the importance node is not possible visually. so for this I need some measure and we called it as centrality measure.

we have multiple centrality measures.

If I want to find the most popular person then without using any algorithm from the common sense. can I say the person who has most number of friends i.e node having highest edges in my network is the most popular person?

am I making any sense?

This is called as degree centrality.

but now I want to find the most influential person.

[draw diagram] suppose I have two friends one friend has 1 million normal friends i.e. he has 1 million edges and on the other side my another friend has only 100 friends but those friends are elon musk, jeff bezoz, joe bidden, narendra modi and rafel nadal.

then can you tell me who is the most influential person?

correct. because suppose person 1 tweeted something and his some of the friends retweeted again then it'll reach to max 5-10 million people but on the other hand if my friend tweeted something and suppose it's been retweeted only by his 10 friends  still it'll reach to billion people.

the centrality which helps us to find such influential person is called as eigenvector centrality.

[draw diagram] erase right side eigen vector diagram. 

Now suppose I want to sell some product and want to find influencer for that. remember that not want to advertise but sell. there is difference between process of advertisement and actual sell.

so suppose again my this friend has 1 million connection and but now this friend has only 10k connections but his all connection has very close to him. and how will you find closeness so there are multiple factor like they like, comment, share or message personally or they having similar interest talk on the same forum. so who will end up selling more number of products?

can you answer? correct... so B will endup selling more products because in case of A even though people look at his post most of them will ignore because they are not that close.

the centrality which helps us to find such a node is called as the closeness centrality.

Erase everything and draw network of Sarah 

then we have another centrality called as betweenness centrality: it measures the importance of a node in a network based on how many times it occurs as the shortest path between all pairs of the nodes in the graph. for this lets take example of Great learning batches. here we have lot of batches. I am connected only with Nov batch but Sarah connected with all the batches. so if someone from batch 2 wants connect with batch 1 then for them sarah is the shortest path and if remove Sarah from this network everything will be disconnected. so that's why from betweenness centrality Sarah is the most important node.

Read from page 10 and explain.

Page 11 - 12

We have some metrics to measure the network.

connected components is very simple, how many connected subgraph do we have. here we have 4 graph so number of connected components are 4.

Page 13

We have already learnt about avg degree of the network. 

Here, we can see that the degree distribution is looking like the power law distribution and because lot of people of very less degree that means lot of people almost 95% have only 300 to 600 friends on facebook but very few people have only high degree means more than 1 million followers. 

so thats why it is like pareto distribution, exponential distribution or power law distribution.


page 14

Diameter of network is very simple it shortest path between node i and j. here we can see in this network, to connect node A and node C, we have multiple path. one could be from A B D and C and from A E D & C but the shortest path is A E C so that's why shortest path here is 2.

like this we calculate shortest path for each and every node then we take max of it. in this case it is 2.

---------------------------------------------------------------------------------------------------

open website - https://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/

So here is example of using Graph network in the real world.

Here we everybody know that the Great American revolution started in the 17th century and before that America was under the great Britain. so queen of Britain wanted to stop this revolution and for that she hired one spy agent and his name was Mr David Fischer and this spy was statistician. so he took list of 254 people who were taking interest in the American Revolution and he started tracking them. he kept eye on  what they are doing, where they are visiting and of which club they are part of.

so he came up with this table, where have people in the row and clubs or groups are in columns and if they are part of the club then it is 1 else 0. so I can say mr. Adams.John the part of NorthCaucus and longroomclub. 

similarly mr Samual Adams is part of North caucus, longroomclub and londonenemies.

Can somebody tell what we call this type of matrics?

Correct. so that time he read about one research paper called "the Duality of persons and Groups" and this research paper talking about from adjacency matrix how you can get the relationship between people and people. and groups and groups using matrix factorization techniques.

[draw diagram]

so for this he took adjacency matrix A which is 254 X 7 because we have 254 people and 7 clubs. then he transpose it so it became 7 X 254 then he took dot product of this two matrices. can you tell me what will the dimension of this dot product? 

correct... it'll be 254 X 254. 

numbers inside this says john adams and samual adam have two common clubs. so lets check in original adjacency matrix. [show in the table] yes here we can see these two clubs are common. 

Comments

Popular posts from this blog

1st session (Python For DS)

Mentor training session