April 13, 2018
Clustering measurements provide a good way to predict where future edges will be created in a graph. One strong indication of future connections is the existence of Triadic closures around nodes.
Existing connections indicate a high probability of where a new connection will be added — two nodes that share a connection have a high probability of becoming connected themselves/
This measurement can be computed on a node by the number of pairs of connections divided by the number of pairs of connections from the nodes connections.
import networkx as nx G = nx.karate_club_graph() def local_clustering(G, n): neighbors = G.degree(n) pairs_neighbors = neighbors * (neighbors - 1) / 2 if neighbors > 1 else 1 friendly_neighbors = 0 for i in G.neighbors(n): for j in G.neighbors(i): if j in G.neighbors(n): friendly_neighbors += 1 friendly_neighbors /= 2 coeff = friendly_neighbors / pairs_neighbors return coeff local_clustering(G, 31) == nx.clustering(G, 31)
Can be described by taking the average of all the local clustering coefficients from each node, or by transivity — number of triangles divided by the number of open triads.
def ave_clustering(G): coeff = 0 num_nodes = nx.number_of_nodes(G) for n in G.nodes(): coeff += local_clustering(G, n) coeff /= num_nodes return coeff ave_clustering(G) == nx.average_clustering(G)