Networks? Graphs?

A mathematical structure used to model pairwise relations between objects, where the objects are usually called nodes and the relationship between them edges.

G = (V, E)

V = set of nodes/vetices

E = set of (x, y) edges

In [1]:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
In [2]:
nx.draw_circular(nx.erdos_renyi_graph(10, 0.4), with_labels=True)

Examples of Networks?

Can you think of some real world networks?

Using NetworkX

In [3]:
# Creating a graph/network object
G = nx.Graph()
In [4]:
# accessing nodes
G.nodes()
Out[4]:
NodeView(())
In [5]:
# accessing edges
G.edges()
Out[5]:
EdgeView([])
In [6]:
# Let's create a recipe network 
# G.add_node('Tomato')
# G.add_node('Eggs')
# G.add_node('Lamb')
# G.add_node('Chicken')
G.add_nodes_from(['Tomato', 'Eggs', 'Lamb', 'Chicken'])
In [7]:
# G.add_edge('Tomato', 'Eggs')
# G.add_edge('Lamb', 'Tomato')
# G.add_edge('Chicken', 'Tomato')
G.add_edges_from([('Tomato', 'Eggs'), ('Tomato', 'Lamb'), ('Tomato', 'Chicken')])
In [8]:
G.nodes()
Out[8]:
NodeView(('Tomato', 'Eggs', 'Lamb', 'Chicken'))
In [9]:
G.edges()
Out[9]:
EdgeView([('Tomato', 'Eggs'), ('Tomato', 'Lamb'), ('Tomato', 'Chicken')])
In [10]:
nx.draw(G, with_labels=True)
In [11]:
# any hashable object can be a node in the network
G.add_node([1, 2])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-ef5d43398d9a> in <module>()
      1 # any hashable object can be a node in the network
----> 2 G.add_node([1, 2])

~/dev/venv/system/lib/python3.6/site-packages/networkx/classes/graph.py in add_node(self, node_for_adding, **attr)
    479         doesn't change on mutables.
    480         """
--> 481         if node_for_adding not in self._node:
    482             self._adj[node_for_adding] = self.adjlist_inner_dict_factory()
    483             self._node[node_for_adding] = attr

TypeError: unhashable type: 'list'
Let's work on a read world network.

Arxiv GR-QC (General Relativity and Quantum Cosmology) collaboration network is from the e-print arXiv and covers scientific collaborations between authors papers submitted to General Relativity and Quantum Cosmology category. If an author i co-authored a paper with author j, the graph contains a undirected edge from i to j. If the paper is co-authored by k authors this generates a completely connected (sub)graph on k nodes.

source: http://snap.stanford.edu/data/index.html#canets

In [12]:
# If six authors wrote a paper together, they will have a complete graph
nx.draw(nx.complete_graph(6))
In [13]:
# create a author graph from the dataset
import csv
authors_graph = nx.Graph()
with open('CA-GrQc.txt', 'r') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        authors_graph.add_edge(row[0], row[1])
In [14]:
print(authors_graph.number_of_edges())
print(authors_graph.number_of_nodes())
14496
5242

Can we find the most influential researcher in this network?

Something which is not so much dependent on citations.

How do we evaluate the importance of some individuals in a network?

Within a social network, there will be certain individuals which perform certain important functions. For example, there may be hyper-connected individuals who are connected to many, many more people. They would be of use in the spreading of information. Alternatively, if this were a disease contact network, identifying them would be useful in stopping the spread of diseases. How would one identify these people?

Exercise

Create a list of (node, degree of node) tuples and find the node with maximum degree.

degree of node = number of neighbors

In [15]:
result = [(node, len(list(authors_graph.neighbors(node)))) for node in authors_graph.nodes()]
max(result, key=lambda node:node[1])
Out[15]:
('21012', 81)

The degree of a node translates to degree centrality (which is a normalised version of degree)

In [16]:
nx.degree_centrality(authors_graph)
Out[16]:
{'3466': 0.0015264262545315779,
 '937': 0.0009540164090822362,
 '5233': 0.00038160656363289447,
 '8579': 0.0009540164090822362,
 '10310': 0.002480442663613814,
 '15931': 0.0019080328181644724,
 '17038': 0.003243655790879603,
 '18720': 0.0005724098454493417,
 '19607': 0.0007632131272657889,
 '1854': 0.0015264262545315779,
 '4583': 0.0005724098454493417,
 '9572': 0.006487311581759206,
 '10841': 0.0013356229727151307,
 '13056': 0.0019080328181644724,
 '14982': 0.00038160656363289447,
 '16310': 0.003243655790879603,
 '19640': 0.004770082045411181,
 '23855': 0.0015264262545315779,
 '24372': 0.00019080328181644724,
 '24814': 0.0040068689181453915,
 '5052': 0.00553329517267697,
 '899': 0.0005724098454493417,
 '1796': 0.00038160656363289447,
 '2287': 0.0026712459454302615,
 '3096': 0.0022896393817973667,
 '3386': 0.0020988360999809196,
 '4472': 0.0007632131272657889,
 '5346': 0.003816065636328945,
 '5740': 0.00019080328181644724,
 '6094': 0.0007632131272657889,
 '6376': 0.00038160656363289447,
 '9124': 0.0045792787635947334,
 '10235': 0.002480442663613814,
 '10427': 0.0015264262545315779,
 '10597': 0.0011448196908986834,
 '15159': 0.0007632131272657889,
 '16148': 0.00343445907269605,
 '16741': 0.0013356229727151307,
 '18235': 0.00019080328181644724,
 '18549': 0.00019080328181644724,
 '19297': 0.0030528525090631558,
 '20511': 0.0061057050181263115,
 '20595': 0.001717229536348025,
 '20613': 0.0019080328181644724,
 '24371': 0.004197672199961839,
 '24559': 0.004388475481778287,
 '24731': 0.00019080328181644724,
 '25102': 0.0013356229727151307,
 '25271': 0.0030528525090631558,
 '25396': 0.004388475481778287,
 '1658': 0.0011448196908986834,
 '4822': 0.00038160656363289447,
 '6864': 0.00038160656363289447,
 '7689': 0.0061057050181263115,
 '7926': 0.001717229536348025,
 '10268': 0.0013356229727151307,
 '12971': 0.00019080328181644724,
 '18600': 0.0007632131272657889,
 '20421': 0.0005724098454493417,
 '20886': 0.00038160656363289447,
 '21048': 0.0011448196908986834,
 '22393': 0.0009540164090822362,
 '23186': 0.0011448196908986834,
 '23214': 0.0007632131272657889,
 '23298': 0.00038160656363289447,
 '23945': 0.0005724098454493417,
 '24939': 0.0005724098454493417,
 '339': 0.005724098454493417,
 '624': 0.003243655790879603,
 '3731': 0.001717229536348025,
 '4743': 0.004770082045411181,
 '5407': 0.001717229536348025,
 '6610': 0.012974623163518412,
 '6700': 0.005914901736309864,
 '8045': 0.0028620492272467086,
 '9099': 0.001717229536348025,
 '9639': 0.005914901736309864,
 '9785': 0.012974623163518412,
 '12141': 0.0009540164090822362,
 '15184': 0.002480442663613814,
 '15784': 0.0020988360999809196,
 '18719': 0.004197672199961839,
 '19870': 0.0040068689181453915,
 '20532': 0.003816065636328945,
 '22527': 0.004770082045411181,
 '23576': 0.0007632131272657889,
 '23577': 0.001717229536348025,
 '23649': 0.001717229536348025,
 '24199': 0.0011448196908986834,
 '24293': 0.003816065636328945,
 '25201': 0.002480442663613814,
 '10243': 0.0026712459454302615,
 '6774': 0.005151688609044075,
 '8049': 0.0015264262545315779,
 '8053': 0.0005724098454493417,
 '8517': 0.0013356229727151307,
 '11964': 0.002480442663613814,
 '15538': 0.0026712459454302615,
 '16694': 0.001717229536348025,
 '18648': 0.0007632131272657889,
 '19423': 0.012020606754436176,
 '21012': 0.015455065827132226,
 '22457': 0.005724098454493417,
 '22691': 0.014691852699866437,
 '23452': 0.004197672199961839,
 '16174': 0.0007632131272657889,
 '16470': 0.00019080328181644724,
 '17822': 0.00019080328181644724,
 '14265': 0.007059721427208548,
 '392': 0.00019080328181644724,
 '2485': 0.00038160656363289447,
 '2949': 0.001717229536348025,
 '3173': 0.00019080328181644724,
 '3441': 0.0011448196908986834,
 '3593': 0.0011448196908986834,
 '3853': 0.00019080328181644724,
 '3927': 0.00038160656363289447,
 '3937': 0.0007632131272657889,
 '3939': 0.00343445907269605,
 '5107': 0.00343445907269605,
 '5218': 0.00038160656363289447,
 '5230': 0.00038160656363289447,
 '6030': 0.0007632131272657889,
 '7350': 0.0036252623545124977,
 '7504': 0.0020988360999809196,
 '7601': 0.00038160656363289447,
 '8718': 0.00038160656363289447,
 '9522': 0.0009540164090822362,
 '11621': 0.0013356229727151307,
 '12498': 0.00038160656363289447,
 '12691': 0.00038160656363289447,
 '15251': 0.00019080328181644724,
 '16020': 0.0013356229727151307,
 '16261': 0.00019080328181644724,
 '17156': 0.0005724098454493417,
 '17626': 0.0030528525090631558,
 '18622': 0.00038160656363289447,
 '19059': 0.0011448196908986834,
 '19525': 0.0005724098454493417,
 '19738': 0.0009540164090822362,
 '20122': 0.0007632131272657889,
 '20432': 0.0009540164090822362,
 '21866': 0.00019080328181644724,
 '22074': 0.0019080328181644724,
 '23721': 0.0005724098454493417,
 '8916': 0.0020988360999809196,
 '13556': 0.003816065636328945,
 '14485': 0.0040068689181453915,
 '8612': 0.00343445907269605,
 '615': 0.00343445907269605,
 '743': 0.001717229536348025,
 '2076': 0.00038160656363289447,
 '4515': 0.0015264262545315779,
 '5773': 0.0007632131272657889,
 '9482': 0.0040068689181453915,
 '10822': 0.00038160656363289447,
 '11175': 0.001717229536348025,
 '11604': 0.0009540164090822362,
 '14004': 0.00038160656363289447,
 '15003': 0.011829803472619728,
 '15552': 0.00343445907269605,
 '15814': 0.0022896393817973667,
 '16083': 0.0019080328181644724,
 '17932': 0.0020988360999809196,
 '20001': 0.00019080328181644724,
 '20100': 0.0022896393817973667,
 '23481': 0.0009540164090822362,
 '16258': 0.002480442663613814,
 '1356': 0.0007632131272657889,
 '1727': 0.0013356229727151307,
 '2752': 0.0007632131272657889,
 '4125': 0.0009540164090822362,
 '6667': 0.0005724098454493417,
 '6825': 0.00038160656363289447,
 '10039': 0.0020988360999809196,
 '10351': 0.0005724098454493417,
 '11082': 0.0022896393817973667,
 '14123': 0.002480442663613814,
 '16676': 0.00038160656363289447,
 '21194': 0.00019080328181644724,
 '10912': 0.0005724098454493417,
 '14534': 0.0022896393817973667,
 '17268': 0.0007632131272657889,
 '19783': 0.00019080328181644724,
 '21705': 0.0005724098454493417,
 '22836': 0.0005724098454493417,
 '2710': 0.006296508299942759,
 '62': 0.0013356229727151307,
 '106': 0.0007632131272657889,
 '260': 0.0009540164090822362,
 '2959': 0.0019080328181644724,
 '3677': 0.0022896393817973667,
 '4708': 0.0005724098454493417,
 '5172': 0.0019080328181644724,
 '5541': 0.0011448196908986834,
 '5794': 0.0005724098454493417,
 '5807': 0.001717229536348025,
 '6575': 0.0020988360999809196,
 '8458': 0.00038160656363289447,
 '10601': 0.00038160656363289447,
 '11401': 0.00038160656363289447,
 '13026': 0.0007632131272657889,
 '13205': 0.0009540164090822362,
 '13659': 0.0015264262545315779,
 '13989': 0.0019080328181644724,
 '14007': 0.0007632131272657889,
 '14009': 0.0005724098454493417,
 '14599': 0.005724098454493417,
 '15301': 0.00038160656363289447,
 '18757': 0.0015264262545315779,
 '20934': 0.0009540164090822362,
 '21543': 0.0013356229727151307,
 '22184': 0.0013356229727151307,
 '23647': 0.0022896393817973667,
 '23708': 0.0030528525090631558,
 '25916': 0.0019080328181644724,
 '26023': 0.0005724098454493417,
 '26051': 0.00038160656363289447,
 '26100': 0.00038160656363289447,
 '214': 0.0013356229727151307,
 '5435': 0.0005724098454493417,
 '6512': 0.009349360809005915,
 '10590': 0.0013356229727151307,
 '23559': 0.0005724098454493417,
 '1765': 0.0015264262545315779,
 '3032': 0.0022896393817973667,
 '5302': 0.0020988360999809196,
 '7383': 0.002480442663613814,
 '7442': 0.00343445907269605,
 '7768': 0.0019080328181644724,
 '13276': 0.004960885327227628,
 '17266': 0.001717229536348025,
 '22415': 0.0022896393817973667,
 '10794': 0.00038160656363289447,
 '7050': 0.0011448196908986834,
 '25850': 0.00038160656363289447,
 '10113': 0.0022896393817973667,
 '10657': 0.00038160656363289447,
 '12130': 0.0020988360999809196,
 '17172': 0.0015264262545315779,
 '4846': 0.004197672199961839,
 '676': 0.004960885327227628,
 '824': 0.002480442663613814,
 '2133': 0.0022896393817973667,
 '2654': 0.007250524709024995,
 '4748': 0.0036252623545124977,
 '5672': 0.00038160656363289447,
 '10549': 0.0007632131272657889,
 '12928': 0.0020988360999809196,
 '13220': 0.0020988360999809196,
 '14419': 0.0026712459454302615,
 '17330': 0.003243655790879603,
 '17439': 0.00553329517267697,
 '18487': 0.005151688609044075,
 '20850': 0.0028620492272467086,
 '22779': 0.0020988360999809196,
 '23382': 0.005724098454493417,
 '24029': 0.0009540164090822362,
 '11785': 0.0040068689181453915,
 '45': 0.009349360809005915,
 '46': 0.008013737836290783,
 '570': 0.009158557527189467,
 '773': 0.010494180499904597,
 '1653': 0.010684983781721046,
 '2212': 0.00896775424537302,
 '2741': 0.01240221331806907,
 '2952': 0.008586147681740125,
 '3372': 0.009349360809005915,
 '4046': 0.00019080328181644724,
 '4164': 0.01030337721808815,
 '4511': 0.008586147681740125,
 '4513': 0.008204541118107232,
 '5262': 0.00019080328181644724,
 '6179': 0.008776950963556573,
 '6830': 0.008586147681740125,
 '7956': 0.010684983781721046,
 '8879': 0.008586147681740125,
 '11241': 0.009349360809005915,
 '11472': 0.008586147681740125,
 '12365': 0.014691852699866437,
 '12496': 0.009158557527189467,
 '12678': 0.0007632131272657889,
 '12781': 0.010875787063537493,
 '12851': 0.008586147681740125,
 '14540': 0.008776950963556573,
 '14807': 0.011448196908986834,
 '15659': 0.008586147681740125,
 '16159': 0.0007632131272657889,
 '17655': 0.012593016599885518,
 '17692': 0.008586147681740125,
 '18894': 0.00896775424537302,
 '19961': 0.008586147681740125,
 '20108': 0.008586147681740125,
 '20562': 0.008586147681740125,
 '20635': 0.00896775424537302,
 '21281': 0.01507345926349933,
 '21508': 0.012783819881701965,
 '21847': 0.009158557527189467,
 '22798': 0.0011448196908986834,
 '22887': 0.00896775424537302,
 '23293': 0.010112573936271704,
 '24955': 0.009730967372638809,
 '25346': 0.010684983781721046,
 '25758': 0.009730967372638809,
 '934': 0.0013356229727151307,
 '5579': 0.00019080328181644724,
 '9755': 0.005914901736309864,
 '10550': 0.0007632131272657889,
 '16032': 0.0030528525090631558,
 '17331': 0.00019080328181644724,
 '17603': 0.0007632131272657889,
 '20644': 0.0011448196908986834,
 '22497': 0.0007632131272657889,
 '23387': 0.00038160656363289447,
 '23907': 0.0009540164090822362,
 '24924': 0.004197672199961839,
 '25080': 0.0026712459454302615,
 '12422': 0.0005724098454493417,
 '1339': 0.0009540164090822362,
 '3164': 0.0011448196908986834,
 '15580': 0.0013356229727151307,
 '16393': 0.0007632131272657889,
 '20478': 0.004960885327227628,
 '20956': 0.00019080328181644724,
 '3890': 0.001717229536348025,
 '5621': 0.004197672199961839,
 '8824': 0.00038160656363289447,
 '11613': 0.0005724098454493417,
 '12306': 0.0005724098454493417,
 '12860': 0.0005724098454493417,
 '14547': 0.00019080328181644724,
 '18182': 0.0005724098454493417,
 '21707': 0.0005724098454493417,
 '24696': 0.003816065636328945,
 '2661': 0.0013356229727151307,
 '7899': 0.0009540164090822362,
 '8067': 0.0009540164090822362,
 '8208': 0.0005724098454493417,
 '11132': 0.0013356229727151307,
 '11402': 0.0007632131272657889,
 '12980': 0.0015264262545315779,
 '13364': 0.0015264262545315779,
 '14969': 0.0013356229727151307,
 '16389': 0.004960885327227628,
 '18109': 0.0009540164090822362,
 '18365': 0.0026712459454302615,
 '23038': 0.006296508299942759,
 '24845': 0.0015264262545315779,
 '25379': 0.00038160656363289447,
 '13740': 0.0020988360999809196,
 '4550': 0.0040068689181453915,
 '4702': 0.00038160656363289447,
 '7264': 0.0028620492272467086,
 '13096': 0.004197672199961839,
 '14128': 0.0007632131272657889,
 '19489': 0.0030528525090631558,
 '19527': 0.0007632131272657889,
 '19784': 0.00038160656363289447,
 '22476': 0.0013356229727151307,
 '25006': 0.004197672199961839,
 '25486': 0.0005724098454493417,
 '26': 0.0009540164090822362,
 '1407': 0.0011448196908986834,
 '1488': 0.005914901736309864,
 '8219': 0.00038160656363289447,
 '10762': 0.005724098454493417,
 '11801': 0.0020988360999809196,
 '12665': 0.00019080328181644724,
 '12688': 0.0005724098454493417,
 '13142': 0.005342491890860523,
 '15108': 0.004197672199961839,
 '15321': 0.0009540164090822362,
 '20647': 0.0005724098454493417,
 '20827': 0.0015264262545315779,
 '20879': 0.0028620492272467086,
 '23614': 0.004960885327227628,
 '3909': 0.003243655790879603,
 '17979': 0.0009540164090822362,
 '3872': 0.0015264262545315779,
 '5109': 0.00038160656363289447,
 '7533': 0.0015264262545315779,
 '12409': 0.0005724098454493417,
 '20101': 0.00019080328181644724,
 '23096': 0.00019080328181644724,
 '8862': 0.0013356229727151307,
 '78': 0.0007632131272657889,
 '4877': 0.0005724098454493417,
 '7459': 0.0022896393817973667,
 '8254': 0.002480442663613814,
 '12155': 0.0005724098454493417,
 '22598': 0.00343445907269605,
 '24932': 0.0009540164090822362,
 '888': 0.0011448196908986834,
 '1520': 0.0011448196908986834,
 '6468': 0.0011448196908986834,
 '6627': 0.0005724098454493417,
 '7007': 0.004770082045411181,
 '7712': 0.00343445907269605,
 '10711': 0.004388475481778287,
 '13614': 0.0030528525090631558,
 '14102': 0.0013356229727151307,
 '18517': 0.001717229536348025,
 '18676': 0.001717229536348025,
 '23351': 0.001717229536348025,
 '23689': 0.0005724098454493417,
 '24114': 0.00019080328181644724,
 '2465': 0.0007632131272657889,
 '2592': 0.00038160656363289447,
 '3977': 0.0013356229727151307,
 '5055': 0.00019080328181644724,
 '5993': 0.002480442663613814,
 '9265': 0.00019080328181644724,
 '12334': 0.002480442663613814,
 '19890': 0.0011448196908986834,
 '20341': 0.00019080328181644724,
 '21560': 0.00038160656363289447,
 '17309': 0.00019080328181644724,
 '24833': 0.003243655790879603,
 '543': 0.0045792787635947334,
 '1958': 0.0036252623545124977,
 '2193': 0.0011448196908986834,
 '3917': 0.0005724098454493417,
 '6858': 0.0005724098454493417,
 '8148': 0.0005724098454493417,
 '9092': 0.0005724098454493417,
 '12478': 0.0019080328181644724,
 '15366': 0.0005724098454493417,
 '18125': 0.0005724098454493417,
 '18398': 0.0005724098454493417,
 '19675': 0.0030528525090631558,
 '21806': 0.0011448196908986834,
 '23693': 0.001717229536348025,
 '26196': 0.0013356229727151307,
 '10115': 0.00038160656363289447,
 '10134': 0.00019080328181644724,
 '23916': 0.00019080328181644724,
 '7893': 0.001717229536348025,
 '593': 0.0030528525090631558,
 '5510': 0.0013356229727151307,
 '9360': 0.0019080328181644724,
 '12627': 0.0028620492272467086,
 '16778': 0.0015264262545315779,
 '18037': 0.0005724098454493417,
 '18051': 0.0019080328181644724,
 '13385': 0.00019080328181644724,
 '19578': 0.00019080328181644724,
 '12386': 0.0009540164090822362,
 '13333': 0.0009540164090822362,
 '23896': 0.0028620492272467086,
 '8978': 0.0007632131272657889,
 '9017': 0.005342491890860523,
 '15170': 0.0007632131272657889,
 '15455': 0.003243655790879603,
 '16589': 0.0009540164090822362,
 '2255': 0.0009540164090822362,
 '3056': 0.0005724098454493417,
 '6158': 0.0007632131272657889,
 '7307': 0.004960885327227628,
 '7324': 0.0007632131272657889,
 '8365': 0.0022896393817973667,
 '9023': 0.00019080328181644724,
 '11444': 0.0005724098454493417,
 '12324': 0.0009540164090822362,
 '12472': 0.0020988360999809196,
 '13831': 0.0030528525090631558,
 '14746': 0.004770082045411181,
 '16128': 0.0007632131272657889,
 '17075': 0.0036252623545124977,
 '18875': 0.0015264262545315779,
 '19900': 0.0007632131272657889,
 '20000': 0.0009540164090822362,
 '20806': 0.00038160656363289447,
 '21944': 0.0007632131272657889,
 '21968': 0.002480442663613814,
 '23302': 0.0005724098454493417,
 '23665': 0.0022896393817973667,
 '23758': 0.00038160656363289447,
 '24722': 0.0007632131272657889,
 '12045': 0.00019080328181644724,
 '12287': 0.0005724098454493417,
 '14181': 0.0019080328181644724,
 '20257': 0.0005724098454493417,
 '21613': 0.0005724098454493417,
 '7510': 0.00038160656363289447,
 '197': 0.00038160656363289447,
 '8851': 0.0020988360999809196,
 '1343': 0.0007632131272657889,
 '2991': 0.00038160656363289447,
 '8299': 0.00019080328181644724,
 '15416': 0.00038160656363289447,
 '18088': 0.0015264262545315779,
 '25286': 0.002480442663613814,
 '1254': 0.0005724098454493417,
 '3420': 0.0005724098454493417,
 '10130': 0.0028620492272467086,
 '2250': 0.003243655790879603,
 '3243': 0.0005724098454493417,
 '7717': 0.003243655790879603,
 '7985': 0.0028620492272467086,
 '11015': 0.004197672199961839,
 '12085': 0.001717229536348025,
 '13714': 0.003243655790879603,
 '14767': 0.003243655790879603,
 '16056': 0.0022896393817973667,
 '16994': 0.003816065636328945,
 '17414': 0.001717229536348025,
 '18971': 0.003243655790879603,
 '19216': 0.003243655790879603,
 '20534': 0.003243655790879603,
 '21776': 0.0009540164090822362,
 '21860': 0.0015264262545315779,
 '25205': 0.003243655790879603,
 '178': 0.0015264262545315779,
 '1248': 0.0005724098454493417,
 '1403': 0.0013356229727151307,
 '2368': 0.001717229536348025,
 '2420': 0.0009540164090822362,
 '16210': 0.002480442663613814,
 '18681': 0.0005724098454493417,
 '20641': 0.00038160656363289447,
 '24762': 0.0007632131272657889,
 '2307': 0.00038160656363289447,
 '6934': 0.0026712459454302615,
 '22423': 0.0030528525090631558,
 '231': 0.005724098454493417,
 '345': 0.002480442663613814,
 '1186': 0.003243655790879603,
 '1234': 0.0013356229727151307,
 '1841': 0.0011448196908986834,
 '1997': 0.0015264262545315779,
 '2404': 0.0015264262545315779,
 '2450': 0.00038160656363289447,
 '2980': 0.0045792787635947334,
 '3409': 0.0030528525090631558,
 '5134': 0.0013356229727151307,
 '5578': 0.0013356229727151307,
 '8503': 0.0026712459454302615,
 '9341': 0.0007632131272657889,
 '9889': 0.003816065636328945,
 '12503': 0.001717229536348025,
 '13060': 0.001717229536348025,
 '13597': 0.003243655790879603,
 '16611': 0.0011448196908986834,
 '18208': 0.0045792787635947334,
 '18543': 0.0015264262545315779,
 '18866': 0.008395344399923678,
 '22421': 0.0011448196908986834,
 '22937': 0.00343445907269605,
 '23363': 0.0019080328181644724,
 '23628': 0.0011448196908986834,
 '25053': 0.001717229536348025,
 '25251': 0.00038160656363289447,
 '2982': 0.0005724098454493417,
 '4036': 0.0005724098454493417,
 '4115': 0.0009540164090822362,
 '12938': 0.0015264262545315779,
 '13032': 0.00038160656363289447,
 '19215': 0.00038160656363289447,
 '21432': 0.00343445907269605,
 '22726': 0.0005724098454493417,
 '22834': 0.0011448196908986834,
 '22966': 0.0007632131272657889,
 '23511': 0.0007632131272657889,
 '25528': 0.0005724098454493417,
 '25836': 0.0015264262545315779,
 '14376': 0.00019080328181644724,
 '8710': 0.00038160656363289447,
 '22483': 0.0015264262545315779,
 '1375': 0.00038160656363289447,
 '2846': 0.0030528525090631558,
 '5555': 0.0015264262545315779,
 '5564': 0.0019080328181644724,
 '5787': 0.0015264262545315779,
 '9721': 0.00038160656363289447,
 '10158': 0.00019080328181644724,
 '10942': 0.0020988360999809196,
 '13600': 0.00038160656363289447,
 '13929': 0.008586147681740125,
 '21075': 0.0030528525090631558,
 '21316': 0.00019080328181644724,
 '22900': 0.00019080328181644724,
 '23637': 0.0019080328181644724,
 '23770': 0.0019080328181644724,
 '25143': 0.00038160656363289447,
 '25601': 0.00019080328181644724,
 '25980': 0.00038160656363289447,
 '17394': 0.00019080328181644724,
 '18924': 0.0015264262545315779,
 '3113': 0.003243655790879603,
 '8312': 0.0013356229727151307,
 '10765': 0.0011448196908986834,
 '17538': 0.0011448196908986834,
 '25978': 0.0011448196908986834,
 '1172': 0.0015264262545315779,
 '5674': 0.00019080328181644724,
 '26194': 0.00038160656363289447,
 '375': 0.00019080328181644724,
 '1838': 0.00038160656363289447,
 '12733': 0.0019080328181644724,
 '7188': 0.0022896393817973667,
 '896': 0.0009540164090822362,
 '921': 0.00038160656363289447,
 '1508': 0.0005724098454493417,
 '6815': 0.0005724098454493417,
 '7209': 0.0005724098454493417,
 '8279': 0.0009540164090822362,
 '13008': 0.0036252623545124977,
 '18605': 0.00019080328181644724,
 '21158': 0.0005724098454493417,
 '4632': 0.0022896393817973667,
 '7844': 0.002480442663613814,
 '11053': 0.0011448196908986834,
 '11148': 0.0019080328181644724,
 '13411': 0.0015264262545315779,
 '14512': 0.0026712459454302615,
 '16594': 0.0036252623545124977,
 '16722': 0.0013356229727151307,
 '16726': 0.0005724098454493417,
 '16876': 0.0013356229727151307,
 '19954': 0.0011448196908986834,
 '19992': 0.0028620492272467086,
 '20391': 0.0013356229727151307,
 '20774': 0.0028620492272467086,
 '23403': 0.0020988360999809196,
 '4870': 0.0005724098454493417,
 '5175': 0.0011448196908986834,
 '8282': 0.0005724098454493417,
 '22046': 0.0020988360999809196,
 '2449': 0.006487311581759206,
 '4766': 0.00038160656363289447,
 '3561': 0.001717229536348025,
 '4868': 0.0011448196908986834,
 '8352': 0.00038160656363289447,
 '10456': 0.001717229536348025,
 '15365': 0.0009540164090822362,
 '16931': 0.0015264262545315779,
 '8157': 0.00019080328181644724,
 '2120': 0.00019080328181644724,
 '7713': 0.00019080328181644724,
 '19052': 0.00019080328181644724,
 '8302': 0.00019080328181644724,
 '16484': 0.00019080328181644724,
 '17778': 0.0009540164090822362,
 '2556': 0.0020988360999809196,
 '19159': 0.0011448196908986834,
 '21699': 0.0005724098454493417,
 '25382': 0.0005724098454493417,
 '8701': 0.0019080328181644724,
 '523': 0.00038160656363289447,
 '5464': 0.0011448196908986834,
 '7774': 0.0007632131272657889,
 '17379': 0.0026712459454302615,
 '18008': 0.0007632131272657889,
 '23385': 0.00019080328181644724,
 '25868': 0.00019080328181644724,
 '26170': 0.00038160656363289447,
 '1310': 0.0020988360999809196,
 '2922': 0.0005724098454493417,
 '3651': 0.0066781148635756534,
 '6891': 0.0013356229727151307,
 '10162': 0.00038160656363289447,
 '10620': 0.0005724098454493417,
 '11112': 0.0007632131272657889,
 '13174': 0.0007632131272657889,
 '14864': 0.0019080328181644724,
 '17536': 0.0019080328181644724,
 '23153': 0.00038160656363289447,
 '24340': 0.0005724098454493417,
 '11102': 0.0009540164090822362,
 '18592': 0.0009540164090822362,
 '22765': 0.00038160656363289447,
 '1896': 0.0015264262545315779,
 '6838': 0.0013356229727151307,
 '25220': 0.00038160656363289447,
 '4180': 0.0011448196908986834,
 '10055': 0.003243655790879603,
 '12637': 0.0005724098454493417,
 '1832': 0.0005724098454493417,
 '4383': 0.00019080328181644724,
 '7014': 0.00038160656363289447,
 '9397': 0.004388475481778287,
 '14344': 0.0005724098454493417,
 '14385': 0.0005724098454493417,
 '21379': 0.004197672199961839,
 '24163': 0.00038160656363289447,
 '4416': 0.0036252623545124977,
 '9241': 0.002480442663613814,
 '2620': 0.0015264262545315779,
 '3679': 0.0005724098454493417,
 '4364': 0.006296508299942759,
 '5712': 0.0026712459454302615,
 '13955': 0.0022896393817973667,
 '14003': 0.0011448196908986834,
 '15235': 0.0015264262545315779,
 '18227': 0.0007632131272657889,
 '19445': 0.00019080328181644724,
 '19495': 0.00019080328181644724,
 '20168': 0.0019080328181644724,
 '23967': 0.0028620492272467086,
 '1116': 0.001717229536348025,
 '15399': 0.0005724098454493417,
 '18222': 0.0007632131272657889,
 '21287': 0.0015264262545315779,
 '23227': 0.00019080328181644724,
 '4624': 0.0022896393817973667,
 '5355': 0.00038160656363289447,
 '6863': 0.00038160656363289447,
 '12606': 0.0013356229727151307,
 '12968': 0.0019080328181644724,
 '15770': 0.0009540164090822362,
 '21322': 0.00019080328181644724,
 '22265': 0.0020988360999809196,
 '22336': 0.0020988360999809196,
 '23099': 0.00038160656363289447,
 '23880': 0.0013356229727151307,
 '25111': 0.0009540164090822362,
 '4319': 0.0005724098454493417,
 '12758': 0.0005724098454493417,
 '22023': 0.00019080328181644724,
 '14337': 0.001717229536348025,
 '5849': 0.0005724098454493417,
 '10763': 0.00019080328181644724,
 '11121': 0.00019080328181644724,
 '11194': 0.0011448196908986834,
 '15799': 0.001717229536348025,
 '16511': 0.0007632131272657889,
 '16575': 0.0011448196908986834,
 '16945': 0.00019080328181644724,
 '21157': 0.00019080328181644724,
 '2326': 0.0013356229727151307,
 '10253': 0.00038160656363289447,
 '13190': 0.0020988360999809196,
 '14325': 0.00038160656363289447,
 '19048': 0.0022896393817973667,
 '20182': 0.0013356229727151307,
 '3595': 0.0007632131272657889,
 '18174': 0.0007632131272657889,
 '17850': 0.00019080328181644724,
 '20620': 0.00019080328181644724,
 '12380': 0.0005724098454493417,
 '3197': 0.0005724098454493417,
 '6160': 0.0005724098454493417,
 '8589': 0.0013356229727151307,
 '9417': 0.00038160656363289447,
 '9829': 0.0028620492272467086,
 '14638': 0.0009540164090822362,
 '14924': 0.004388475481778287,
 '15972': 0.00019080328181644724,
 '17228': 0.0022896393817973667,
 '18940': 0.0028620492272467086,
 '19090': 0.0005724098454493417,
 '19475': 0.0005724098454493417,
 '20207': 0.0019080328181644724,
 '22644': 0.0022896393817973667,
 '22790': 0.0015264262545315779,
 '24001': 0.0011448196908986834,
 '25228': 0.0009540164090822362,
 '9710': 0.003816065636328945,
 '3345': 0.0028620492272467086,
 '3430': 0.0009540164090822362,
 '5266': 0.0011448196908986834,
 '5995': 0.0005724098454493417,
 '7999': 0.00019080328181644724,
 '8047': 0.00038160656363289447,
 '8178': 0.00038160656363289447,
 '8868': 0.00019080328181644724,
 '10824': 0.00038160656363289447,
 '15144': 0.002480442663613814,
 '19107': 0.0011448196908986834,
 '19806': 0.00038160656363289447,
 '22439': 0.0005724098454493417,
 '23304': 0.00019080328181644724,
 '24431': 0.0015264262545315779,
 '11077': 0.0026712459454302615,
 '10211': 0.0019080328181644724,
 '14972': 0.0005724098454493417,
 '15300': 0.0036252623545124977,
 '17158': 0.0005724098454493417,
 '17162': 0.00038160656363289447,
 '17403': 0.0007632131272657889,
 '20149': 0.00038160656363289447,
 '20519': 0.0005724098454493417,
 '21389': 0.0009540164090822362,
 '22951': 0.0015264262545315779,
 '23912': 0.0005724098454493417,
 '23918': 0.0007632131272657889,
 '25589': 0.0005724098454493417,
 '6895': 0.00019080328181644724,
 '3076': 0.0007632131272657889,
 '7444': 0.005342491890860523,
 '8972': 0.0007632131272657889,
 '17308': 0.00038160656363289447,
 '20574': 0.007059721427208548,
 '21629': 0.0013356229727151307,
 '245': 0.00038160656363289447,
 '4983': 0.00038160656363289447,
 '13480': 0.0011448196908986834,
 '14562': 0.00019080328181644724,
 '15912': 0.00038160656363289447,
 '16976': 0.0020988360999809196,
 '19974': 0.0007632131272657889,
 '22245': 0.00019080328181644724,
 '414': 0.00019080328181644724,
 '8708': 0.00019080328181644724,
 '23776': 0.00019080328181644724,
 '1044': 0.0015264262545315779,
 '4975': 0.0009540164090822362,
 '5809': 0.00038160656363289447,
 '12587': 0.0028620492272467086,
 '16123': 0.0009540164090822362,
 '20303': 0.0019080328181644724,
 '4451': 0.00038160656363289447,
 '9983': 0.00019080328181644724,
 '15205': 0.0007632131272657889,
 '15666': 0.0005724098454493417,
 '15667': 0.0005724098454493417,
 '19093': 0.0009540164090822362,
 '21031': 0.00038160656363289447,
 '24330': 0.003816065636328945,
 '5840': 0.00038160656363289447,
 '6732': 0.0013356229727151307,
 '8614': 0.00019080328181644724,
 '13847': 0.0013356229727151307,
 '15081': 0.0007632131272657889,
 '22609': 0.0005724098454493417,
 '3310': 0.0030528525090631558,
 '5143': 0.00038160656363289447,
 '9735': 0.0007632131272657889,
 '17396': 0.00019080328181644724,
 '26138': 0.00038160656363289447,
 '4700': 0.00038160656363289447,
 '5606': 0.00038160656363289447,
 '1386': 0.0005724098454493417,
 '1738': 0.0005724098454493417,
 '2566': 0.0005724098454493417,
 '2720': 0.0005724098454493417,
 '5634': 0.00038160656363289447,
 '9870': 0.0005724098454493417,
 '16779': 0.00038160656363289447,
 '24475': 0.0005724098454493417,
 '25931': 0.0019080328181644724,
 '1817': 0.0009540164090822362,
 '3725': 0.0007632131272657889,
 '5366': 0.00038160656363289447,
 '15911': 0.0007632131272657889,
 '9800': 0.001717229536348025,
 '26141': 0.0007632131272657889,
 '18001': 0.00038160656363289447,
 '22177': 0.0040068689181453915,
 '25480': 0.0013356229727151307,
 '1425': 0.001717229536348025,
 '2501': 0.0019080328181644724,
 '2823': 0.0007632131272657889,
 '5136': 0.00038160656363289447,
 '7105': 0.0009540164090822362,
 '7317': 0.0013356229727151307,
 '9127': 0.0007632131272657889,
 '14615': 0.0019080328181644724,
 '17089': 0.0011448196908986834,
 '19454': 0.0005724098454493417,
 '22758': 0.0020988360999809196,
 '23841': 0.0019080328181644724,
 '3207': 0.00019080328181644724,
 '7125': 0.0019080328181644724,
 '20683': 0.0013356229727151307,
 '25419': 0.0009540164090822362,
 '2591': 0.0011448196908986834,
 '8932': 0.00019080328181644724,
 '9188': 0.0005724098454493417,
 '9773': 0.00038160656363289447,
 '13713': 0.00038160656363289447,
 '19179': 0.0013356229727151307,
 '14818': 0.0009540164090822362,
 '26190': 0.0009540164090822362,
 '3839': 0.0028620492272467086,
 '13615': 0.0005724098454493417,
 '16674': 0.002480442663613814,
 '7042': 0.0007632131272657889,
 '9269': 0.00038160656363289447,
 '11661': 0.0015264262545315779,
 '21994': 0.003243655790879603,
 '9943': 0.0020988360999809196,
 '10096': 0.00343445907269605,
 '15614': 0.0013356229727151307,
 '16368': 0.0020988360999809196,
 '17285': 0.0011448196908986834,
 '21407': 0.0011448196908986834,
 '8715': 0.0007632131272657889,
 '21142': 0.00038160656363289447,
 '21167': 0.00038160656363289447,
 '13621': 0.00019080328181644724,
 '25388': 0.00019080328181644724,
 '284': 0.00019080328181644724,
 '6427': 0.0011448196908986834,
 '16835': 0.00038160656363289447,
 '18677': 0.00038160656363289447,
 '21696': 0.00038160656363289447,
 '25854': 0.00038160656363289447,
 '25940': 0.00038160656363289447,
 '9485': 0.00038160656363289447,
 '10435': 0.0005724098454493417,
 '14698': 0.00019080328181644724,
 '21823': 0.0005724098454493417,
 '811': 0.001717229536348025,
 '9964': 0.0007632131272657889,
 '11919': 0.0007632131272657889,
 '15123': 0.0005724098454493417,
 '17021': 0.00038160656363289447,
 '19517': 0.0015264262545315779,
 '22989': 0.0009540164090822362,
 '23485': 0.0011448196908986834,
 '8069': 0.00038160656363289447,
 '2127': 0.00019080328181644724,
 '11712': 0.0022896393817973667,
 '10871': 0.0026712459454302615,
 '15829': 0.0007632131272657889,
 '16106': 0.0007632131272657889,
 '17207': 0.0019080328181644724,
 '17670': 0.0007632131272657889,
 '18286': 0.0013356229727151307,
 '18612': 0.0007632131272657889,
 '19234': 0.00019080328181644724,
 '19724': 0.0005724098454493417,
 '22876': 0.002480442663613814,
 '3412': 0.0005724098454493417,
 '894': 0.0009540164090822362,
 '4814': 0.0019080328181644724,
 '5441': 0.0015264262545315779,
 '5717': 0.001717229536348025,
 '5934': 0.003243655790879603,
 '9075': 0.00038160656363289447,
 '10623': 0.0011448196908986834,
 '11223': 0.0007632131272657889,
 '11491': 0.0005724098454493417,
 '12659': 0.001717229536348025,
 '14351': 0.001717229536348025,
 '19807': 0.00038160656363289447,
 '21324': 0.0009540164090822362,
 '21665': 0.0007632131272657889,
 '22734': 0.0005724098454493417,
 '22778': 0.0019080328181644724,
 '23246': 0.0011448196908986834,
 '25698': 0.002480442663613814,
 '7615': 0.0013356229727151307,
 '12806': 0.0011448196908986834,
 '1620': 0.0013356229727151307,
 '3922': 0.0019080328181644724,
 '9133': 0.00019080328181644724,
 '9895': 0.00019080328181644724,
 '9907': 0.0019080328181644724,
 '11791': 0.0005724098454493417,
 '18457': 0.00019080328181644724,
 '19131': 0.00019080328181644724,
 '19997': 0.00019080328181644724,
 '21491': 0.003816065636328945,
 '22426': 0.0011448196908986834,
 '22791': 0.00038160656363289447,
 '25158': 0.0005724098454493417,
 '25316': 0.0011448196908986834,
 '352': 0.0005724098454493417,
 '3996': 0.0009540164090822362,
 '10555': 0.0011448196908986834,
 '15850': 0.0009540164090822362,
 '19101': 0.0009540164090822362,
 '20533': 0.0005724098454493417,
 '21943': 0.002480442663613814,
 '22603': 0.00038160656363289447,
 '23513': 0.0007632131272657889,
 '25034': 0.0036252623545124977,
 '1280': 0.0005724098454493417,
 '5851': 0.0005724098454493417,
 '11591': 0.0015264262545315779,
 '13520': 0.0030528525090631558,
 '16921': 0.0022896393817973667,
 '21646': 0.001717229536348025,
 '17559': 0.002480442663613814,
 '1674': 0.0011448196908986834,
 '3323': 0.0009540164090822362,
 '8680': 0.0009540164090822362,
 '9184': 0.0040068689181453915,
 '13813': 0.002480442663613814,
 '19204': 0.0020988360999809196,
 '19206': 0.0005724098454493417,
 '19657': 0.0019080328181644724,
 '20373': 0.004770082045411181,
 '23204': 0.0022896393817973667,
 '409': 0.0005724098454493417,
 '2474': 0.0005724098454493417,
 '4241': 0.00343445907269605,
 '6746': 0.0005724098454493417,
 '10476': 0.0007632131272657889,
 '16568': 0.0007632131272657889,
 '21771': 0.0013356229727151307,
 '24620': 0.0005724098454493417,
 '2081': 0.00019080328181644724,
 '2664': 0.00019080328181644724,
 '12212': 0.0028620492272467086,
 '13528': 0.00038160656363289447,
 '14628': 0.00038160656363289447,
 '22574': 0.001717229536348025,
 ...}

Exercise

Plot a histogram of degree centrality of authors_graph.

Hint: plt.hist(list_of_values) will plot a histogram

(count vs degree centrality)

In [17]:
plt.hist(list(nx.degree_centrality(authors_graph).values()))
plt.show()

G = nx.erdos_renyi_graph(1000, 0.3, seed=1)
plt.hist(list(nx.degree_centrality(G).values()))
plt.show()

H = nx.barabasi_albert_graph(1000, 30, 0.3)
# K = nx.powerlaw_cluster_graph(1000, 30, 0.3)

plt.hist(list(nx.degree_centrality(H).values()))
plt.show()

# plt.hist(list(nx.degree_centrality(K).values()))
# plt.show()

Let's have a look at Connected Components of a graph.

In graph theory, a connected component (or just component) of an undirected graph is a subgraph in which any two vertices are connected to each other by paths, and which is connected to no additional vertices in the supergraph.

In [18]:
G = nx.erdos_renyi_graph(10, 0.15, seed=1)
nx.draw(G, with_labels=True)
In [19]:
print([len(c) for c in sorted(nx.connected_components(authors_graph), key=len, reverse=True)])
[4158, 14, 12, 10, 9, 9, 8, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1]
In [20]:
graphs = [c for c in sorted(nx.connected_component_subgraphs(authors_graph), key=len, reverse=True)]
In [21]:
len(graphs[0])
Out[21]:
4158
In [22]:
nx.draw(graphs[10])
Shortest Path in the network
In [23]:
nx.draw(nx.erdos_renyi_graph(10, 0.15, seed=1), with_labels=True)
In [24]:
print(nx.shortest_path(graphs[0], '22504', '23991'))
print(len(nx.shortest_path(graphs[0], '22504', '23991')))
print(nx.shortest_path_length(graphs[0], '22504', '23991'))
['22504', '10350', '10801', '3310', '12545', '23991']
6
5

Exercise

Six degrees of separation, Erdos Number, Bacon Number!!

Find the '22504' number of the graph authors_graph, if there is no connection between nodes then give it the number -1. Also plot a histogram of the '22504' number.

Find the average shortest path length in the first component i.e. graphs[0]

HINT: nx.shortest_path_length

In [25]:
d = {}
for node in authors_graph.nodes():
    try:
        d[node] = nx.shortest_path_length(authors_graph, '22504', node)
    except:
        d[node] = -1
In [26]:
plt.hist(list(d.values()))
plt.show()
Let's change gears and talk about Game of thrones or shall I say Network of Thrones.

It is suprising right? What is the relationship between a fatansy TV show/novel and network science or python(it's not related to a dragon).

If you haven't heard of Game of Thrones, then you must be really good at hiding. Game of Thrones is the hugely popular television series by HBO based on the (also) hugely popular book series A Song of Ice and Fire by George R.R. Martin. In this notebook, we will analyze the co-occurrence network of the characters in the Game of Thrones books. Here, two characters are considered to co-occur if their names appear in the vicinity of 15 words from one another in the books.

Andrew J. Beveridge, an associate professor of mathematics at Macalester College, and Jie Shan, an undergraduate created a network from the book A Storm of Swords by extracting relationships between characters to find out the most important characters in the book(or GoT).

The dataset is publicly avaiable for the 5 books at https://github.com/mathbeveridge/asoiaf. This is an interaction network and were created by connecting two characters whenever their names (or nicknames) appeared within 15 words of one another in one of the books. The edge weight corresponds to the number of interactions.

Credits:

Blog: https://networkofthrones.wordpress.com

Math Horizons Article: https://www.maa.org/sites/default/files/pdf/Mathhorizons/NetworkofThrones%20%281%29.pdf

In [27]:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from matplotlib import cm
import community
import numpy as np
import warnings
warnings.filterwarnings('ignore')

%matplotlib inline
Let's load in the datasets
In [28]:
book1 = pd.read_csv('data/asoiaf-book1-edges.csv')
book2 = pd.read_csv('data/asoiaf-book2-edges.csv')
book3 = pd.read_csv('data/asoiaf-book3-edges.csv')
book4 = pd.read_csv('data/asoiaf-book4-edges.csv')
book5 = pd.read_csv('data/asoiaf-book5-edges.csv')

The resulting DataFrame book1 has 5 columns: Source, Target, Type, weight, and book. Source and target are the two nodes that are linked by an edge. A network can have directed or undirected edges and in this network all the edges are undirected. The weight attribute of every edge tells us the number of interactions that the characters have had over the book, and the book column tells us the book number.

In [29]:
book1.head()
Out[29]:
Source Target Type weight book
0 Addam-Marbrand Jaime-Lannister Undirected 3 1
1 Addam-Marbrand Tywin-Lannister Undirected 6 1
2 Aegon-I-Targaryen Daenerys-Targaryen Undirected 5 1
3 Aegon-I-Targaryen Eddard-Stark Undirected 4 1
4 Aemon-Targaryen-(Maester-Aemon) Alliser-Thorne Undirected 4 1

Once we have the data loaded as a pandas DataFrame, it's time to create a network. We create a graph for each book. It's possible to create one MultiGraph instead of 5 graphs, but it is easier to play with different graphs.

In [30]:
G_book1 = nx.Graph()
G_book2 = nx.Graph()
G_book3 = nx.Graph()
G_book4 = nx.Graph()
G_book5 = nx.Graph()

Let's populate the graph with edges from the pandas DataFrame.

In [31]:
for row in book1.iterrows():
    G_book1.add_edge(row[1]['Source'], row[1]['Target'], weight=row[1]['weight'], book=row[1]['book'])
In [32]:
for row in book2.iterrows():
    G_book2.add_edge(row[1]['Source'], row[1]['Target'], weight=row[1]['weight'], book=row[1]['book'])
for row in book3.iterrows():
    G_book3.add_edge(row[1]['Source'], row[1]['Target'], weight=row[1]['weight'], book=row[1]['book'])
for row in book4.iterrows():
    G_book4.add_edge(row[1]['Source'], row[1]['Target'], weight=row[1]['weight'], book=row[1]['book'])
for row in book5.iterrows():
    G_book5.add_edge(row[1]['Source'], row[1]['Target'], weight=row[1]['weight'], book=row[1]['book'])
In [33]:
books = [G_book1, G_book2, G_book3, G_book4, G_book5]

Let's have a look at these edges.

In [34]:
list(G_book1.edges(data=True))[16]
Out[34]:
('Jaime-Lannister', 'Loras-Tyrell', {'weight': 3, 'book': 1})
In [35]:
list(G_book1.edges(data=True))[400]
Out[35]:
('Benjen-Stark', 'Theon-Greyjoy', {'weight': 4, 'book': 1})

Finding the most important node i.e character in these networks.

Is it Jon Snow, Tyrion, Daenerys, or someone else? Let's see! Network Science offers us many different metrics to measure the importance of a node in a network as we saw in the first part of the tutorial. Note that there is no "correct" way of calculating the most important node in a network, every metric has a different meaning.

First, let's measure the importance of a node in a network by looking at the number of neighbors it has, that is, the number of nodes it is connected to. For example, an influential account on Twitter, where the follower-followee relationship forms the network, is an account which has a high number of followers. This measure of importance is called degree centrality.

Using this measure, let's extract the top ten important characters from the first book (book[0]) and the fifth book (book[4]).

In [36]:
deg_cen_book1 = nx.degree_centrality(books[0])
In [37]:
deg_cen_book5 = nx.degree_centrality(books[4])
In [38]:
sorted(deg_cen_book1.items(), key=lambda x:x[1], reverse=True)[0:10]
Out[38]:
[('Eddard-Stark', 0.3548387096774194),
 ('Robert-Baratheon', 0.2688172043010753),
 ('Tyrion-Lannister', 0.24731182795698928),
 ('Catelyn-Stark', 0.23118279569892475),
 ('Jon-Snow', 0.19892473118279572),
 ('Robb-Stark', 0.18817204301075272),
 ('Sansa-Stark', 0.18817204301075272),
 ('Bran-Stark', 0.17204301075268819),
 ('Cersei-Lannister', 0.16129032258064518),
 ('Joffrey-Baratheon', 0.16129032258064518)]
In [39]:
sorted(deg_cen_book5.items(), key=lambda x:x[1], reverse=True)[0:10]
Out[39]:
[('Jon-Snow', 0.1962025316455696),
 ('Daenerys-Targaryen', 0.18354430379746836),
 ('Stannis-Baratheon', 0.14873417721518986),
 ('Tyrion-Lannister', 0.10443037974683544),
 ('Theon-Greyjoy', 0.10443037974683544),
 ('Cersei-Lannister', 0.08860759493670886),
 ('Barristan-Selmy', 0.07911392405063292),
 ('Hizdahr-zo-Loraq', 0.06962025316455696),
 ('Asha-Greyjoy', 0.056962025316455694),
 ('Melisandre', 0.05379746835443038)]
In [40]:
# Plot a histogram of degree centrality
plt.hist(list(nx.degree_centrality(G_book4).values()))
plt.show()
In [41]:
d = {}
for i, j in dict(nx.degree(G_book4)).items():
    if j in d:
        d[j] += 1
    else:
        d[j] = 1
x = np.log2(list((d.keys())))
y = np.log2(list(d.values()))
plt.scatter(x, y, alpha=0.9)
plt.show()

Exercise

Create a new centrality measure, weighted_degree(Graph, weight) which takes in Graph and the weight attribute and returns a weighted degree dictionary. Weighted degree is calculated by summing the weight of the all edges of a node and find the top five characters according to this measure.

In [42]:
def weighted_degree(G, weight):
    result = dict()
    for node in G.nodes():
        weight_degree = 0
        for n in G.edges([node], data=True):
            weight_degree += n[2]['weight']
        result[node] = weight_degree
    return result
In [43]:
plt.hist(list(weighted_degree(G_book1, 'weight').values()))
plt.show()
In [44]:
sorted(weighted_degree(G_book1, 'weight').items(), key=lambda x:x[1], reverse=True)[0:10]
Out[44]:
[('Eddard-Stark', 1284),
 ('Robert-Baratheon', 941),
 ('Jon-Snow', 784),
 ('Tyrion-Lannister', 650),
 ('Sansa-Stark', 545),
 ('Bran-Stark', 531),
 ('Catelyn-Stark', 520),
 ('Robb-Stark', 516),
 ('Daenerys-Targaryen', 443),
 ('Arya-Stark', 430)]

Let's do this for Betweeness centrality and check if this makes any difference

Haha, evil laugh

In [45]:
# First check unweighted, just the structure

sorted(nx.betweenness_centrality(G_book1).items(), key=lambda x:x[1], reverse=True)[0:10]
Out[45]:
[('Eddard-Stark', 0.2696038913836117),
 ('Robert-Baratheon', 0.21403028397371796),
 ('Tyrion-Lannister', 0.1902124972697492),
 ('Jon-Snow', 0.17158135899829566),
 ('Catelyn-Stark', 0.1513952715347627),
 ('Daenerys-Targaryen', 0.08627015537511595),
 ('Robb-Stark', 0.07298399629664767),
 ('Drogo', 0.06481224290874964),
 ('Bran-Stark', 0.05579958811784442),
 ('Sansa-Stark', 0.03714483664326785)]
In [46]:
# Let's care about interactions now

sorted(nx.betweenness_centrality(G_book1, weight='weight').items(), key=lambda x:x[1], reverse=True)[0:10]
Out[46]:
[('Robert-Baratheon', 0.23341885664466297),
 ('Eddard-Stark', 0.18703429235687297),
 ('Tyrion-Lannister', 0.15311225972516293),
 ('Robb-Stark', 0.1024018949825402),
 ('Catelyn-Stark', 0.10169012330302643),
 ('Jon-Snow', 0.09027684366394043),
 ('Jaime-Lannister', 0.07745109164464009),
 ('Rodrik-Cassel', 0.07667992877670296),
 ('Drogo', 0.06894355184677767),
 ('Jorah-Mormont', 0.0627085149665795)]

PageRank

The billion dollar algorithm, PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.

In [47]:
# by default weight attribute in pagerank is weight, so we use weight=None to find the unweighted results
sorted(nx.pagerank_numpy(G_book1, weight=None).items(), key=lambda x:x[1], reverse=True)[0:10]
Out[47]:
[('Eddard-Stark', 0.045520792228306635),
 ('Tyrion-Lannister', 0.03301362462493269),
 ('Catelyn-Stark', 0.030193105286631952),
 ('Robert-Baratheon', 0.029834742227736726),
 ('Jon-Snow', 0.026834499522066232),
 ('Robb-Stark', 0.021562941297247527),
 ('Sansa-Stark', 0.020008034042864657),
 ('Bran-Stark', 0.019945786786238328),
 ('Jaime-Lannister', 0.01750784720284695),
 ('Cersei-Lannister', 0.017082604584758083)]
In [48]:
sorted(nx.pagerank_numpy(G_book1, weight='weight').items(), key=lambda x:x[1], reverse=True)[0:10]
Out[48]:
[('Eddard-Stark', 0.07239401100498237),
 ('Robert-Baratheon', 0.04851727570509938),
 ('Jon-Snow', 0.04770689062474909),
 ('Tyrion-Lannister', 0.04367437892706299),
 ('Catelyn-Stark', 0.03466703470130742),
 ('Bran-Stark', 0.029774200539800223),
 ('Robb-Stark', 0.029216183645196875),
 ('Daenerys-Targaryen', 0.02708962251302111),
 ('Sansa-Stark', 0.02696177891568311),
 ('Cersei-Lannister', 0.02163167939741891)]

Is there a correlation between these techniques?

Exercise

Find the correlation between these four techniques.

  • pagerank
  • betweenness_centrality
  • weighted_degree
  • degree centrality
In [49]:
cor = pd.DataFrame.from_records([nx.pagerank_numpy(G_book1, weight='weight'), nx.betweenness_centrality(G_book1, weight='weight'), weighted_degree(G_book1, 'weight'), nx.degree_centrality(G_book1)])
In [50]:
# cor.T
In [51]:
cor.T.corr()
Out[51]:
0 1 2 3
0 1.000000 0.870214 0.992166 0.949307
1 0.870214 1.000000 0.857222 0.871385
2 0.992166 0.857222 1.000000 0.955060
3 0.949307 0.871385 0.955060 1.000000

Evolution of importance of characters over the books

According to degree centrality the most important character in the first book is Eddard Stark but he is not even in the top 10 of the fifth book. The importance changes over the course of five books, because you know stuff happens ;)

Let's look at the evolution of degree centrality of a couple of characters like Eddard Stark, Jon Snow, Tyrion which showed up in the top 10 of degree centrality in first book.

We create a dataframe with character columns and index as books where every entry is the degree centrality of the character in that particular book and plot the evolution of degree centrality Eddard Stark, Jon Snow and Tyrion. We can see that the importance of Eddard Stark in the network dies off and with Jon Snow there is a drop in the fourth book but a sudden rise in the fifth book

In [52]:
evol = [nx.degree_centrality(book) for book in books]
evol_df = pd.DataFrame.from_records(evol).fillna(0)
evol_df[['Eddard-Stark', 'Tyrion-Lannister', 'Jon-Snow']].plot()
Out[52]:
<matplotlib.axes._subplots.AxesSubplot at 0x115951b00>
In [53]:
set_of_char = set()
for i in range(5):
    set_of_char |= set(list(evol_df.T[i].sort_values(ascending=False)[0:5].index))
set_of_char
Out[53]:
{'Arya-Stark',
 'Brienne-of-Tarth',
 'Catelyn-Stark',
 'Cersei-Lannister',
 'Daenerys-Targaryen',
 'Eddard-Stark',
 'Jaime-Lannister',
 'Joffrey-Baratheon',
 'Jon-Snow',
 'Margaery-Tyrell',
 'Robb-Stark',
 'Robert-Baratheon',
 'Sansa-Stark',
 'Stannis-Baratheon',
 'Theon-Greyjoy',
 'Tyrion-Lannister'}
Exercise

Plot the evolution of weighted degree centrality of the above mentioned characters over the 5 books, and repeat the same exercise for betweenness centrality.

In [54]:
evol_df[list(set_of_char)].plot(figsize=(29,15))
Out[54]:
<matplotlib.axes._subplots.AxesSubplot at 0x11741eb70>
In [55]:
evol = [nx.betweenness_centrality(graph, weight='weight') for graph in [G_book1, G_book2, G_book3, G_book4, G_book5]]
evol_df = pd.DataFrame.from_records(evol).fillna(0)

set_of_char = set()
for i in range(5):
    set_of_char |= set(list(evol_df.T[i].sort_values(ascending=False)[0:5].index))


evol_df[list(set_of_char)].plot(figsize=(19,10))
Out[55]:
<matplotlib.axes._subplots.AxesSubplot at 0x1159c0dd8>

So what's up with Stannis Baratheon?

In [56]:
nx.draw(nx.barbell_graph(5, 1), with_labels=True)
In [57]:
sorted(nx.degree_centrality(G_book5).items(), key=lambda x:x[1], reverse=True)[:5]
Out[57]:
[('Jon-Snow', 0.1962025316455696),
 ('Daenerys-Targaryen', 0.18354430379746836),
 ('Stannis-Baratheon', 0.14873417721518986),
 ('Tyrion-Lannister', 0.10443037974683544),
 ('Theon-Greyjoy', 0.10443037974683544)]
In [58]:
sorted(nx.betweenness_centrality(G_book5).items(), key=lambda x:x[1], reverse=True)[:5]
Out[58]:
[('Stannis-Baratheon', 0.45283060689247934),
 ('Daenerys-Targaryen', 0.2959459062106149),
 ('Jon-Snow', 0.24484873673158666),
 ('Tyrion-Lannister', 0.20961613179551256),
 ('Robert-Baratheon', 0.17716906651536968)]

Community detection in Networks

A network is said to have community structure if the nodes of the network can be easily grouped into (potentially overlapping) sets of nodes such that each set of nodes is densely connected internally.

We will use louvain community detection algorithm to find the modules in our graph.

In [59]:
plt.figure(figsize=(15, 15))

partition = community.best_partition(G_book1)
size = (len(set(partition.values())))
pos = nx.kamada_kawai_layout(G_book1)
count = 0
colors = [cm.jet(x) for x in np.linspace(0, 1, size)]
for com in set(partition.values()):
    list_nodes = [nodes for nodes in partition.keys()
                                if partition[nodes] == com]
    nx.draw_networkx_nodes(G_book1, pos, list_nodes, node_size = 100, node_color=colors[count])
    count = count + 1
nx.draw_networkx_edges(G_book1, pos, alpha=0.2)
plt.show()
In [60]:
d = {}
for character, par in partition.items():
    if par in d:
        d[par].append(character)
    else:
        d[par] = [character]
d
Out[60]:
{0: ['Addam-Marbrand',
  'Jaime-Lannister',
  'Tywin-Lannister',
  'Tyrion-Lannister',
  'Bronn',
  'Chiggen',
  'Marillion',
  'Shae',
  'Shagga',
  'Vardis-Egen',
  'Willis-Wode',
  'Colemon',
  'Chella',
  'Conn',
  'Coratt',
  'Dolf',
  'Gunthor-son-of-Gurn',
  'Harys-Swyft',
  'Kevan-Lannister',
  'Kurleket',
  'Leo-Lefford',
  'Mord',
  'Timett',
  'Ulf-son-of-Umar'],
 1: ['Aegon-I-Targaryen',
  'Daenerys-Targaryen',
  'Aggo',
  'Drogo',
  'Jhogo',
  'Jorah-Mormont',
  'Quaro',
  'Rakharo',
  'Cohollo',
  'Haggo',
  'Qotho',
  'Doreah',
  'Eroeh',
  'Illyrio-Mopatis',
  'Irri',
  'Jhiqui',
  'Mirri-Maz-Duur',
  'Viserys-Targaryen',
  'Jommo',
  'Ogo',
  'Rhaego',
  'Fogo'],
 2: ['Eddard-Stark',
  'Aerys-II-Targaryen',
  'Brandon-Stark',
  'Gerold-Hightower',
  'Jon-Arryn',
  'Robert-Baratheon',
  'Alyn',
  'Harwin',
  'Jory-Cassel',
  'Tomard',
  'Arthur-Dayne',
  'Cersei-Lannister',
  'Petyr-Baelish',
  'Vayon-Poole',
  'Arys-Oakheart',
  'Balon-Greyjoy',
  'Renly-Baratheon',
  'Barristan-Selmy',
  'Pycelle',
  'Varys',
  'Lyanna-Stark',
  'Cayn',
  'Janos-Slynt',
  'Stannis-Baratheon',
  'Rhaegar-Targaryen',
  'Gendry',
  'Howland-Reed',
  'Jacks',
  'Joss',
  'Porther',
  'Raymun-Darry',
  'Tobho-Mott',
  'Tregar',
  'Varly',
  'Wyl-(guard)',
  'Wylla',
  'Oswell-Whent',
  'Heward',
  'Hugh',
  'Lancel-Lannister'],
 3: ['Aemon-Targaryen-(Maester-Aemon)',
  'Alliser-Thorne',
  'Bowen-Marsh',
  'Chett',
  'Clydas',
  'Jeor-Mormont',
  'Jon-Snow',
  'Samwell-Tarly',
  'Albett',
  'Halder',
  'Rast',
  'Grenn',
  'Pypar',
  'Benjen-Stark',
  'Yoren',
  'Jaremy-Rykker',
  'Mance-Rayder',
  'Dareon',
  'Donal-Noye',
  'Dywen',
  'Todder',
  'Hobb',
  'Jafer-Flowers',
  'Matthar',
  'Othor',
  'Randyll-Tarly'],
 4: ['Arya-Stark',
  'Desmond',
  'Ilyn-Payne',
  'Jeyne-Poole',
  'Joffrey-Baratheon',
  'Meryn-Trant',
  'Mordane',
  'Mycah',
  'Myrcella-Baratheon',
  'Sandor-Clegane',
  'Sansa-Stark',
  'Syrio-Forel',
  'Tommen-Baratheon',
  'Balon-Swann',
  'Boros-Blount',
  'Beric-Dondarrion',
  'Gregor-Clegane',
  'Loras-Tyrell',
  'Thoros-of-Myr',
  'High-Septon-(fat_one)',
  'Marq-Piper',
  'Maegor-I-Targaryen'],
 5: ['Bran-Stark',
  'Catelyn-Stark',
  'Rickon-Stark',
  'Robb-Stark',
  'Rodrik-Cassel',
  'Luwin',
  'Theon-Greyjoy',
  'Hali',
  'Hallis-Mollen',
  'Hodor',
  'Hullen',
  'Joseth',
  'Nan',
  'Osha',
  'Rickard-Karstark',
  'Rickard-Stark',
  'Stiv',
  'Brynden-Tully',
  'Edmure-Tully',
  'Hoster-Tully',
  'Lysa-Arryn',
  'Nestor-Royce',
  'Walder-Frey',
  'Donnel-Waynwood',
  'Eon-Hunter',
  'Jon-Umber-(Greatjon)',
  'Masha-Heddle',
  'Moreo-Tumitis',
  'Mya-Stone',
  'Mychel-Redfort',
  'Robert-Arryn',
  'Stevron-Frey',
  'Tytos-Blackwood',
  'Wendel-Manderly',
  'Galbart-Glover',
  'Roose-Bolton',
  'Maege-Mormont',
  'Jonos-Bracken',
  'Lyn-Corbray'],
 6: ['Waymar-Royce', 'Gared', 'Will-(prologue)'],
 7: ['Clement-Piper', 'Karyl-Vance'],
 8: ['Danwell-Frey', 'Hosteen-Frey', 'Jared-Frey'],
 9: ['Daryn-Hornwood', 'Torrhen-Karstark'],
 10: ['Jyck', 'Morrec'],
 11: ['Mace-Tyrell', 'Paxter-Redwyne']}
In [61]:
nx.draw(nx.subgraph(G_book1, d[3]))
In [62]:
nx.draw(nx.subgraph(G_book1, d[1]))
In [63]:
nx.density(G_book1)
Out[63]:
0.03933068828704502
In [64]:
nx.density(nx.subgraph(G_book1, d[4]))
Out[64]:
0.22943722943722944
In [65]:
nx.density(nx.subgraph(G_book1, d[4]))/nx.density(G_book1)
Out[65]:
5.833542188805347

Exercise

Find the most important node in the partitions according to degree centrality of the nodes.

In [66]:
max_d = {}
deg_book1 = nx.degree_centrality(G_book1)

for group in d:
    temp = 0
    for character in d[group]:
        if deg_book1[character] > temp:
            max_d[group] = character
            temp = deg_book1[character]
In [67]:
max_d
Out[67]:
{0: 'Tyrion-Lannister',
 1: 'Daenerys-Targaryen',
 2: 'Eddard-Stark',
 3: 'Jon-Snow',
 4: 'Sansa-Stark',
 5: 'Catelyn-Stark',
 6: 'Waymar-Royce',
 7: 'Karyl-Vance',
 8: 'Danwell-Frey',
 9: 'Torrhen-Karstark',
 10: 'Jyck',
 11: 'Mace-Tyrell'}

A bit about power law in networks

In [68]:
G_random = nx.erdos_renyi_graph(100, 0.1)
In [69]:
nx.draw(G_random)
In [70]:
G_ba = nx.barabasi_albert_graph(100, 2)
In [71]:
nx.draw(G_ba)
In [72]:
# Plot a histogram of degree centrality
plt.hist(list(nx.degree_centrality(G_random).values()))
plt.show()
In [73]:
plt.hist(list(nx.degree_centrality(G_ba).values()))
plt.show()
In [74]:
G_random = nx.erdos_renyi_graph(2000, 0.2)
G_ba = nx.barabasi_albert_graph(2000, 20)
In [75]:
d = {}
for i, j in dict(nx.degree(G_random)).items():
    if j in d:
        d[j] += 1
    else:
        d[j] = 1
x = np.log2(list((d.keys())))
y = np.log2(list(d.values()))
plt.scatter(x, y, alpha=0.9)
plt.show()
In [76]:
d = {}
for i, j in dict(nx.degree(G_ba)).items():
    if j in d:
        d[j] += 1
    else:
        d[j] = 1
x = np.log2(list((d.keys())))
y = np.log2(list(d.values()))
plt.scatter(x, y, alpha=0.9)
plt.show()