# Distance in Social Networks

## 1 Introduction

When characterizing a graph one of the things to look at is how far apart the nodes are.

# from pypi import networkx

This will be the example network.

left = tuple("AAKBCCFEDEIE") right = tuple("KBBCFEGFEIJH") graph = networkx.Graph() graph.add_edges_from(list(zip(left, right))) networkx.draw(graph, with_labels=True)

## 2 Defining Distance

This section will look at how we can measure the distance between nodes.

### 2.1 Paths

A path is a sequence of nodes connected by edges. One path from D to K might be D-E-C-B-K.

left = tuple('DECB') right = tuple("ECBK") path = networkx.Graph() path.add_edges_from(list(zip(left, right))) networkx.draw(path, with_labels=True)

### 2.2 Distance

- The
*length*of a*path*is the number of edges in it. - The
*distance*between two nodes is the length of the*shortest*path between them.

dk_shortest_path = networkx.shortest_path(graph, "D", "K") print(dk_shortest_path)

['D', 'E', 'C', 'B', 'K']

length = networkx.shortest_path_length(graph, "D", "K") print(length) assert len(dk_shortest_path) - 1 == networkx.shortest_path_length(graph, "D", "K")

4

As you can see the path we saw earlier is the shortest path and the distance from D to K is 4.

### 2.3 Breadth-First Search

One way to compute the distances from one node to all the other nodes is to create a tree using Breadth-First-search. Breadth-First search will eliminate any cycles and leave us with the shortest paths to each node.

This is the tree created for the node A.

tree = networkx.bfs_tree(graph, "A") positions = networkx.drawing.nx_agraph.graphviz_layout(tree, prog="dot") networkx.draw(tree, positions, with_labels=True)

print(networkx.shortest_path_length(graph, "A"))

{'J': 5, 'D': 4, 'H': 4, 'F': 3, 'K': 1, 'A': 0, 'B': 1, 'C': 2, 'E': 3, 'G': 4, 'I': 4}

Looking at the shortest path-lengths to *A*, you can see that *J* is is the furthest away, with 5 edges separating them, while *B* and *K* are the closest with only 1 hop.

## 3 Graph Distance

This looks at how you can answer questions about the graph as a whole.

### 3.1 Average Distance

One measure is the average of the distances between ever pair of nodes.

print(networkx.average_shortest_path_length(graph))

2.5272727272727273

The average distance for our example is around two and a half edges.

### 3.2 Diameter

The *diameter* of a graph is the maximum distance between any of the pairs of nodes. Note that *distance* is always the shortest path between nodes, so this isn't the longest path in the graph.

print(networkx.diameter(graph))

5

The greatest distance is 5 hops in our example.

### 3.3 Eccentricity

This is the largest distance between a node and all the other nodes.

print(networkx.eccentricity(graph))

{'J': 5, 'D': 4, 'H': 4, 'F': 3, 'K': 5, 'A': 5, 'B': 4, 'C': 3, 'E': 3, 'G': 4, 'I': 4}

Looking at the output we can see that A, J, and K all have eccentricities matching the diameter. According to the Online Etymology Dictionary, *eccentric* means an orbiting object that doesn't have the earth at the center of its orbit. More literally, it means out of center (or off center).

### 3.4 Radius

The radius is the minimum eccentricity in a graph.

print(networkx.radius(graph))

3

So the *radius* is the smallest of the largest distances for all the nodes.

### 3.5 Periphery

This is the set of nodes whose *eccentricity* is equal to the *diameter* (5 in our case).

print(networkx.periphery(graph))

['J', 'K', 'A']

Looking at the output and the graph, the diameter of the graph is the distance from A to J or K to J.

### 3.6 Center

This is the set of nodes whose *eccentricity* is equal to the *radius* of the graph (3 in this example).

print(networkx.center(graph))

['F', 'C', 'E']

positions = networkx.drawing.nx_agraph.graphviz_layout(graph, prog="dot") networkx.draw(graph, positions, with_labels=True)

Looking at the graph, you can see that F, C, and, E do in fact form the center triangle.

## 4 Karate Club

This looks at the network created by the relationships between members of a karate club that is on the verge of splitting up. Each node is a member of the club and the edges represent that the incident edges interacted with each other outside of the club (and were thus assumed to be friends). Members who didn't interact with each other outside of the club aren't represented in the data set.

The instructor wanted to raise fees while the officers didn't. Eventually the instructor was fired and his supporters left with him.

karate = networkx.karate_club_graph() networkx.draw(karate, with_labels=True)

networkx.draw_circular(karate, with_labels=True)

You can see that there are some central characters in the club, notably 0, 32, and 33.

degrees = ((node, karate.degree(node)) for node in karate.nodes()) degrees = ((node, degree) for node, degree in degrees if degree > 10) print("Node\tDegree") for node, degree in degrees: print("{}\t{}".format(node, degree))

Node | Degree |
---|---|

0 | 16 |

32 | 12 |

33 | 17 |

The cut-off of 10 degrees was somewhat arbitrary, there are two nodes with degrees 9 and 10 respectively, but you can see that these three nodes were the most connected members of the club.

### 4.1 What is the average distance?

print(networkx.average_shortest_path_length(karate))

2.4

The path lengths are relatively short, on average.

### 4.3 Eccentricity

print(networkx.eccentricity(karate))

{0: 3, 1: 3, 2: 3, 3: 3, 4: 4, 5: 4, 6: 4, 7: 4, 8: 3, 9: 4, 10: 4, 11: 4, 12: 4, 13: 3, 14: 5, 15: 5, 16: 5, 17: 4, 18: 5, 19: 3, 20: 5, 21: 4, 22: 5, 23: 5, 24: 4, 25: 4, 26: 5, 27: 4, 28: 4, 29: 5, 30: 4, 31: 3, 32: 4, 33: 4}