The only guide you need to understand the NetworkX package in Python.

The NetworkX package is one of the most comprehensive Python packages for visualizing and extracting information from large networks such as train routes, voting patterns and the like. In this guide we will lay out the essential code that you can use to analyze complex networks and visualize them at the same time!

Let’s start by importing the packages that you will need for this:
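A minimal set of imports for the examples in this guide might look like the following. Using pandas and matplotlib alongside NetworkX is an assumption, based on the data frame and plotting steps that come later.

```python
import matplotlib.pyplot as plt  # assumed: used to display the drawings
import networkx as nx            # the package this guide is about
import pandas as pd              # assumed: used in the data frame section
```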

We are now going to build a simple network from scratch using the code shown below:

The output of the code above is shown below: 

Let’s make sense of the code we just wrote. First we create a ‘graph’ in order to store our network. We then clear the graph to ensure that it’s empty. Then we proceed to add ‘edges’.

What are edges? Edges are the lines that connect two nodes together. What are nodes? In the graph above, ‘Temple’ and ‘Embankment’ are nodes, and the line that connects ‘Temple’ and ‘Embankment’ is the edge.

We then draw this network using the nx.draw_random() function, which places the nodes at random positions. The function takes the graph, ‘tube1’, as its main argument. We can then pass a set of optional arguments to customize the network in the way that we want.
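The steps described above can be sketched roughly as follows. Only ‘tube1’, ‘Temple’ and ‘Embankment’ come from the text; the other stations and the styling arguments are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import networkx as nx

# Create a graph to store the network, then make sure it is empty.
tube1 = nx.Graph()
tube1.clear()  # redundant for a new graph, but handy when re-running a notebook cell

# Add edges; the stations at either end are created as nodes automatically.
tube1.add_edge("Temple", "Embankment")
tube1.add_edge("Embankment", "Westminster")     # illustrative station
tube1.add_edge("Embankment", "Charing Cross")   # illustrative station

# Draw the graph with nodes at random positions (styling values are assumptions).
nx.draw_random(
    tube1,
    with_labels=True,
    node_color="lightblue",
    node_size=1500,
    font_size=8,
)
plt.show()
```

Because the layout is random, the picture changes every time the cell is run, while the nodes and edges stay the same.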

Let’s now create a circular form of the network shown above and add colors to the edges so that the network is easier to read:

The output of the code is shown below: 

The code above simply uses the nx.circular_layout() function on the graph to compute node positions arranged on a circle. We then store all the edges of the network in the variable ‘edges’ and collect the colors we defined earlier for each edge using a list comprehension.

We then use the nx.draw() function with the original network and the circular layout as the main arguments, plus a few other arguments to customize how the network looks.
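One way to sketch the circular, colored version is shown below. Storing the color as an edge attribute named ‘color’ is an assumption about how the colors were defined, and the stations and color values are illustrative.

```python
import matplotlib.pyplot as plt
import networkx as nx

tube1 = nx.Graph()
# Attach a color to each edge as an attribute (values are illustrative).
tube1.add_edge("Temple", "Embankment", color="green")
tube1.add_edge("Embankment", "Westminster", color="green")
tube1.add_edge("Embankment", "Charing Cross", color="brown")

# circular_layout returns a dict mapping each node to an (x, y) position on a circle.
pos = nx.circular_layout(tube1)

# Collect the edges, then look up the color of each one with a list comprehension.
edges = list(tube1.edges())
colors = [tube1[u][v]["color"] for u, v in edges]

# Pass the graph and the positions first, then the styling arguments.
nx.draw(tube1, pos, edgelist=edges, edge_color=colors, with_labels=True, width=2)
plt.show()
```

Passing the same ‘edges’ list as edgelist keeps the colors lined up with the edges they belong to.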

We will now learn how to build and extract useful information from networks using data frames. For the purpose of this tutorial we are going to use the London Underground’s zone 1 data that can be found here – zone1

Let’s first read the data frame into a Jupyter Notebook and create a network from it:

We can then compute information about the network like the number of nodes and edges that it has:

Let’s take a look at what the data frame that we are working with looks like:

We are going to use the station_name1 and station_name2 columns as our nodes and connect the two using edges. The ‘line’ column tells us which line the two stations are on. For this exercise the other columns are not of significance.
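As a rough sketch of this step, the snippet below builds a network from a data frame and counts its nodes and edges. The column names come from the text above; the rows are a small hand-written stand-in for the zone 1 data, and the line numbers are made up for illustration.

```python
import networkx as nx
import pandas as pd

# Stand-in for the zone 1 data: a few hand-written rows, not the real file.
zone1 = pd.DataFrame(
    {
        "station_name1": ["Baker Street", "Great Portland Street", "Euston Square"],
        "station_name2": ["Great Portland Street", "Euston Square", "King's Cross St. Pancras"],
        "line": [3, 3, 3],  # illustrative line numbers
    }
)
print(zone1.head())

# Each row becomes one edge; the 'line' column is kept as an edge attribute.
tube = nx.from_pandas_edgelist(
    zone1, source="station_name1", target="station_name2", edge_attr="line"
)

print(tube.number_of_nodes(), "stations")
print(tube.number_of_edges(), "connections")
```

With the real data you would load your copy of the file with pd.read_csv() instead of typing the rows by hand; the rest of the code stays the same.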

Let’s now visualize this network:

The network is displayed below: 

Let’s say that Line 7 has shut down due to an accident on the line. Let’s now find the shortest route from Baker Street to Temple station without using that line. We can do this using the code shown below:

The output of the code is shown below:

This shows us that the shortest route passes through 5 stations between Baker Street and Temple.
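A minimal sketch of this kind of detour search is given below. It assumes the closed line can be dropped by filtering the ‘line’ column before building the graph. The rows and line numbers are a hand-written stand-in, not the real zone 1 data, so the route it prints is only illustrative.

```python
import networkx as nx
import pandas as pd

# Hand-written stand-in rows; stations are real but the line numbers are illustrative.
rows = [
    ("Baker Street", "Bond Street", 7),
    ("Bond Street", "Green Park", 7),
    ("Green Park", "Westminster", 7),
    ("Westminster", "Embankment", 4),
    ("Embankment", "Temple", 4),
    ("Baker Street", "Regent's Park", 1),
    ("Regent's Park", "Oxford Circus", 1),
    ("Oxford Circus", "Piccadilly Circus", 1),
    ("Piccadilly Circus", "Charing Cross", 1),
    ("Charing Cross", "Embankment", 1),
]
zone1 = pd.DataFrame(rows, columns=["station_name1", "station_name2", "line"])

# Drop every connection that runs on the closed line, then build the graph.
open_lines = zone1[zone1["line"] != 7]
tube = nx.from_pandas_edgelist(
    open_lines, source="station_name1", target="station_name2", edge_attr="line"
)

# With no weight given, shortest_path minimizes the number of edges (stops).
route = nx.shortest_path(tube, source="Baker Street", target="Temple")
print(route)
print(len(route) - 2, "stations in between")
```

If either station is missing from the filtered graph, or no route is left, NetworkX raises an exception, so it is worth checking the result before relying on it.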

A clique in a network is a group of nodes in which every node is directly connected to every other node in the group. In our example, a clique is formed by stations that are all directly linked to each other. We can find the cliques in our network using the code shown below:

The output of the code is shown below: 

This shows us that Euston Square is linked to King’s Cross station and is also linked to Great Portland Street.
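The clique search can be sketched as below, assuming nx.find_cliques() is the function used. The three stations come from the result described above; the tiny graph is a stand-in for the full network.

```python
import networkx as nx

# Tiny stand-in graph containing only the stations mentioned above.
tube = nx.Graph()
tube.add_edge("Euston Square", "King's Cross St. Pancras")
tube.add_edge("Euston Square", "Great Portland Street")

# find_cliques yields the maximal cliques, each as a list of nodes.
cliques = list(nx.find_cliques(tube))
for clique in cliques:
    print(clique)
```

Because the function returns maximal cliques only, a chain of stations like this one produces one two-station clique per connection. A larger clique would appear only where three or more stations are all directly linked to one another.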

Congratulations! You have now built yourself a strong foundation in creating, visualizing and extracting information from networks. To understand the complete power of the NetworkX package in Python, read the documentation given here – NetworkX Documentation.

Happy Network building!