The fundamental guide to geographic analysis using Python!

Analyzing data by location can surface insights that are valuable to almost any business.

For instance, a bank might want to know which districts or states of a country have the highest number of customers defaulting on their loans, and then take action based on these findings.

In this guide we explore how to use the ‘geopy’ package in Python to extract information about geographic locations.

In the first part of this guide we use the ‘geopy’ package to print out all possible addresses that match a location we give it. We can do this using the code shown below:
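A minimal sketch of this step, assuming geopy is installed and the public Nominatim service is reachable. The helper names `list_addresses` and `main` and the `user_agent` string are placeholders chosen for this example, not part of geopy.

```python
def list_addresses(geolocator, query):
    """Return the address of every match the geocoder finds for `query`."""
    # exactly_one=False asks for a list of matches rather than a single result.
    # geocode() returns None when nothing matches, hence the `or []`.
    matches = geolocator.geocode(query, exactly_one=False) or []
    return [match.address for match in matches]


def main():
    # Imported here so list_addresses() can be reused with any geocoder object.
    from geopy.geocoders import Nominatim

    # Nominatim expects a user_agent that identifies your application;
    # "geo-guide-example" is a placeholder.
    geolocator = Nominatim(user_agent="geo-guide-example")
    for address in list_addresses(geolocator, "Westminster, London"):
        print(address)
```

Calling `main()` from a session with network access prints one address per line. The exact list depends on what the service returns at the time.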

The output of the code above is illustrated below: 

In the code above we first import the required packages and create a Nominatim() object, which is a geocoder client for OpenStreetMap’s Nominatim service. We then call the geolocator.geocode() function with a location of our choice as its argument. We set the exactly_one argument to False so that the function returns every match rather than only the first one. Finally, we use a for loop to print out all possible addresses for “Westminster, London”.

To extract the latitude and longitude coordinates of the first address and store them in ‘my_coords’, we use the code shown below:
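One way this could look, as a sketch: geopy Location objects expose `latitude` and `longitude` attributes, and the helper name `first_coords` is our own.

```python
def first_coords(matches):
    """Return (latitude, longitude) for the first geocoding match, or None."""
    if not matches:
        return None
    first = matches[0]
    # geopy Location objects expose latitude and longitude as float attributes.
    return (first.latitude, first.longitude)


# Example (needs network access and a geopy geocoder such as Nominatim):
#   matches = geolocator.geocode("Westminster, London", exactly_one=False)
#   my_coords = first_coords(matches)
#   print(my_coords)
```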

Let’s assume we have a random location X. We want to find the distance between location X and ‘my_coords’. We can do this using the code shown below:
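With geopy installed, `geopy.distance.geodesic(location_x, my_coords).miles` returns this distance directly. The sketch below instead spells out the simpler great-circle (haversine) calculation using only the standard library, so the arithmetic is visible. The coordinates are illustrative stand-ins, not values from this guide, and the result will differ slightly from geopy's ellipsoidal figure.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, approximately


def miles_between(coords_a, coords_b):
    """Great-circle distance in miles between two (latitude, longitude) pairs."""
    lat_a, lon_a = map(radians, coords_a)
    lat_b, lon_b = map(radians, coords_b)
    # Haversine formula for the central angle between the two points.
    h = (sin((lat_b - lat_a) / 2) ** 2
         + cos(lat_a) * cos(lat_b) * sin((lon_b - lon_a) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


# Illustrative coordinates only: roughly central London and central Paris.
my_coords = (51.5074, -0.1278)
location_x = (48.8566, 2.3522)
print(f"{miles_between(location_x, my_coords):.1f} miles")
```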

The code above results in an output as shown below: 

Shapefiles are geographic files that contain information about a location such as the USA or London. They can be found all over the internet and are usually available for free. They can be used to plot geographic data and further enhance our data manipulation capabilities. Let’s take a look at what the shapefile of the city of London looks like:
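A minimal sketch for loading and previewing a shapefile, assuming the geopandas package is used for this step. The path `london.shp` is a placeholder for wherever your file lives, and the helper names are our own.

```python
def load_shapefile(path):
    """Read a shapefile into a GeoDataFrame, one row per shape."""
    # Local import keeps this helper importable before geopandas is installed.
    import geopandas as gpd

    return gpd.read_file(path)


def preview(gdf, rows=5):
    """Print the coordinate reference system and the first few rows."""
    print("CRS:", gdf.crs)
    print(gdf.head(rows))


# Example ("london.shp" is a placeholder path):
#   london = load_shapefile("london.shp")
#   preview(london)
```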

The resulting shape file is illustrated below:

From the shapefile shown above we can see that it contains the location name, its GSS_CODE, the area of the location in hectares, the ‘Borough’ (district) it is located in, and its geometry. The geometry in this shapefile is a ‘POLYGON’, which holds the coordinates of the boundary of that particular location.

We can now plot the entire map of London from the shapefile above using the code shown below:
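As a sketch, assuming the shapefile has been read into a geopandas GeoDataFrame: its `plot()` method draws every shape with matplotlib. The helper name `plot_map` and the styling values are our own choices.

```python
def plot_map(gdf, figsize=(10, 8)):
    """Draw every shape in a GeoDataFrame and return the matplotlib Axes."""
    # GeoDataFrame.plot() draws the geometry column with matplotlib.
    ax = gdf.plot(figsize=figsize, color="lightgrey", edgecolor="black")
    ax.set_axis_off()  # hide the coordinate ticks for a cleaner map
    return ax


# Example, with `london` being the GeoDataFrame read from the shapefile:
#   import matplotlib.pyplot as plt
#   plot_map(london)
#   plt.show()
```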

This results in a plot as illustrated below: 

We can change the styling of the plot using the “cmap” argument as shown below:
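A sketch of the same plot with a colormap, again assuming a geopandas GeoDataFrame. Any colormap name that matplotlib recognises can be passed; "viridis" is only an example. When no column is given, geopandas should give each shape its own colour from the map.

```python
def plot_map_with_cmap(gdf, cmap="viridis", figsize=(10, 8)):
    """Plot a GeoDataFrame using a named matplotlib colormap."""
    # Other matplotlib colormap names work too, e.g. "plasma" or "Blues".
    return gdf.plot(cmap=cmap, figsize=figsize, edgecolor="white")


# Example, with `london` being the GeoDataFrame read from the shapefile:
#   import matplotlib.pyplot as plt
#   plot_map_with_cmap(london, cmap="plasma")
#   plt.show()
```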

This results in a differently styled plot as illustrated below: 

We are now going to mark the “location” that we created earlier, which holds the latitude and longitude coordinates, on the map created above. We can do this using the code shown below:
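A sketch of these steps, assuming geopandas and shapely are installed and that the map's CRS is a projected one measured in metres (otherwise the radius conversion does not apply). The helper names are our own. Note that the zoom shows a square window extending the given radius either side of the point, rather than a true circle.

```python
METRES_PER_MILE = 1609.344


def miles_to_metres(miles):
    """Convert a distance in miles to metres."""
    return miles * METRES_PER_MILE


def make_point_frame(latitude, longitude):
    """Wrap one latitude/longitude pair in a single-row GeoDataFrame."""
    import geopandas as gpd
    from shapely.geometry import Point

    # shapely points are (x, y), so longitude comes first.
    return gpd.GeoDataFrame(
        {"name": ["my location"]},
        geometry=[Point(longitude, latitude)],
        crs="EPSG:4326",  # WGS84, the latitude/longitude system geopy returns
    )


def plot_point_on_map(area, my_loc, radius_miles=5):
    """Plot `area`, mark the point in `my_loc`, and zoom to a window around it."""
    # Reproject the point into the CRS used by the map.
    my_loc = my_loc.to_crs(area.crs)
    point = my_loc.geometry.iloc[0]
    # Assumes the map CRS measures distance in metres.
    radius = miles_to_metres(radius_miles)

    ax = area.plot(figsize=(10, 8), color="lightgrey", edgecolor="black")
    my_loc.plot(ax=ax, marker="*", color="black", markersize=200)
    ax.set_xlim(point.x - radius, point.x + radius)
    ax.set_ylim(point.y - radius, point.y + radius)
    return ax


# Example, with `london` read from the shapefile and `my_coords` from geopy:
#   my_loc = make_point_frame(*my_coords)
#   plot_point_on_map(london, my_loc, radius_miles=5)
```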

In the code above we create a GeoDataFrame for the location information. One of the arguments passed to GeoDataFrame() is a Point() object built from the location’s coordinates, which marks the location on the map.

We then convert the GeoDataFrame ‘my_loc’ to the Coordinate Reference System (CRS) used by the map, so that the point and the map share the same coordinate system.

Finally, we plot the location on the map along with everything in a 5 mile radius around it, as illustrated below:

The black star marker shows us the location of the point that we are interested in.

How can we now apply the ‘geopy’ package to a real world scenario? Let’s plot all the points in London where a bike is stationed for the general public to use. We can do this using the code shown below:
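The source of the bike data is not specified here, so the sketch below makes an assumption: each bike point arrives as a record (a dict) with hypothetical `lat` and `lon` keys in WGS84 degrees. It also assumes geopandas is installed and that `area` is the London GeoDataFrame.

```python
def split_coords(records):
    """Return (longitudes, latitudes) lists from records with 'lat'/'lon' keys."""
    lons = [float(r["lon"]) for r in records]
    lats = [float(r["lat"]) for r in records]
    return lons, lats


def plot_bike_points(area, records):
    """Plot `area` and overlay one marker per bike point record."""
    import geopandas as gpd

    records = list(records)
    lons, lats = split_coords(records)
    # Build point geometries in WGS84, then reproject to the map's CRS.
    points = gpd.GeoDataFrame(
        geometry=gpd.points_from_xy(lons, lats), crs="EPSG:4326"
    ).to_crs(area.crs)

    ax = area.plot(figsize=(10, 8), color="lightgrey", edgecolor="white")
    points.plot(ax=ax, color="red", markersize=5)
    return ax


# Example with made-up records (hypothetical structure):
#   bike_points = [{"lat": 51.53, "lon": -0.11}, {"lat": 51.50, "lon": -0.19}]
#   plot_bike_points(london, bike_points)
```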

The resulting plot containing all the bike points is illustrated below: 

This guide covers the fundamentals that you will need to know in order to work with spatial and geographic data.

Happy Spatial and Geographic Analysis!
