Splunk Has Just Levelled Up In Geospatial Visualisation
Splunk's annual user conference, .conf2015, was awash with software releases and product launches. One of the more exciting visualisation features introduced in Splunk 6.3 is the choropleth map. What is a choropleth map? A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per-capita income (thanks Wikipedia!). Such a visualisation introduces a whole different way to analyse and present findings in Splunk. I thought that a great way to get a handle on this new capability was to roll my sleeves up and create a few geospatial visualisations.
A region or area can be represented on a map through the use of polygons. What is a polygon? A polygon is a collection of points that together outline a two-dimensional surface. A choropleth visualisation is all about rendering one or more polygons on a map.
In Splunk, the polygons to render on a map are determined through the use of a geospatial lookup, a feature that was released alongside the choropleth map in Splunk 6.3. Geospatial lookups are very powerful and do a range of things, including resolving coordinates into the polygons that contain them. Our usage is going to be a little simpler - we are going to take a country name defined within a dataset and match it against a featureId within the geospatial lookup as a means of retrieving the polygons for rendering regions on a map.
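For reference, the coordinate-resolving use of a geospatial lookup can be sketched as below. This is a minimal sketch rather than anything used later in this post: the field names lat, lon and country are made up for illustration, and the coordinates are a point well inside Australia, so the lookup would be expected to return the matching country feature.

```spl
| stats count
| eval lat=-23.70, lon=133.88
| lookup geo_countries latitude AS lat longitude AS lon OUTPUT featureId AS country
```

The lookup takes a latitude and longitude as input and outputs the featureId of the polygon containing that point, which is the reverse of what we do in the rest of this post.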
Here's a super simple example. Say we wanted to render the polygons representing the country of Australia. Luckily, Splunk comes packaged with two geospatial lookups out-of-the-box - countries of the world (geo_countries) and the states of the US (geo_us_states). We will use the countries of the world geospatial lookup as follows:
| stats count | eval countryname="Australia" | geom geo_countries featureIdField="countryname"
You get the following output:
Note that the geom field has been appended to the event by the geospatial lookup. This field contains the coordinates that define the polygons needed to render the shape of Australia as follows:
Example 1: Sequential Geospatial Visualisations - GDP Per Capita
So let's start with something relatively straightforward. We are going to use Splunk's new lookup type, the geospatial lookup, to visualise GDP per capita of the countries of the world. As discussed above, Splunk comes packaged with two geospatial lookups out-of-the-box - countries of the world (geo_countries) and the states of the US (geo_us_states). We will use the countries of the world geospatial lookup combined with World Bank GDP per capita data obtained from data.worldbank.org. These stats were downloaded in CSV format. This CSV file contains the following attributes - country name, associated three-letter country code (e.g. AUS represents Australia), and GDP per capita figure. This CSV file is going to be used as an input into the geospatial lookup with a search as follows:
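A minimal sketch of what such a search could look like is shown here. It assumes the World Bank CSV has been uploaded as a lookup file called gdp_per_capita.csv and that its columns are named country_name, country_code and gdp_per_capita. The file name and all three field names are hypothetical, so substitute whatever your own file uses.

```spl
| inputlookup gdp_per_capita.csv
| fields country_name, gdp_per_capita
| geom geo_countries featureIdField=country_name
```

The geom command matches on the value of country_name, so any country whose name in the CSV differs from the featureId in geo_countries will not pick up a polygon. Running | inputlookup geo_countries should list the featureId values that the lookup expects, which makes mismatches easier to track down.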
The end result is the following choropleth.
Note that a lot of the countries are shaded in white, owing to the fact that over 80% of countries in the world have a GDP per capita of US$20,000 or less.
Example 2: Categorised Geospatial Visualisation Using A Custom Feature Collection
Our first example showed a choropleth using one of Splunk's built-in geospatial lookups. However, we may be interested in regions other than countries or US states. It is definitely possible to create your own geospatial lookup, and this is exactly what I did for this example. In this example, I am going to visualise Australian federal electoral boundaries categorised by political party. To do this, I needed two data sources:
- Electoral boundary geospatial data. I retrieved this from the Australian Electoral Commission website, www.aec.gov.au. Note that the AEC provides geospatial data in shapefile (SHP) format, whereas Splunk requires geospatial data to be in KML format. This meant I had to convert from SHP to KML. I will not go through the specific steps here, as the excellent blog post by Splunker Michael Porath discusses this conversion process in detail. All I will say is that after the conversion process, we have a KMZ (compressed KML) file called AU_electorates.kmz, which underpins a geospatial lookup definition called au_electorates.
- Sitting parliamentary member data. This was downloaded from aph.gov.au as a CSV file. This contains details of each member of parliament. The two fields of data that we are interested in are party and electorate. This CSV file is going to be used as an input into the search as follows:
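A minimal sketch of a search along these lines is shown here. It assumes the member CSV has been uploaded as a lookup file called members.csv (a hypothetical name) and that its column headers are spelled exactly electorate and party. The au_electorates lookup is the one defined from AU_electorates.kmz above.

```spl
| inputlookup members.csv
| fields electorate, party
| geom au_electorates featureIdField=electorate
```

Because party is a string rather than a number, the choropleth needs to colour regions by category instead of on a sequential scale. As with the country example, the electorate names in the CSV have to match the feature names in the KMZ file for a polygon to be returned.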
The above search results in the following visualisation:
Example 3: Visualisation Of The Impending Zombie Apocalypse!!!
Splunk's new geospatial capabilities are a real enabler, allowing us to represent data and scenarios in a whole bunch of innovative and useful ways. However, the greatest thing about it is that now I can use Splunk to visualise the global impact of the impending Zombie Apocalypse (it's not a matter of if, just when!!). Here it is:
Oh yeah, the above analysis may contain a little bias...