Stats, Maps n Pix: January 2019

Sunday, 20 January 2019

A little post about a geogif

This is a short post about map-based animated gifs (i.e. the humble geogif). It's not about general elections in the UK because that would not be a sensible topic. It's a bit about design choices, just in case anyone is interested, and it's also a bit about the method and data. Here's the final gif, below, which I posted on twitter recently. I wanted to make something that told the story of the 2017 General Election in a short time frame - in this case 30 seconds.

This is about 4.7MB

In the original version of this, the large font was too thin so in this version it's bold and easier to read. You could probably tell what it shows and how long it takes to show it, but with this text there is no doubt. This stems from the redundancy principle that you'll hear John Burn-Murdoch talk about in relation to his work with the FT. I've used Raleway as the font, though I think perhaps I should have chosen something slightly cleaner. I'm not that keen on the numbers with Raleway, though it is a lovely font. By the way, this was all done with QGIS Atlas and only one map layer. See bottom of page for the sources.

The constituency names are displayed incredibly quickly below the main text (for 46 milliseconds each). Not because I want them to be read; more for effect so that people can pick out names they know as it runs through. Of course, it's also so that when you pause it you can see which constituency we're up to (more relevant in other formats). Then when the name of a constituency appears, the corresponding hexagon to the right has a black border to it. Not thick enough here probably, but you should be able to see it.

Since the results file from the House of Commons Library is so detailed, I was able to add the time each result was declared, which I did as a digital clock. Again, adding a bit of redundancy in making the times work like a clock, just in an attempt to speed up cognition. Below the clock the number shows the order of the result (from 1 to 650) and whether the result was a hold or a gain from another party. I added a green banner to the top left to say when the General Election was. I did use the numeric format for this in an earlier version, but there are two problems with this: i) I think it's more difficult to process compared to seeing the word 'June' in there; and ii) we can't have our US friends thinking the election was on August 6th. I like to avoid doubt in such situations. That's why I moved house on the 8th of August 😉.

I also wanted to add a chart because you can't tell just by looking at the map who actually wins the most seats. That's why in the final version I added the animated bar chart in the top right (previously it was a boring, non-animated one). This was also done using QGIS Atlas. I think this helps tell the story of the night, as it's often the case that in more urban, tightly-packed constituencies (where Labour do well) the counts are quicker and they race off to an early lead. It also gives it a bit of a jostling-for-position, race feel, which I like.

At 03:12 Labour are in the lead

According to the data, it wasn't until Daventry, the 498th declared result, that the Conservatives gained the lead from Labour. This happened at 04:09.
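The running tally behind a chart like this is simple enough to sketch in Python. This is only an illustration of the logic, not how the gif was built: the function name lead_changes and the toy sequence of winners are made up, and the real declaration order comes from the House of Commons Library file.

```python
from collections import Counter


def lead_changes(declarations):
    """Return (position, party) each time a new party takes an outright lead.

    `declarations` is a list of winning parties in declaration order.
    Ties leave the previous leader in place.
    """
    seats = Counter()
    leader = None
    changes = []
    for position, winner in enumerate(declarations, start=1):
        seats[winner] += 1
        top = max(seats.values())
        leaders = [party for party, count in seats.items() if count == top]
        if len(leaders) == 1 and leaders[0] != leader:
            leader = leaders[0]
            changes.append((position, leader))
    return changes


# Toy, made-up sequence of winners (NOT the real 2017 results)
toy_declarations = ["Lab", "Lab", "Con", "Lab", "Con", "Con", "Con"]
changes = lead_changes(toy_declarations)
print(changes)  # [(1, 'Lab'), (7, 'Con')]
```

Run against the real results in declaration order, the last entry in the list would be the point at which the final lead change happened.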

Celebration or desolation, depending upon your allegiance (possibly)

"Hey, why did you lump my favourite political party into the 'other' category?"

Fair comment. You're right, it's not very nice, but I didn't want to add ten bars so this is what we have. Plus, I thought it wouldn't be too much of a stretch since the lower numbers of the individual groups within the 'Other' category mean they can easily be counted.

I added the numbers to the bars for completeness, and because it wasn't that difficult, and of course the party labels below the bars.

All I wanted to do here was tell the story of the 2017 General Election in a 30-second gif, deliver a lot of information in doing it and have a bit of fun. The idea is that you watch it multiple times to get the story of the evening. Each frame is on for about 0.046 seconds, so the whole thing lasts about 30 seconds. Well, it would have lasted 29.9 seconds at that rate, but I kept the last frame on for 5 seconds to give viewers a breather and some time to process things before it loops again.
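For anyone who wants to check the timing arithmetic, here is a small Python sketch. It assumes one frame per constituency (650 frames) and that the 5-second hold replaces the normal duration of the last frame; both are assumptions on top of the figures quoted above.

```python
FRAME_COUNT = 650          # assumed: one frame per constituency
FRAME_SECONDS = 0.046      # per-frame duration quoted in the post
FINAL_HOLD_SECONDS = 5.0   # how long the last frame stays on screen


def total_duration(frames, per_frame, final_hold=None):
    """Running time of one loop in seconds.

    If `final_hold` is given, the last frame uses that duration
    instead of `per_frame`.
    """
    if frames <= 0:
        return 0.0
    if final_hold is None:
        return frames * per_frame
    return (frames - 1) * per_frame + final_hold


uniform = total_duration(FRAME_COUNT, FRAME_SECONDS)
with_hold = total_duration(FRAME_COUNT, FRAME_SECONDS, FINAL_HOLD_SECONDS)
print(f"{uniform:.1f}")    # 29.9
print(f"{with_hold:.1f}")  # 34.9
```

So under these assumptions the animated part runs for just under 30 seconds and one full loop, including the pause on the final frame, takes closer to 35.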

Now for a few words on the method.

Method
The basic method is to create a set of individual images which are then used as single frames displayed for a short duration in a gif. I do this using QGIS Atlas. You can create gifs from a set of images in a number of different ways, but I use GIMP because it works really well, is open source, is not that difficult to use and lets you optimise the animation so the file size is low. The method is described in an earlier blog post I did, from about halfway down the page.

When exporting the gif you can choose the duration of each frame, but I like to have the final frame on for longer, at least a few seconds, so that people have time to take in what they are seeing after the rapid fire sequence of frames before it. There's a nice script you can use to set the frame duration of all frames automatically and then you can change the final one manually.

Export options for gifs in GIMP

What I normally do is export individual images from QGIS Atlas at 300 dpi and then I batch resize them in IrfanView, normally to a max width/height of 1000 pixels. IrfanView is simple but surprisingly powerful and I've used it for about 15 years now. It can do just about anything, apart from solve Brexit (though I suspect it could do that too). I have found that I get a higher image quality doing it this way, rather than exporting images at 1000 pixels from QGIS.
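The "max width/height" resize simply scales the longer side down to the limit and keeps the aspect ratio. A minimal Python sketch of that calculation follows; the function name fit_within is made up, and the A4 landscape page is only an example size, not necessarily the layout used here.

```python
def fit_within(width, height, max_side=1000):
    """Scale (width, height) so the longer side is at most `max_side`.

    The aspect ratio is preserved and images that already fit
    are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


# Example only: an A4 landscape page exported at 300 dpi is 3508 x 2480 pixels
print(fit_within(3508, 2480))  # (1000, 707)
```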

If I need to convert images to mp4 instead of gif, then I'll use FFmpeg, which is a command line tool. If that scares you, then you'll probably find ezgif very useful - and I believe it uses FFmpeg behind the scenes anyway. The advantage of mp4 is of course that you can more easily pause/rewind and add music and suchlike. But if you upload a gif to twitter it will actually convert it to a video format that can be paused.
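As a rough idea of what the FFmpeg step can look like, here is a Python sketch that builds a command for turning a numbered image sequence into an mp4. The file names (frame_%04d.png, election.mp4) and the helper name ffmpeg_command are hypothetical, and this is one common set of options rather than the exact command used for this post.

```python
import shlex


def ffmpeg_command(pattern, output, seconds_per_frame=0.046):
    """Build an ffmpeg argument list for an image sequence -> mp4 conversion."""
    framerate = 1 / seconds_per_frame  # frames per second
    return [
        "ffmpeg",
        "-framerate", f"{framerate:.2f}",  # input frame rate
        "-i", pattern,                     # numbered frames, e.g. frame_0001.png
        # libx264 with yuv420p needs even dimensions, so round both sides
        # down to the nearest multiple of 2
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",             # widely supported pixel format
        output,
    ]


cmd = ffmpeg_command("frame_%04d.png", "election.mp4")
print(shlex.join(cmd))  # copy and paste this into a terminal to run it
```

Building the command as a list also means it can be passed straight to subprocess.run(cmd, check=True) if FFmpeg is installed.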

When I follow this method, gifs on twitter always look nice and crisp.

In a QGIS Atlas of this kind, it's normally the case that only one feature would be showing at a time, but thanks to a tip from GIS guru Ian Turton I've done it cumulatively so that previous features remain on the map and it builds up until the map is fully coloured in.
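The cumulative idea boils down to one rule: on the frame for result number n, show every feature whose declaration order is n or lower, and highlight the one that is exactly n. Here is a Python sketch of that rule. It is not the QGIS setup itself, and the field names ("name", "order") and toy features are made up for illustration.

```python
def frame_contents(features, current_order):
    """Return (name, highlighted) pairs for one frame of a cumulative atlas.

    A feature is shown once its declaration order has been reached;
    only the feature being declared on this frame is highlighted.
    """
    shown = []
    for feature in sorted(features, key=lambda f: f["order"]):
        if feature["order"] <= current_order:
            shown.append((feature["name"], feature["order"] == current_order))
    return shown


# Toy features with made-up names and orders
toy_features = [
    {"name": "C", "order": 3},
    {"name": "A", "order": 1},
    {"name": "B", "order": 2},
]
print(frame_contents(toy_features, 2))  # [('A', False), ('B', True)]
```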

Software

As above, it's a mix of QGIS, GIMP, IrfanView, FFmpeg, and hardly ever anything else for geogifs. If you've not used FFmpeg for creating/editing/converting video files then you might want to check it out. It can be confusing but it's really, really powerful. I also used it recently to create an mp4 file from a series of Google Earth Studio images, as below (though I cheated and added in the text and audio using Filmora).

Check this out for the method

Data sources

For the hex grid of the 650 UK Parliamentary Constituencies, I used Ben Flanagan's file, which you can download from ArcGIS Online. This file is licensed under a Creative Commons Attribution 4.0 International License, which in simple terms means you can use it, adapt it, share it and so on. For the colours, I used those from the BBC website because a) I like them, b) people are familiar with them and c) they know what works. I also used the 'detailed results by constituency' file (see bottom of this page) from the brilliant team at the House of Commons Library.

Wednesday, 2 January 2019

QGIS Atlas by Field

This post explains how to use the Atlas function in QGIS to automate the production of maps from a single layer, producing one map output for each column in an attribute table. The Atlas tool is designed to allow users to iterate through the rows in an attribute table, but not through columns, so if you want to produce a map for each variable in a single dataset (e.g. annual house prices for small areas) then you may find yourself at a dead end, as I did. Then I saw how Ed Hampson at Savills very elegantly solved the problem, so I'm sharing it here. I know of some other workarounds but this solution is my clear favourite as it allows for the production of as many map outputs as you need in a very efficient manner. I've shared the project file in this online folder so you can download it and play around with it yourself. This was done in QGIS 3.4. This is written with the assumption that you already know QGIS and how to use the Atlas tool, so I won't cover that here.

This is the final project, which you can download

The basic principle of this approach is that for the Atlas coverage layer you use a lookup file, which in my case was a table with one row for each of the column headers I wanted to map in my dataset (as shown below). As you can see, the individual rows follow the pattern md_1995, md_1996 and so on. This is because the layer I want to create maps from has one column for each of these fields, from md_1995 to md_2016 - this is the median house price for each area from 1995 to 2016. If you're at all confused and want to see how it works, just download the project folder and take a look yourself by opening the QGIS project I have shared - all the screenshots here come from that.
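If you have a lot of columns, you could generate a lookup table like this rather than typing it out. Here is a minimal Python sketch, assuming a single column called Field and the md_1995 to md_2016 naming pattern described above; the function names are made up, and the table in the shared project may contain more than this.

```python
import csv
import io


def lookup_rows(prefix="md_", first_year=1995, last_year=2016):
    """One field name per year, e.g. md_1995, md_1996, ..., md_2016."""
    return [f"{prefix}{year}" for year in range(first_year, last_year + 1)]


def write_lookup(fields, handle):
    """Write a one-column CSV with a 'Field' header and one row per field name."""
    writer = csv.writer(handle)
    writer.writerow(["Field"])
    for name in fields:
        writer.writerow([name])


fields = lookup_rows()
buffer = io.StringIO()  # swap for open("lookup.csv", "w", newline="") to save a file
write_lookup(fields, buffer)
print(len(fields), fields[0], fields[-1])  # 22 md_1995 md_2016
```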

This is the table I used

Now here's the bit I really like - when setting up the Atlas in Print Layout, the coverage layer here is the table shown above and the Atlas 'Page name' is the 'Field' column in the table above. The normal way would be to set the coverage layer to be some kind of geographic layer and then QGIS would iterate through it spatially, but this is a kind of non-spatial solution that allows us to map by column instead.

The 'Field' here is the Field column in the table shown above

The really important thing you need to do to make this all work is to map the data using eval(@atlas_pagename) as the Column to be mapped. Since we have set our Page name in the Atlas as 'Field', and since the data in the Field column in our Atlas coverage layer has the same names as the attributes we want to map (i.e. md_2004, md_2005, md_2006, etc.), when we iterate through the Atlas it will map a different variable each time. Just remember to use a data range in the symbology that covers the highest and lowest values across your entire dataset, otherwise you might find you have some missing values on some maps. For example, if I had set the symbology data range here using the 1995 house price, then by 2016 loads of areas would be above the max value from 1995 and they wouldn't show up on the map.
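To make the two ideas here a bit more concrete, this Python sketch mimics them outside QGIS: the page name is used as the column name (which is what the eval trick achieves), and the symbology range is taken across every mapped column rather than one year. The rows and prices are made-up toy values and the function names are hypothetical.

```python
def values_for_page(rows, page_name):
    """Python analogue of the eval trick: the page name is the column to read."""
    return [row[page_name] for row in rows]


def global_range(rows, fields):
    """Lowest and highest value across all mapped columns, ignoring missing values."""
    values = [row[f] for row in rows for f in fields if row.get(f) is not None]
    return min(values), max(values)


# Toy, made-up median prices for two areas (NOT real data)
toy_rows = [
    {"md_1995": 40000, "md_2016": 150000},
    {"md_1995": 65000, "md_2016": 420000},
]
toy_fields = ["md_1995", "md_2016"]

print(values_for_page(toy_rows, "md_2016"))  # [150000, 420000]
print(global_range(toy_rows, toy_fields))    # (40000, 420000)

# If the range came from 1995 alone, these 2016 values would fall outside it
max_1995 = max(values_for_page(toy_rows, "md_1995"))
outside = [v for v in values_for_page(toy_rows, "md_2016") if v > max_1995]
print(len(outside))  # 2
```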

This is the clever bit

What does 'eval' do, I hear you cry... Well, it's basically this, as described in the QGIS documentation:

'eval' - Evaluates an expression which is passed in a string. Useful to expand dynamic parameters passed as context variables or fields.

Once you've got everything looking the way you want, you can then export the Atlas in the usual way and do all sorts of interesting things with the outputs, like creating a gif for example as I have shown below. I did a tutorial for creating these in an earlier blog post.

How much!!!! Not inflation-adjusted, obviously.

I've added some comments to the expressions in the QGIS project I've shared and these are also shown below to give you a better idea of what's going on here. I know it's a bit tricky to understand without actually seeing it.

This is just how I set the output file names

This is just what I used to automate the year label

This is what I used to set the highest value each year

This is how I dynamically sized the bars in the chart

This just shows you all the different layers - only 4

You'll see above that this is a normal Atlas, set up in the normal way. The only difference is the way I've set up the coverage layer and how I have set the column to be mapped using the eval expression. Everything else works in the same way. I have also used the aggregate function to pull out the highest value for each year (in the top right of the layout) and I used it to set the height of the individual bars in the chart. I have shared all the individual map outputs from this in the folder, and they look like this.
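The bar sizing is easier to follow as plain arithmetic: take the highest value for each year, then scale each bar against the largest of those. This Python sketch shows one way to do that. It is my reading of the approach, not a copy of the expressions in the project; the 40 mm maximum bar height, the function names and the toy prices are all made up.

```python
def yearly_maximum(rows, field):
    """Highest value in one column (the job done by an aggregate of type max)."""
    return max(row[field] for row in rows)


def bar_heights(rows, fields, max_height_mm=40.0):
    """Height of each year's bar, scaled so the tallest bar is `max_height_mm`."""
    peaks = {field: yearly_maximum(rows, field) for field in fields}
    tallest = max(peaks.values())
    return {field: max_height_mm * peak / tallest for field, peak in peaks.items()}


# Toy, made-up median prices for two areas (NOT real data)
price_rows = [
    {"md_1995": 40000, "md_2016": 150000},
    {"md_1995": 65000, "md_2016": 420000},
]
heights = bar_heights(price_rows, ["md_1995", "md_2016"])
for field, height in heights.items():
    print(field, round(height, 1))
```

With these toy values the 2016 bar fills the full 40 mm and the 1995 bar comes out at roughly 6 mm.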

Final map output for 2016

Okay, I think we're done here but if you get stuck feel free to get in touch. I'm easy to find online or on twitter.

Notes: if you try to load the zipped project folder, QGIS may throw up a warning saying it can't find the 'field_transposed_for_atlas' layer, but don't worry, it's there. You just need to Browse to the folder where it is located (i.e. the layers folder in the download). I should really thank Jochen Schwarze here for his StackExchange tip to Ed on this. The map colours are ones I borrowed from a very nice looking FT map by Steve Bernard. Note that you can add comments inside QGIS expressions using /*comment goes here*/ and after an expression using --comment goes here. In the file names I think it says 'MSOA' but the areas used are actually LSOAs. The house price data is from HM Land Registry but was very helpfully compiled as median values by LSOA by the excellent CDRC team. I added a few place names using OS open data. The easy-to-remember URL for the folder is bit.ly/qgis-atlas-by-field