Tuesday, 19 September 2017

Buildings of Great Britain

One of the great things about Great Britain is of course Ordnance Survey, our 225-year-old national mapping agency. Since a lot of their data was made open in the last decade it has become easier than ever to explore and map the country (particularly if you're some kind of GIS boffin). I should point out before going any further that Ordnance Survey covers Great Britain only, not Northern Ireland. For that, you'll need to look at Ordnance Survey Northern Ireland, who also have a good selection of open data. In the past, I've created sets of building data for English cities, done some calculations on how much land is taken up by golf courses, and a variety of other things, including creating 3D building models using OS open data. Today I'm sharing some shapefiles of all buildings in Great Britain.

London has lots of red buildings

Why have I done this? The main reason is that I want to share these complete files with others who might need them and don't want to - or know how to - patch together separate tiles of Ordnance Survey open data. If you need some building footprint data, you can just download the set you need and zoom to your chosen area, or extract what you need from an individual shapefile. I used the OS OpenMap - Local product for this, so the detail in the buildings is very good, as you can see below in the zoomed-in extract.

The reason I did this in the first place was because I wanted to come up with a number for the percentage of the land area of Great Britain that is covered by buildings. There has been some debate about this topic, and it was covered in an FT Fact Check piece in 2016 by Kate Allen partly in response to the claim that golf courses cover more land than housing. My calculations using new OS data revealed that Great Britain is 0.54% golf course (1,256 sq km, about the same area as Greater Manchester). But do buildings cover more than this? Yes they do. 

Click to enlarge - you can see lots of detail

Using the data I'm sharing here, I calculated that buildings in Great Britain cover 1.35% of the land. I reported this previously in a tweet that was quite widely shared. To my embarrassment, for the GB figure in that tweet I used the UK area as the denominator so the figure reported there was a little low, though the England, Scotland and Wales figures were and are accurate. Here's the important information you need, if you're ever faced with an awkward silence at a party.

> Great Britain is 1.35% buildings

> England is 2.0% buildings
> Scotland is 0.4% buildings
> Wales is 0.9% buildings

The Great Britain figure equates to about 3,150 sq km, or roughly twice the area covered by Greater London. However, the OS OpenMap - Local product isn't the most detailed building-level data covering Great Britain. For that, you'd need to look at OS MasterMap, a much bigger job. That is, unless you are Mike Gale and Tom Armitage at Edina and you have all this information in a lightning-fast database ready to query. They very kindly went beyond the call of duty and did some calculations and confirmed that my figures are pretty much spot on. They did loads of other really cool stuff too, but more on that another time perhaps.
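For anyone who wants to check the arithmetic, here is a minimal sketch of the coverage calculation. It uses only figures quoted in this post. The land area is an assumption: it is backed out from the golf course numbers (1,256 sq km being 0.54%) rather than taken from an official source, and the function and variable names are just illustrative.

```python
# Figures quoted in this post.
GOLF_AREA_SQ_KM = 1256        # golf courses in Great Britain
GOLF_SHARE_PERCENT = 0.54     # share of Great Britain that is golf course
BUILDING_AREA_SQ_KM = 3150    # approximate total building footprint


def coverage_percent(covered_sq_km, total_sq_km):
    """Return covered_sq_km as a percentage of total_sq_km."""
    return covered_sq_km / total_sq_km * 100


# Assumption: derive the land area implied by the golf figures instead of
# quoting an official number.
implied_gb_area_sq_km = GOLF_AREA_SQ_KM / (GOLF_SHARE_PERCENT / 100)

building_share = coverage_percent(BUILDING_AREA_SQ_KM, implied_gb_area_sq_km)
print(round(implied_gb_area_sq_km), round(building_share, 2))
```

The same function also shows why the choice of denominator matters: dividing the same building area by a larger total, such as the whole UK, gives a lower percentage.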

If you click on the link, you'll see the set of shapefiles I created in a Dropbox folder. It contains some licence information, a few sample images, plus the following shapefile sets:

  • All buildings in Wales in a single shapefile
  • All buildings in Scotland in a single shapefile
  • All buildings in the North of England in a single shapefile
  • All buildings in the Midlands in a single shapefile
  • All buildings in the South West of England in a single shapefile
  • All buildings in the South East of England in a single shapefile

The complete set, and individual files, are pretty big, since they cover large areas and have millions of individual polygon features in them. This isn't exactly the best way to view and map this data, I know. That's obviously an understatement. But I also know that it's the format many people know and love, and want to work with. So, if you want to play around with buildings or use this as background mapping, be my guest. 

Data notes
Giant shapefiles make the world go round, okay. More seriously, there are better ways to download and view this data, but that's for another blog post. On a related note, see this from Emu Analytics on a very cool project which utilises OS building data. If you download the files and look at the attribute table in your GIS of choice, you'll see that I've added an area column for each polygon showing the area in square metres. 

Using this to query the dataset can be interesting - e.g. to find the largest building. Be aware, however, that the OS OpenMap - Local data still has a degree of generalisation in it so sometimes separate buildings can be merged together - e.g. if they are very close together. But I know from what Mike and Tom did with MasterMap data that this doesn't affect the final calculation much at all, thankfully.
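As a rough illustration of what an area column holds and how you might query it, here is a small sketch in plain Python. It assumes footprints are simple polygons in a metre-based projected coordinate system; the building names and coordinates are made up, and this is not necessarily how the column in the shapefiles was calculated.

```python
def polygon_area_sq_m(ring):
    """Shoelace formula: area of a simple polygon given as (x, y) pairs in metres."""
    total = 0.0
    # Pair each vertex with the next one, wrapping round to the start.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical footprints, coordinates in metres.
buildings = {
    "shed": [(0, 0), (4, 0), (4, 3), (0, 3)],
    "warehouse": [(10, 10), (110, 10), (110, 60), (10, 60)],
    "l_shaped_house": [(0, 0), (10, 0), (10, 5), (5, 5), (5, 10), (0, 10)],
}

areas = {name: polygon_area_sq_m(ring) for name, ring in buildings.items()}
largest = max(areas, key=areas.get)
print(largest, areas[largest])
```

In a GIS you would normally just sort the attribute table by the area column, but the logic is the same.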

As for the golf courses vs. housing thing, I know that's not, strictly speaking, solved definitively either. The reason for this is that we have no way of knowing exactly what area houses cover vs non-residential buildings. As far as I know, even Ordnance Survey don't know this, and probably can't know this from what data we have available to us. However, the vast majority of buildings are residential (not sure on the %) and I'm very confident that housing covers a much bigger area than golf courses. I just can't say how much.

Friday, 11 August 2017

US 2017 Total Solar Eclipse Animation

I've only just started paying attention to the fact that there's a total solar eclipse in the United States on 21 August this year. What made me really take notice were the imaginative, amusing and bizarre maps people have been posting online - all good fun and often pretty interesting too. What seems to me to be missing - and absolutely essential, of course - is an animated eclipse gif showing the path of totality, towns and cities, a bit of terrain and some roads. So, I grabbed the geodata from NASA and spent a couple of hours playing around with it to make this thing you see below. This is an extract of the full file, in gif format. The full gif is over 100MB, so I've created a video file instead, which you'll find at the end of this post and also on Twitter. Click an image to see it in full size.

Note the small part of the path of totality in Montana

Nashville looks like the place to be

I also extracted a series of animations covering the whole US, in separate parts, so that the whole of the path of totality has been giffed, as it were. These are pretty big files and if I add them all here not only might it crash your browser but it will also probably be a bit too bamboozling with all the animations going on at one time. Here's part 1 and part 2 below. The rest I'll keep to myself for now.

Part 1 - at 50 frames per second
Part 2 - also at 50fps

I just did this out of curiosity and to learn a bit more about the event. I thought it would be interesting to see for myself where the eclipse will go. I remember here in the UK back in 1999 when we had a total eclipse, but I didn't manage to get down to Cornwall to see it properly. Just take a look at this video to see how long ago this now seems. I was in Glasgow at the time and although it didn't get fully dark it was pretty interesting and the birds were very confused. Also, if I didn't do an eclipse map now I might never remember again. The next total eclipse in the UK isn't until 2090 and I just get the feeling I won't be on here making animated gifs of it then.


Notes: mapping software was QGIS 2.14, using the Atlas function to extract the frames for the gif. I used the NASA shapefiles to show where the path of totality and centre line are. I added some place names, roads and US boundary and state lines from Natural Earth and I also added an ESRI shaded relief base layer to give more of a sense of the underlying physical geography. I patched the gif together in GIMP and animated the movement of the sun across the land using the vertices of the eclipse centre line. I've added a little glowing corona round the black dot, which is of course supposed to represent the sun.

There were about 1800 points in the original NASA eclipse path file so I used every second one to generate the animation because it still creates a reasonably smooth image. For a proper set of scientific maps, I suggest you check out the NASA Total Eclipse pages, as they are full of great information and maps. They also have an animated gif.
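The thinning step is easy to sketch. The snippet below is only an illustration: the coordinates are invented stand-ins for the centre line vertices, and it assumes one animation frame per retained vertex.

```python
# Hypothetical stand-in for the ~1,800 centre line vertices: (lon, lat) pairs.
centre_line = [(-125.0 + i * 0.03, 45.0 - i * 0.01) for i in range(1800)]

# Keep every second vertex; each retained vertex becomes one frame position.
STEP = 2
frame_positions = centre_line[::STEP]

print(len(centre_line), len(frame_positions))
```

A larger step would give a smaller file but a jerkier animation, so the step is really a trade-off between smoothness and size.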

Monday, 17 July 2017

What Percent Golf Course is Your Area?

There has been a fairly long-running debate in the UK media about the extent to which golf courses cover the landscape. I've been working on a related topic recently, so when Ordnance Survey last week released an open dataset containing golf course coverage I thought I'd have a go at answering the question myself. The answer I arrived at was that Great Britain is about 0.54% golf course - or about 1,256 square kilometres (about the same area as the whole of Greater Manchester). 

The local authority with the largest area of golf courses is Woking in Surrey, at 10.74%. I originally calculated it at just over 7% but I had to make some edits due to the number of courses that straddle the local authority boundary and weren't being included. I've done the calculation a couple of times and also manually so I'm sure it's pretty accurate.

Woking - 10% golf course

Edinburgh - about 4.2% golf course

Why am I doing this? Do I hate golf and golfers? No, not at all. I used to play golf (a fine game) and have some golf clubs in my shed so I couldn't say I'm part of any anti-golf lobby. The reason is that I like to know things like this, particularly when such figures are used to debate issues like housing and the availability of land in some parts of the country. The fact that Woking appears to be 10% golf course is not a criticism, it just appears to be a fact. But I do find these numbers interesting. Here's the top 20 by local authority - and remember that Ordnance Survey just covers Great Britain not the whole UK so there are no Northern Ireland figures here. I'm not aware of a similar dataset that would allow me to do these calculations for Northern Ireland.

Is this a lot? Well, that depends

I've also done some summary statistics for Great Britain and the different countries within it. From this, we can see that Great Britain is about 0.54% golf course, England 0.74%, Scotland 0.28% and Wales 0.34%. From my analysis of the data, this does seem to include golf driving ranges as well. I'm not sure about crazy golf - don't think that's included but it would probably be a tiny fraction of land. The same goes for 'pitch and putt' type areas.

Golf courses take up about the same area as Greater Manchester

If you want to explore the data yourself, I've made it available in a Google sheet - and also added in area codes and region names. According to my calculations, only 10 local authority areas in Great Britain have no golf courses within them, as follows:

  • Tamworth
  • Adur
  • City of London
  • Camden
  • Hammersmith and Fulham
  • Islington
  • Kensington and Chelsea
  • Lambeth
  • Tower Hamlets
  • Westminster

Finally, at the suggestion of David O'Sullivan, I had a little look to see if there was a correlation between the percent of each area covered by golf courses and the percent voting for Brexit. There really isn't.

Not much of a relationship at all

Data credit: Contains Ordnance Survey data © Crown copyright and database right 2017

Data note: to deal with the issue of golf courses that straddle local authority boundaries, I cut them in two using a local authority boundary file. My original calculations didn't do this and it turns out it makes a bit of a difference in some areas, including Woking, where the courses on boundaries end up not being counted. I'm pretty confident these figures are accurate - based on the underlying data - but of course other people have calculated similar numbers in the past and arrived at different results. Some of these were based on estimates but my calculations are based on the new Ordnance Survey dataset so I'm confident they are at least close to the correct figure.
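To show why the cutting step matters, here is a deliberately simplified sketch. It assumes the boundary is a straight vertical line and the golf course is a simple rectangle, neither of which is true of real data, where you would use your GIS's intersection or clip tool. The function names and coordinates are hypothetical.

```python
def polygon_area(ring):
    """Shoelace formula for a simple polygon given as (x, y) pairs."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def clip_to_half_plane(ring, boundary_x, keep_left):
    """Keep the part of a polygon on one side of the line x = boundary_x."""
    def inside(point):
        return point[0] <= boundary_x if keep_left else point[0] >= boundary_x

    def crossing(p, q):
        # Point where the segment p-q meets the boundary line.
        t = (boundary_x - p[0]) / (q[0] - p[0])
        return (boundary_x, p[1] + t * (q[1] - p[1]))

    clipped = []
    for p, q in zip(ring, ring[1:] + ring[:1]):
        if inside(p):
            clipped.append(p)
            if not inside(q):
                clipped.append(crossing(p, q))
        elif inside(q):
            clipped.append(crossing(p, q))
    return clipped


# Hypothetical golf course straddling a boundary at x = 60 (units: metres).
course = [(0, 0), (100, 0), (100, 50), (0, 50)]
left_area = polygon_area(clip_to_half_plane(course, 60, keep_left=True))
right_area = polygon_area(clip_to_half_plane(course, 60, keep_left=False))
print(left_area, right_area)
```

Once a course is cut in two, each part is credited to the area it actually sits in, rather than being missed or assigned whole to one side.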

Monday, 12 June 2017

General Election 2017: some maps and data

It's 2017 and there has just been a General Election in the UK. In case you're reading this in the future, I'm talking about the first General Election of 2017. Today's post is a brief comment, plus a few maps, starting with one on the Labour win in Kensington in London. To summarise briefly: Labour won a seat that many people thought they couldn't, and won by 20 votes with Emma Dent Coad becoming the new MP in a gain from the Conservatives. 

But this is an immensely wealthy area, isn't it? Yes and no. There is a lot of money here but also a good deal of poverty, as you can see from the map below, where the red areas are among the 20% most deprived in England. As with most things, however, it's not a simple story and the result is perhaps not that shocking when you look at the map, even if some have named it the 'UK's richest constituency' (and there is an argument for that view). It's also definitely not 'London's Richest District', as one report puts it.

Kensington is a very different area north to south - click to enlarge

Some numbers, to help put things in context... If we rank the 533 English constituencies by deprivation, using the Indices of Deprivation 2015 measure calculated by the House of Commons Library, the Kensington constituency ranks 178 out of 533 - just on the edge of the most deprived third in all of England. One look at the map tells us that within the area there is considerable variation, with some parts much more deprived than the national average and some parts less so. In total, 22 of Kensington's small areas (LSOAs) are in the 20% most deprived in England - and none are in the 20% least deprived. The picture is very similar if you look at other indicators, particularly those related to income.
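The 'edge of the most deprived third' point is easy to check with the two numbers above. This little sketch assumes rank 1 is the most deprived constituency.

```python
TOTAL_CONSTITUENCIES = 533
kensington_rank = 178  # assumption: rank 1 = most deprived

# Position of the rank as a percentage of all English constituencies.
share_from_top = kensington_rank / TOTAL_CONSTITUENCIES * 100

# Rank at which the most deprived third ends.
third_cutoff = TOTAL_CONSTITUENCIES / 3

print(round(share_from_top, 1), round(third_cutoff, 1))
```

So a rank of 178 sits almost exactly on the one-third line.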

However, in London and beyond, the name 'Kensington' has it seems become synonymous with wealth, opulence and 'the elite'. This may be part of the Kensington story, but it's by no means the full story. I just wanted to take a closer look at what the data say in order to figure out if this is really such a surprise. As I did so, I also homed in on a few other constituencies. In addition, I attempted to look at the correlation between Conservative vote share and level of deprivation by constituency. I did this for 2015 and 2017 and although the results are not that surprising - i.e. the more deprived an area is, the lower the Tory vote - some of the outliers are quite interesting. Here are the two charts, with some of the interesting constituencies labelled.

A pretty strong relationship in 2015, as you'd expect

The correlation is a bit weaker here - everyone has ideas about why

Let's take a look at Sheffield Hallam because it's such an outlier - and also I live in Sheffield (but not there). It was Nick Clegg's constituency but is now Labour (Jared O'Mara MP). It's also the 9th least deprived constituency in England so is something of an anomaly, though of course not so much when you look at voting history and demographics. Still, the map is interesting in itself - it bucks the general pattern of the least deprived areas voting Conservative.

It has a high student population, among other factors

I then wanted to look at a couple of other places to see what things looked like on the ground. First up is Islington North (Jeremy Corbyn MP) and Islington South and Finsbury (Emily Thornberry MP). Note the little patch of blue around Highbury in an otherwise quite deprived area. This also contrasts with stereotypes of Islington as some kind of land of milk and honey. There is considerable wealth here but it sits beside large areas of deprivation.

Note the little blue patch near Highbury

Then I looked at the most deprived Conservative-voting constituency. This is currently Walsall North, which is ranked 31 out of 533 constituencies on the deprivation measure. Again, there are local and historic explanations for this but I don't want to dwell on those now.

Surprising? Maybe, maybe not.

Last of all, I wanted to look at a constituency where a Labour win truly would be a shock - for this, I looked at the least deprived constituency in all of England: North East Hampshire. If Labour ever won here I think we can all agree that it would be a shock, just as it would be if the Conservatives won Walton in Liverpool (85.7% Labour, 8.6% Conservative in 2017). Actually, the latter might just be the biggest shock in the history of the world. Mind you, these days you never know what's going to happen next.

The Conservatives got 65.5% of the vote here.

Addendum: I saw a great histogram by Owen Boswarva looking at party by median age in each constituency so I attempted something vaguely similar for deprivation deciles and party. It's not at all surprising but I did find it interesting so am posting it here too.

Click to enlarge - the pattern is to be expected, but quite interesting

Notes: I do know, of course, that there will always be a strong linear relationship between deprivation and % voting Tory - or Labour for that matter. The point here is that because this is true, and because Kensington is actually quite deprived, the result there is less of a shock than some are claiming. Also, ranking deprivation at the scale of constituencies masks lots of underlying variation - but that's partly why I mapped it at LSOA. If we had LSOA General Election results that would be interesting. The scatterplots were interesting to me not because of the obvious linearity but because a) the relationship changed a good bit between 2015 and 2017 (UKIP effect?) and b) the big residuals - e.g. Sheffield Hallam. The ones which defy the general pattern are the ones I'm interested in. Basically, Kensington is right on the trend line and maybe it's because many more people from the deprived parts of Kensington voted this time - plus rich remainers. Finally, a lack of deprivation is not the same as affluence but on any measure you'll find they correlate strongly.
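For readers who want to try the correlation and residuals idea themselves, here is a minimal sketch in plain Python. The numbers are entirely made up for illustration and are not the real constituency data; the names 'A' to 'E' and 'Outlier' are placeholders.

```python
from math import sqrt

# Hypothetical values only: (deprivation score, Conservative vote share %).
constituencies = {
    "A": (10.0, 58.0),
    "B": (15.0, 52.0),
    "C": (20.0, 45.0),
    "D": (30.0, 33.0),
    "E": (40.0, 22.0),
    "Outlier": (8.0, 24.0),  # low deprivation but a low Conservative share
}

xs = [x for x, _ in constituencies.values()]
ys = [y for _, y in constituencies.values()]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
slope = sxy / sxx                    # least-squares trend line
intercept = mean_y - slope * mean_x

# Residual = actual vote share minus the share the trend line predicts.
residuals = {
    name: y - (intercept + slope * x) for name, (x, y) in constituencies.items()
}
biggest = max(residuals, key=lambda name: abs(residuals[name]))
print(round(r, 2), biggest)
```

The constituency with the largest absolute residual is the one furthest from the trend line, which is the kind of place worth a closer look.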

Wednesday, 24 May 2017

The Great Polish Map of Scotland (aka The Mapa Scotland)

Last week I was in Edinburgh to give a talk at the Edinburgh Earth Observatory seminar series at the University of Edinburgh, so I thought I'd try to see The Great Polish Map of Scotland - also known as the Mapa Scotland - before heading back down south. As you can see below, I did manage to go, but I am not quite tall enough to get a good view of it, so I have embedded a video below to give you a proper view of what it's like. I've also posted photos of the very informative signs that have been put up, in addition to a few more views. It's still being renovated so perhaps I didn't visit at the best of times but I'm really glad I got to see it - it's said to be the world's biggest topographical map.

The Great Polish Map of Scotland is located in Eddleston in the Scottish Borders, about 45 minutes south of Edinburgh. When I went on a Saturday morning with an old friend the roads were pretty quiet but it's definitely reachable in under an hour either way. You can see the location in the map below.

Just a short drive and you're there

The Map is actually in the grounds of Barony Castle Hotel, and when we went we parked up in the hotel car park. Above the front door of the hotel you'll see a very ominous message - "Prepare to Meet Thy God" - but since I've not got to the bottom of that yet, I'll leave it there for now and just show you what the Map looks like in the photos below.

Yes, welcome to our hotel!

You go round to the left side of the hotel and then follow the signs to Maczek's Map, as you can see from the next two images. Then it's through the gate and across the bridge over Dean Burn (in case you didn't know, in Scotland and some other parts of the UK we tend to call a stream a 'burn').

I think that's Maczek rather than MacZek

Almost there - you can see the bridge here too

Okay, once you're at the Map you'll see nice new green railings surrounding it. I'm reliably informed by Addy Pope - great Scottish adventurer, ESRI boffin and local person - that you used to be able to walk all over the map but given the new fence and ongoing restoration I thought that might now be frowned upon, so I stayed on the right side of the fence. I'm going to be a bit controversial now on two points. First, I was a little disappointed. Not by the map, but by the fact that I couldn't get a better view of it. I'm close to 2 metres tall but that's not enough even when you're on top of the newly constructed viewing platform. That brings me to the second point. I really wish the viewing platform was higher. These are kind of unfair things to say given the excellent restoration work going on but I do hope someone reads this and gives them tons of money to build a 50m high viewing tower. That would be amazing. Planning permission might be an issue.

This makes a big difference - I just wish it was higher

A couple of views of the map now follow. The first was taken from the viewing platform and the second from the west side of the map. At this point I should probably say that it's not technically The Great Polish Map of Scotland in the sense that Orkney and Shetland are missing. I'm from the north of Scotland so I notice these things... I can understand the omission though - what is there now took years to build.

This was as high as I could get my camera

Extra points if you recognise where this is

One of the great things about the Map is the information signs all the way round that give you the history of the map. I've taken photos of all of them so hopefully you can click the images to read the text but I've also done a few zoomed-in ones, just in case.

Inspired, of course, by a 1958 map of Belgium

I didn't know about this

A close up from the image above

A few images of construction

How the map was made - closer view above

Scotland and Poland have many connections - see above

As you can see, North Uist remains under cover (Uist = "you-ist")

I wasn't really complaining about the viewing platform itself - it's a great addition and allows you to get a nice overview of the Map - but I do think it would be a much better experience if the tower could be higher. I'm sure everyone thinks that and it's such an obvious, annoying thing to say. Anyway, I took this image to show how it was funded.

Lottery funding for the viewing platform

And that's it. I am very glad I went, but it would have been better if a) it wasn't raining - though this is always a risk in Scotland and b) I could fly. In that respect, I think the best views of this are to be had by the few drone videos on YouTube, one of which is posted above. Finally, despite the ominous sign at the entrance of the hotel, I can confirm that they actually serve a very good cup of tea and there wasn't even a hint of death. In fact, the staff were most welcoming.

Do widzenia (for now).

Sunday, 7 May 2017

General Election 2015: the view from second place

In my last blog post I shared a shapefile with the current UK constituency boundaries, which included a lot of other data. One of the variables included was who came second in the 2015 UK General Election. I thought it would be interesting to map this and also include a couple of widgets using the new Builder tools in CARTO (formerly CartoDB). I wanted to do this because I knew UKIP came second in 120 constituencies and I wanted to see where. I also wanted to post an interactive version of the data from my shapefile so people could explore it themselves. The first map below shows who came second in each constituency in 2015 and if you click an area you'll get more information - winner, MP, and so on. Using the widgets below you can then select by winning party and margin of victory, should you want to quickly identify marginal seats, for example.

Here's what the pop-up looks like

In the next map, I've used the 'Majority in 2015' widget to select only those areas with a majority of 3,000 or less and this then updates the 'Winner in 2015' widget so that you can see 41 of these constituencies voted Labour in 2015 and 36 were Conservative.

Many of these could be considered true marginals

At the other end of the scale, I then used the widget slider to select all those constituencies in 2015 which had a majority of 15,000 or more. The final map below shows this. As you can see, 153 of these were Conservative constituencies and 59 were Labour. The colours on the map - remember - are who came second in 2015. So is this a 'no chance of winning here' map? Possibly. I wouldn't be holding out for any shocks though.
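The widget selection boils down to a filter followed by a count. Here is a small sketch of that logic in Python. The rows are hypothetical, the field names (WINNER, SECOND, MAJ) follow the shapefile from my previous post, and the party abbreviations are assumptions.

```python
from collections import Counter

# Hypothetical rows; the real file has one row per constituency.
rows = [
    {"WINNER": "Lab", "SECOND": "Con", "MAJ": 1200},
    {"WINNER": "Con", "SECOND": "Lab", "MAJ": 2900},
    {"WINNER": "Con", "SECOND": "UKIP", "MAJ": 18500},
    {"WINNER": "Lab", "SECOND": "UKIP", "MAJ": 16000},
    {"WINNER": "Con", "SECOND": "LD", "MAJ": 21000},
]


def winners_by_majority(rows, min_majority=0, max_majority=float("inf")):
    """Count winning parties among rows whose majority falls in the given range."""
    selected = [r for r in rows if min_majority <= r["MAJ"] <= max_majority]
    return Counter(r["WINNER"] for r in selected)


marginals = winners_by_majority(rows, max_majority=3000)
safe_seats = winners_by_majority(rows, min_majority=15000)
print(marginals, safe_seats)
```

Swapping "WINNER" for "SECOND" in the count would give the breakdown of who came second in the selected seats instead.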

Fiddle around with this map here

Here's what the map looks like when you show Labour, Conservative and then UKIP second place finishes.

Labour came second in 253 constituencies in 2015

The Conservatives came second in 181 constituencies in 2015

UKIP came second in 120 constituencies in 2015

I didn't make this so that I could comment on it so have a go yourself in the full screen version.

Sunday, 23 April 2017

Getting ready for #GE2017 - a big shapefile

I'm probably as unmoved as anyone else about the forthcoming General Election, but to get my head back into gear for it I thought I'd try to put together a full UK constituency shapefile of all 650 constituency results from the 2015 General Election, using data from a variety of sources. I'm sharing it here in the hope that people will find it useful, and that it might save you some work. If you spot an error, let me know and I'll try to fix it. There are other shapefiles out there, but to my knowledge there isn't a detailed complete UK (as opposed to GB) file that has all results, MPs and so on. I'm also sharing this here in the hope that we can move away from hex maps. I think they are nice and useful in many cases but I'd like to see a move back to the standard geographic representation in this election - hence, I am trying to promote Hexit. Anyway, here's an obligatory geogif I made with the file, using the 'time results declared' field.

The 2015 General Election in 30 seconds - phew

So, what's in the file? Well, I've tried to include a lot of stuff, sourced variously from the British Election Study, from the UK Parliament Data website, the Census and the devolved administrations of the UK. I have also calculated some variables myself, such as constituency area and the order in which results were declared. Key variables include:

  • PCONCODE - this is the ONS code for each constituency. It makes it possible to join lots of other data to the file. 
  • REGN - name of the sub-UK region each constituency is in - i.e. the old Government Office Regions in England, plus Northern Ireland, Scotland and Wales.
  • SECOND - which party came second in a constituency in 2015.
  • ELECT15 - the number of people in the electorate in 2015.
  • MAJ - size of the majority for the sitting MP.
  • TIME - time the results were declared. The very last column has this in 24H format, but you can also see from the ORDER2015 field which order they are actually in.
  • MPFIRST, MPLAST, MPNAME - the first, last and full name of each MP.
  • Winner15 - this contains the full party name of the winning party. The WINNER field contains the abbreviated party name.
  • POP2015 - this contains the mid-year population estimate for each constituency for 2015. I also added in the 18+ population, since it makes a bit more sense to do this, even though it is not the same as the electorate figure. 
  • Others - they should be self-explanatory but the list of Sources below will help if you are confused by any of these.
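As a quick illustration of why the PCONCODE field is handy, here is a sketch of a simple attribute join in plain Python. The codes are placeholders rather than real ONS codes, the party abbreviations are assumptions, and the 'turnout_pct' column is a hypothetical extra dataset.

```python
# Hypothetical attribute rows from the shapefile (placeholder codes).
shapefile_rows = [
    {"PCONCODE": "X0001", "WINNER": "Lab", "ELECT15": 70000},
    {"PCONCODE": "X0002", "WINNER": "Con", "ELECT15": 65000},
    {"PCONCODE": "X0003", "WINNER": "SNP", "ELECT15": 60000},
]

# Hypothetical extra data keyed by the same constituency code.
extra_data = {
    "X0001": {"turnout_pct": 66.1},
    "X0002": {"turnout_pct": 71.4},
}

# Left join: keep every shapefile row, adding extra columns where the code matches.
joined = [{**row, **extra_data.get(row["PCONCODE"], {})} for row in shapefile_rows]

for row in joined:
    print(row)
```

Rows without a match simply keep their original columns, which mirrors what happens in the few cases where I don't have variables for Northern Ireland.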

I hope you find this useful. If you want to download it, it can be accessed here. If you spot any glaring errors, please let me know. Who is going to win the 2017 General Election? My only prediction is that there will be lots of interesting maps and that the patterns on them may look a bit different.

Data notes: I have added a QGIS qml style file to the zipped data folder. This means that if you add the shapefile to QGIS it will display in the familiar colours of each political party. This happens because the qml file has the same name as the shapefile. The colours are matched from the BBC election results page from 2015. I tried very hard to ensure complete UK coverage, so I have patched data together from multiple UK sources but in a few cases I don't have variables for Northern Ireland. This is because the spreadsheet from the British Election Study I sourced some data from covers only GB. The zipped folder name for the current file version is uk_650_wpc_2017_full_res_v1.8.

Sources: General Election 2015 results, from the UK Parliament Data pages. The British Election Study updated Excel file. Northern Ireland constituency boundaries were sourced from OpenDataNI, via their resources page. For Great Britain, I used the constituency boundaries available on the ONS Geography Portal pages - the 2016 boundaries. For the most recent mid-year population estimates, I used data from the National Records of Scotland, NISRA data for Northern Ireland mid-year population estimates and ONS mid-year population estimates for England and Wales. The map data contains OS data © Crown copyright and database right 2017. Similarly, the other data contains National Statistics data © Crown copyright and database right 2017.

Acknowledgements: I would like to thank Ian Turton for suggesting the little QGIS Atlas function tweak which enables the cumulative animation you see above. For more on this, see the related Stack Exchange post where I asked the question.

Friday, 31 March 2017

Visualising a lot

This post is about visualising 'a lot', because it's something I've been thinking about as I write part of a book on GIS. The basic idea I'm exploring here is that when you have a dataset and want to somehow simply visualise 'a lot' - e.g. because the volume of data seems overwhelming - then there are different ways to approach it. For example, if you had millions of points on a map, you could use a hex-binning technique to give a standardised per-area figure, or you could do some kind of visual aggregation or summary in chart form. Or, to convey 'a lot' as a kind of visual device, you could perhaps just do a visual data dump, as I did in this example. Today's 'lot' is from the Gun Violence Archive dataset for the United States in 2015, compiled and released by The Guardian and collated by the Gun Violence Archive. I opted for a fast animation to visualise 'a lot', which I have now updated with a running total (in yellow). Let's go straight to the gif now, showing all gun homicides, one frame per day, for 2015 (and fast - 10 frames per second).

It's supposed to be overwhelming - click it for full size

When I first looked at the original dataset, which includes more than 13,000 gun deaths, my immediate thought was 'that's a lot'. All things are relative, of course, but in a global context it's hard to argue against this, particularly when you compare the data to other developed nations. The dataset has precise lat/long details for each incident and also the date and number killed and injured. I then summarised the data by day, plotted the locations as single points and then created 365 frames for this animated gif. It's not supposed to be readable at the micro scale of individual days or incidents, because I wanted to focus attention on the volume of data. A video version that you can pause or play more slowly is embedded below. I also did a slightly slower animated gif, at 5 frames per second, which of course is still somewhat overwhelming, shown below. Update: I have also added a cumulative version, prompted by Simon Rogers, and thanks to a bit of help from Ian Turton.

This is the same as above, but a little slower (73 seconds in total)

In this version, it's cumulative - click to enlarge and start from beginning
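The frames themselves were made in QGIS, but the 'summarise by day' step and the running total behind the cumulative version can be sketched in a few lines of Python. This is not the workflow I actually used, just an illustration: the function name and the three sample incidents are made up, and I'm assuming each incident has already been reduced to a (date, number killed) pair.

```python
from collections import defaultdict
from datetime import date, timedelta
from itertools import accumulate

def daily_totals(incidents, year):
    """Sum the number killed per calendar day, keeping days with no incidents as 0."""
    totals = defaultdict(int)
    for day, killed in incidents:
        totals[day] += killed
    start = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - start).days
    days = [start + timedelta(days=i) for i in range(n_days)]
    # One entry per day of the year, i.e. one entry per animation frame.
    return [(d, totals[d]) for d in days]

# Made-up incidents as (date, number killed) pairs.
incidents = [(date(2015, 1, 1), 2), (date(2015, 1, 1), 1), (date(2015, 1, 3), 1)]

frames = daily_totals(incidents, 2015)
# Running total of deaths up to and including each day (the cumulative version).
running = list(accumulate(killed for _, killed in frames))
```

With one entry per day, 2015 gives 365 frames, which at 5 frames per second is where the 73 seconds comes from.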

The individual frames were created in QGIS, and you can see the frames for the days with the maximum and minimum values below. The largest number of gun deaths in a single day in 2015 was on July 5th and the lowest was on May 22nd. The mean number killed per incident was 1.12 and the mean per day in 2015 was 35.8 (for a total of 13,067).

The peak month overall in 2015 was also July

This was the only day that the number killed was below 20
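If you want to pull out the same kind of summary yourself, here is a small Python sketch of the max, min and mean calculation, followed by a cross-check of the figures quoted above. The function name and the three daily totals are made up for illustration and are not the real values.

```python
from datetime import date

def summarise(daily):
    """daily maps date -> number killed; returns (max day, min day, mean per day)."""
    max_day = max(daily, key=daily.get)
    min_day = min(daily, key=daily.get)
    mean_per_day = sum(daily.values()) / len(daily)
    return max_day, min_day, mean_per_day

# Made-up daily totals, purely for illustration (not the real figures).
sample = {date(2015, 1, 1): 30, date(2015, 1, 2): 45, date(2015, 1, 3): 18}
max_day, min_day, mean_per_day = summarise(sample)

# Cross-checking the figures quoted in the post.
mean_2015 = 13067 / 365            # total killed / days in 2015
implied_incidents = 13067 / 1.12   # total killed / mean killed per incident
```

The first check gives 35.8 per day. The second gives roughly 11,667 incidents, which is only approximate because 1.12 is itself a rounded figure.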

There are just over 11,600 incidents recorded in the database but it's quite difficult to get your head around at a national scale. The Guardian already published some great localised mapping of this data, if you're interested. With this example I was just trying to experiment with ways that quickly and simply convey the idea of 'a lot'. The fast animation using thousands of data points is one way of doing this. It's designed with repetition and replay in mind, and the point is not to highlight individual datapoints or days, but to create a kind of cognitive mash where the end result is that you can take away some detail - e.g. most days have between 20 and 50 gun deaths - and also see that the locations do, as you'd expect, mirror underlying population patterns. But only to a point. If you look closely you can see that some places are over or under-represented.

There are many ways to powerfully visualise this kind of data, including much more nuanced interactive methods of the kind produced by FiveThirtyEight. My approach here is non-interactive on purpose, but of course it is less visually appealing too. But then I also think that making something beautiful out of something so ugly is not what I want to be doing. All I wanted to achieve was to highlight the volume in the data in a way that anyone could understand and by using one frame per day and plotting the location points I think I'm just about there.

If you're interested in looking at any of the individual frames for a given day, take a look at the Google Drive folder below. You can see individual dates to the top left of each image and also in the file name of each image.

See all 365 individual days here

Notes: in the Guardian's original csv, I found that the date formats were a bit messed up, so I fixed this and added in some new, corrected date fields to the right of the spreadsheet. I also added in individual columns for day and month. I'm not a gun campaigner, this was just an interesting dataset for me to use. If you have any questions, feel free to get in touch. This data covers homicides only, no suicides. I updated this post on 5 April 2017, to include cumulative totals in the maps. Updated again on 10 April 2017 to include a cumulative version. It looks a bit ugly at the end but then it's a pretty 'ugly' dataset. I thought this was another interesting way of displaying the data.
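I fixed the dates by hand in the spreadsheet, but the same clean-up could be sketched in Python along these lines. The list of candidate formats, the column names and the function names are all assumptions on my part, not a description of what was actually in the csv. Note that a value like 7/5/2015 is ambiguous between day-first and month-first, so the order of the formats matters; here I've assumed month-first.

```python
from datetime import datetime

# Assumed formats, tried in order; adjust to whatever the file really contains.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%m/%d/%y")

def parse_date(text):
    """Return a date for the first format that matches, or None if none do."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

def add_date_fields(row, column="date"):
    """Return a copy of the row with corrected date, day and month fields added."""
    parsed = parse_date(row[column])
    fixed = dict(row)
    fixed["date_fixed"] = parsed.isoformat() if parsed else ""
    fixed["day"] = parsed.day if parsed else ""
    fixed["month"] = parsed.month if parsed else ""
    return fixed

# A made-up row with a hypothetical 'date' column.
example = add_date_fields({"date": " 7/5/15 ", "killed": "1"})
```

Rows that match none of the formats get empty fields rather than a guess, so they are easy to find and fix manually.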

Sunday, 26 February 2017

Train Stations of Great Britain

In my ongoing quest to answer the burning questions of our times, I have decided to continue my data-based boffinry by looking at a couple of questions I sometimes think of when zipping up and down the country on the train. I'm sure I can't be the only one, so here are some results that I've had saved up for a while. The first question is, 'which parts of Great Britain are furthest from a train station'? The second is 'how many train stations are there in each local authority or parliamentary constituency?'. Yes, I know I need to get out more but if you're reading this you probably do too - so take a look at the first two maps below.

Not exactly earth shattering, but some interesting snippets

You can click on this to see a bit more detail

Not entirely unexpected patterns here. In part, I also did this to use as teaching material in the future (it uses a basic GIS operation) and I used 30km just because it produces an interesting result. You can see the area around Bude in North Cornwall is England's largest area without a station. This issue has been raised in parliament many times, including in 2014 by the previous MP for the area. The furthest areas from stations are mainly in the sparsely populated north and west Highlands, but also in and around the Cairngorms and the Borders - though the latter has just got a lot smaller thanks to the re-opening of the Borders Railway. Parts of West Wales and a bit of North Wales are also more than 30km from a station. There is also a tiny sliver of land in Yorkshire that sits just outside this 30km buffer distance. Some zoomed in maps follow...

This is just on the Scotland-England border

Around Bude in North Cornwall (and a bit on Exmoor)

A zoomed in map of train station deserts in the Highlands

The Norfolk train-free zones

The West Wales no-rail-zone

Looking for trains in the Yorkshire Dales? Avoid this bit.
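The maps above come from a 30km buffer around the stations in a GIS. As a rough point-based equivalent, here is a Python sketch that tests whether individual locations are more than 30km from their nearest station. It assumes coordinates are eastings and northings in metres, so straight-line distance is just Pythagoras. The function names and all of the coordinates are made up for illustration.

```python
import math

BUFFER_M = 30_000  # 30km, expressed in metres

def nearest_station_distance(point, stations):
    """Straight-line distance in metres from a point to its nearest station."""
    px, py = point
    return min(math.hypot(px - sx, py - sy) for sx, sy in stations)

def beyond_buffer(points, stations, buffer_m=BUFFER_M):
    """Return the points that lie further than buffer_m from every station."""
    return [p for p in points if nearest_station_distance(p, stations) > buffer_m]

# Made-up (easting, northing) pairs, not real station locations.
stations = [(400_000, 300_000), (450_000, 300_000)]
candidates = [(410_000, 300_000), (425_000, 340_000), (400_000, 331_000)]

deserts = beyond_buffer(candidates, stations)
```

Checking every pair like this is fine for a teaching example, but with thousands of stations and a fine grid of test points a spatial index or a proper GIS buffer will be much quicker.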

Okay, so having answered one burning question, let's briefly turn to the other. How many areas in Great Britain (and I'm just referring to the island of Great Britain) do not have a station? For Local Authorities, I make it 12 out of 376 and for Westminster Constituencies, I make it 49 out of 630. I've screenshotted the two files here but you can also explore them yourself in Google Drive.

Many stations in the largest areas, obviously

Same as above - e.g. Highland covers a larger area than Wales
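Counting stations per area is a standard points-in-polygon job in any GIS. To show what is going on underneath, here is a minimal Python sketch using a ray-casting test on simple polygons. The function names, area names and coordinates are made up, and real boundaries (with holes, islands and multi-part shapes) need a proper GIS or geometry library rather than this.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside a simple polygon given as a list of vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def stations_per_area(areas, stations):
    """Count stations in each area; each station is assigned to at most one area."""
    counts = {name: 0 for name in areas}
    for x, y in stations:
        for name, polygon in areas.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break
    return counts

# Made-up square 'areas' and station points.
areas = {
    "Area A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "Area B": [(10, 0), (20, 0), (20, 10), (10, 10)],
    "Area C": [(0, 10), (20, 10), (20, 20), (0, 20)],
}
stations = [(2, 3), (5, 5), (15, 5)]

counts = stations_per_area(areas, stations)
without_station = [name for name, count in counts.items() if count == 0]
```

A point sitting exactly on a shared boundary can land on either side in a test like this, so stations right on a border are worth checking by eye.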

What should we conclude from this? Not much, but it's quite interesting to look at the local authorities or constituencies that do not have a train station. There are 2,557 stations listed in the Office of Rail and Road 2015-16 data that I used for this. The next two maps show where there are no stations - but there are possibly a couple of small inaccuracies (Kensington and Chelsea being one as three stations are right on the border there).

This is very interesting

If you've read this far, you should get out more

Okay, so that's about it. Some data notes below if anyone is interested. Also, the spreadsheets in the Google Drive folder have passenger entry and exit data - i.e. the headline 'passengers' figures that are used to identify the busiest stations - e.g. Waterloo with nearly 100 million in 2015-16. I have also added in average, max, min and sum figures on passengers for the aggregated local authority and parliamentary constituency numbers. Hours of fun.
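For anyone who would rather script the average, max, min and sum figures than do them in a spreadsheet, here is a small Python sketch of that aggregation. It assumes each station has already been matched to an area and reduced to an (area, passengers) pair. The function name, area names and passenger numbers are made up.

```python
from collections import defaultdict
from statistics import mean

def aggregate_passengers(rows):
    """Group (area, passengers) pairs by area and summarise each group."""
    grouped = defaultdict(list)
    for area, passengers in rows:
        grouped[area].append(passengers)
    return {
        area: {
            "stations": len(values),
            "sum": sum(values),
            "mean": mean(values),
            "max": max(values),
            "min": min(values),
        }
        for area, values in grouped.items()
    }

# Made-up (area, entries and exits) pairs, not real station figures.
rows = [("Area A", 1000), ("Area A", 3000), ("Area B", 500)]
summary = aggregate_passengers(rows)
```

Areas with no stations never appear in the result, so they would need adding back in with zeros if you want the full list of areas.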

Data notes: follow this link to get the 2015-16 data on stations that I used here - including the eastings and northings for station locations. I got the boundaries from the excellent ONS Geography Portal and they are, of course, Crown Copyright (but also open data). As in, Contains OS data © Crown copyright and database right (2017). The data are compiled by Steer Davies Gleave on behalf of the Office of Rail and Road and they are accompanied by this interesting two page summary. In addition to the two spreadsheets, I have also uploaded the images in this post to the Google Drive folder. Train station vs railway station? I'm not bothered about this, or with data is/data are.