Stats, Maps n Pix: 2016

Wednesday, 28 December 2016

Interactive Terrain Mapping in QGIS

During last year's winter break I did a little animation of Arctic sea ice, which eventually ended up being shown in slightly modified format in Svalbard Museum - and they even sent me a nice hat for my efforts. This year's spare time project was to look a bit more at terrain mapping in QGIS, with a purpose. Since I'm a Highlander in exile I dream of the hills quite a lot, and the mountains in particular. If you follow mountain weather reports for the Highlands, you'll hear of fairly regular avalanches. They may not have the ferocity of avalanches elsewhere in the world, but they can be deadly. The Scottish Avalanche Information Service do a great job of recording, documenting, mapping and educating here so I thought I'd add this data to some 3D interactives using Minoru Akagi's Qgis2threejs QGIS plugin and Ordnance Survey Open Data. If that sounds like gobbledygook to you, never mind - just look at the little animation below to see one of the outputs. The red dots are the locations of recorded avalanches since 1991. You can also play around with it in your web browser - it's quite good fun.

Avalanche data is from the Scottish Avalanche Information Service

I extracted the avalanche data from the SAIS webmap and just plotted the locations in QGIS. I decided to focus on an area I know relatively well - the Cairngorm mountain area. It's not high by world standards, at 1,245m/4,085 feet (6th highest in the UK), but at 57 degrees north and nothing much protecting it from the fierce winter weather, conditions up there can be extreme. In fact, I remember growing up sitting many times on the old chairlift as it swung in the wind! Not for the faint hearted. Anyway, as is the way with these things, the area I wanted to look at was split between map tiles, as you can see below in this QGIS screenshot.

I think there must be a name for the law of split map tiles

Methods-wise, I just patched together a few map tiles, clipped out the area I wanted to focus on, overlaid the avalanche location data, generated a hillshade with an azimuth of 180 and elevation of 27 (to simulate shadows at noon on 26 December), and then added some Ordnance Survey map tile data on top of it. I also did a version with Google satellite imagery, just to see what kind of result I could get. I've posted a little gif of the satellite version below, followed by more of the images. You can also play around with a smaller version of this in your web browser. It should also work on phones, but you might have to use three fingers to pan around.

This uses Google satellite imagery - interactive here
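For anyone curious what a hillshade with an azimuth of 180 and an elevation of 27 actually does, here's a rough sketch in Python. It is not the algorithm QGIS uses; it's a simplified version that shades each cell by the angle between the surface and the sun. The function name, the tiny example grids and the 10m cell size are all made up for illustration, and the grid is assumed to have row 0 at the north edge.

```python
import math


def hillshade(dem, cell_size, azimuth_deg=180.0, altitude_deg=27.0):
    """Simplified Lambertian hillshade for a 2D list of elevations.

    Assumes row 0 is the northern edge and columns run west to east.
    Returns values from 0 (dark) to 1 (fully lit); edge cells are None.
    """
    az = math.radians(azimuth_deg)    # clockwise from north
    alt = math.radians(altitude_deg)  # angle above the horizon
    # Unit vector pointing towards the sun, as (east, north, up)
    sun = (math.sin(az) * math.cos(alt),
           math.cos(az) * math.cos(alt),
           math.sin(alt))

    rows, cols = len(dem), len(dem[0])
    shaded = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences: gradient towards the east and the north
            dz_east = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_north = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell_size)
            # Surface normal is (-dz_east, -dz_north, 1), normalised
            length = math.sqrt(dz_east ** 2 + dz_north ** 2 + 1)
            dot = (-dz_east * sun[0] - dz_north * sun[1] + sun[2]) / length
            shaded[r][c] = max(0.0, dot)
    return shaded


# Hypothetical 3x3 grids with 10m cells
south_facing = [[20, 20, 20], [10, 10, 10], [0, 0, 0]]  # higher to the north
north_facing = [[0, 0, 0], [10, 10, 10], [20, 20, 20]]  # higher to the south
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]

for name, grid in [("south", south_facing), ("flat", flat), ("north", north_facing)]:
    print(name, round(hillshade(grid, 10)[1][1], 3))
```

With the sun due south, the south-facing slope comes out brightest and the north-facing one sits in shadow, which is the effect that gives the terrain its sense of depth.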

What you'll see below in the next five images are just screenshots from the final interactive from my web browser. You should see them in larger size if you click on them. I've modified some of the settings slightly but basically it's quite an easy thing to do in QGIS with the Qgis2threejs plugin. The updated document on this (18 Oct 2016) is also available in a single, handy pdf. If you want to use this tool, you'll find lots of useful tips in here.

Full view of the final interactive - looking north

Focused on Loch Avon (pronounced Loch A'an)

This shows the main areas of avalanche activity in one view

This is a view with the Lairig Ghru pass in the foreground

Another more vertical view of the southern part of the area

I like the effect here with the hillshade layer providing some sense of depth and shadow. There is no vertical exaggeration here and in the interactive version you can zoom in quite far before it becomes pixellated. I also wanted to see what kind of quality I could get by adding in a Google satellite layer and then exporting that. For this, I used the OpenLayers plugin in QGIS (though QuickMapServices is also really good for this). The final version was pretty big as I exported at a high resolution, but here are some snapshots.

The full view of the satellite version - click to enlarge

Loch Etchachan - watch this video!

A different view, showing a cluster of recorded avalanches

The resolution here is actually quite good

This is a view from 'over the back' of the ski area

One final overview image, just for completeness

As I said above, I also did a little extract of this focusing on a smaller area around Loch Avon. The satellite imagery changes part way along the loch but it's still quite pleasing. The interactive version works pretty well in the browser and the image quality here is also pretty good (I exported this at 200% in the Qgis2threejs settings).

Loch Avon - a nice little video from March 2016

It's not always winter here - take a look

Along the way, I almost forgot why I started to map this - the purpose was to plot some interesting and important data on top of the terrain in order to try to understand more about which areas are particularly prone to avalanches. Along the way, I learned more about Mike Spencer's snow hydrology research and PhD (more on that here) and found out some stuff that will help me update my teaching material on this topic. I did begin to look at the Ben Nevis area as well (below), but I decided to focus on the Cairngorms instead this time.

25 years of avalanche locations (red dots) for the Nevis Range

That's all for now. Thanks to the kind of data collected by the SAIS, Ordnance Survey Open Data, and great tools like QGIS and Qgis2threejs, it's becoming much easier to explore, analyse, visualise and understand important datasets. That is kind of what I was attempting here, as a little holiday experiment.

Thanks for reading.

Map data: Contains OS data © Crown copyright and database right 2016

Thursday, 22 December 2016

Creating a 3D city model using open data for England

This is a short informational post on how to derive building heights using open data, for areas in England. I am also sharing the data in the visuals you see below. However, if you have access to similar data in your country then you can replicate the method. It's basically the same as others have posted elsewhere, so I don't claim this is a first - I'm just posting it in case others find it useful. And I've also added local authority names to the building outline files, hence the different colours you see. First of all, though, here's an example of what you get at the end of the process.

You can find the raw data in shapefile format here

Of course, this alone would be no fun without having a little interactive version in the web browser, so here you go - knock yourself out! The buildings are coloured by local authority - blue for Trafford, purple for Salford and green for Manchester.

Click for a little interactive extract of the first image

Here's the basic method for adding heights to the building polygons.

Download some building footprint data. I used OpenMap Local from OS OpenData and this comes in named tiles (as per the naming conventions of the British National Grid - NH, NN, NS and so on). I used data from the SJ tile, covering Greater Manchester. When you download it you'll get loads of layers but in the data folder you should see one called Building.

This building data doesn't come with any height attributes so we need to find a way to add that in. This can be done by using free, open LIDAR data available for England and Wales via the Environment Agency. You can also get aerial photography here, by the way. The data come in a variety of resolutions but I chose the 1m version as I wasn't overly concerned about precision. If you download the 25cm resolution LIDAR data the files will be huge. I downloaded tile SJ89 and both the DSM and DTM products. The DTM is a digital terrain model so it gives us the lay of the land, as it were, and the DSM is a digital surface model so it also includes the height of anything standing on that land, such as buildings and trees. The image below probably explains it best.

Source: StackExchange

I then added the DSM and DTM files to QGIS (I used version 2.14) as a virtual raster. Otherwise you'll end up with loads of individual raster tiles if you add them one by one because the SJ89 LIDAR data comes in lots of little chunks. You can find out more about this - and how to add a virtual raster - in this extremely useful blog post. The brilliant Owen Boswarva also has some really useful info and links on his mapgubbins site, so check that out too.

Because the DTM gives us the height of the land and the DSM the height of any features on it, we need to use the Raster Calculator in QGIS to calculate the height of buildings. But fear not! This is also very simple, and is explained in the blog post linked to in the last bullet point. The only thing I did differently is I just manually selected the area of buildings that covered the SJ89 tile using the Selection tools in QGIS (and then saved it as a new shapefile layer).

Before using Zonal Statistics in QGIS to calculate building heights - and bear in mind they are not exact - I just had a virtual raster which was obtained by subtracting the DTM from the DSM layer in QGIS, plus a buildings layer which covered the same area, as you can see below. It was these two layers I used to derive the building heights.

This area is centred on Greater Manchester

Then I used the Zonal Statistics tool in QGIS to calculate a mean, min and max height for each building polygon from the LIDAR data. You could just use the mean but I wanted more information so I just left these boxes checked when running Zonal Statistics. The method is also described by Brendan, here, but you should note that in version 2.14 this tool will be in the Raster menu. Note that you may find nothing happens after running Zonal Statistics - don't worry, QGIS has just added the new fields (max, etc) to your shapefile without any fuss so check the attribute table.

Here's an example of my final shapefile, with new height attributes
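If you'd like to see the logic of those two steps outside QGIS, here's a minimal sketch in plain Python. It isn't what QGIS does internally. The function names, the -9999 nodata value and the tiny example grids are my own assumptions, and it treats each building as a set of pre-labelled raster cells rather than a real polygon.

```python
from collections import defaultdict

NODATA = -9999.0  # assumed nodata marker; check what your own rasters use


def height_above_ground(dsm, dtm, nodata=NODATA):
    """Subtract the terrain model (DTM) from the surface model (DSM), cell by cell."""
    result = []
    for dsm_row, dtm_row in zip(dsm, dtm):
        row = []
        for surface, terrain in zip(dsm_row, dtm_row):
            if surface == nodata or terrain == nodata:
                row.append(nodata)  # keep gaps as gaps
            else:
                row.append(surface - terrain)
        result.append(row)
    return result


def zonal_stats(heights, zones, nodata=NODATA):
    """Mean, min and max height for each building id found in `zones`.

    `zones` has the same shape as `heights`; None means "no building here".
    """
    cells = defaultdict(list)
    for height_row, zone_row in zip(heights, zones):
        for height, zone in zip(height_row, zone_row):
            if zone is not None and height != nodata:
                cells[zone].append(height)
    return {
        zone: {"mean": sum(v) / len(v), "min": min(v), "max": max(v)}
        for zone, v in cells.items()
    }


# Hypothetical 2x3 rasters (metres) with two made-up buildings, "A" and "B"
dsm = [[52.0, 58.0, 60.0], [52.5, 59.0, 64.0]]
dtm = [[52.0, 52.0, 52.0], [52.0, 52.0, 52.0]]
zones = [[None, "A", "B"], [None, "A", "B"]]

heights = height_above_ground(dsm, dtm)
stats = zonal_stats(heights, zones)
for building, values in sorted(stats.items()):
    print(building, values)
```

Because each building ends up with a single mean, min and max, you can see why roof shapes get lost with this approach: all the variation inside the footprint is boiled down to three numbers.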

Once you've done that, you should have the ability to analyse building heights and produce a fairly credible 3D model of your area of interest. Just bear in mind that things like pitched roofs, curved roofs and other shapes will not come out when using this Zonal Statistics method. For that, you could of course use the raw DSM data if you wanted.

That's it really. I'll finish with a final big version to which I have added some labels. Right click and you should be able to see it full size in a new window or browser tab.

Hastily patched together, but I hope you get the idea...

Friday, 16 December 2016

The 8 English Regions of a Federal UK

The political geography of England can be a contentious topic, particularly when it comes to the craft of drawing boundaries on maps. In fact, people can get quite agitated about it, and probably with good reason. But it’s too important to ignore, particularly at a time when the very nature of the UK as a political entity is being questioned and challenged. In this short essay I outline what I think new regions of a Federal England could look like, based on a combination of cultural, historic, administrative, economic and geographic factors. The boundaries you see below are my interpretation of ideas sent to me by Philip Brown and Nathan Pearce, two former students of the then Department of Town and Regional Planning in Sheffield (since re-named) who now work in professional planning practice. This doesn’t mean they are right, but they do at least have some grounding in the subject, and an unhealthy interest in regional maps and planning. I do too, so I have taken their note and turned it into a series of maps. The final map is shown below, to kick things off. Read the full text to see how I got there. You can see high resolution images of all maps here. You should also be able to right-click maps to open them in larger format in a new browser tab.

Read the rest of the post for the full story

Let’s put the names to one side for a minute because I'll come back to that later. Let’s forget too the big differences in size you can see in the first map. There is some logic there too, I promise. The question of what the English regions should look like is not one with a single correct answer, but I think it is possible to arrive at some kind of best fit compromise where everyone is happy. Okay, that was a lie. But I do think we can draw lines that people can live with and that make sense and work well for governing, planning and economic activity that also reflect historic and cultural factors - and without the need to re-draw local authority boundaries once again.

Previous regional boundaries have made some sense, but I feel they have never been quite right, for a number of reasons. For example, when I did my PhD it was about the North West region of England, which at the time had recently inherited Cumbria from the old ‘North’ region. Westmorlanders may have been aghast, but it happened anyway so I think we need to revisit a bit of the history of all this before going any further, lest we repeat the mistakes of the past.

Background

I recently co-authored a paper on US megaregions with Garrett Nelson, and after seeing the level of interest it generated, it struck me once again just how much people care about this kind of thing. In England, I think people may care even more, particularly because there has never really been a suitable regional structure - and because pre-existing geographies are layered through the centuries and embedded in our minds. In the context of devolution for other parts of the UK, I can understand why there is unease in parts of England - particularly when there is the perception of ‘London vs the rest’ - though I won’t dwell on that here. What I will say is that the idea of the English regions as ‘the dog that never barked’ by Christopher Harvie (1991), is one that resonates strongly today. Some kind of federal arrangement makes sense, but the question of how to make it work is a vexed one.

I saw in the news recently the leader of the Scottish Labour Party raise the question of a future Federal England within a UK and giving more power to the English Regions. It’s a sentiment echoed elsewhere and in the context of the post-Brexit/Indyref 2 question of what the UK will look like in the future it’s a very important one. On the whole, I’m in favour of this kind of federal arrangement, if only we can just figure out what shape the regions should take, how such a system would function and who would decide how it all works. This structure could provide a useful political and economic counterbalance to London, going beyond the current ‘powerhouse’ or 'engine' style plans for the North and Midlands in England and making permanent a decentralised, fairer political power structure across the country.

As I said above, though, this is a difficult topic to discuss because boundaries drawn the ‘wrong’ way can end up threatening people’s identity, among other things. I was reminded of this recently when Jonn Elledge of the New Statesman and CityMetric published a small English Regions map and Twitter responded somewhat impolitely. So, it’s a difficult topic, but that shouldn’t stop us discussing it. That’s why I waded in to a Twitter discussion between Jonn Elledge, Duncan Weldon and Chris Cook the other day to offer my thoughts, as you can see below.

See some more comments on this on Twitter

Before going any further, though, I think it is useful if we look briefly at some historic, ceremonial and administrative boundaries to see a little bit of what has come before.

Old boundaries

At this point, I could post any number of different maps that define English ‘regions’, but I’m going to stick to three geographies. You can see lots of others via a quick search, including some interesting telephone call-based ones for the entire country. I use the term ‘region’ somewhat loosely here of course.

First, let’s look at what are termed ‘ceremonial counties’, available from Ordnance Survey. These counties include modern-day Greater London but also a big mix of other, less familiar shapes. These counties have an important history so I wanted to make sure I took them into account and referred to them when making the Brown-Pearce regions of England you see below and above.

These should be pretty familiar

Second, we have what are considered ‘historic counties’, such as Westmorland. These have largely disappeared from our maps, but not from our minds, and this is important. In fact, they often retain a special significance in people’s minds and it’s common to see letters addressed, or goods labelled using these historic names. This kind of emotional attachment is both logical and understandable, so we should pay attention to it in any regional rejig. For example, Middlesex is a name that pops up regularly but it’s much rarer to see it on a modern map. Nonetheless, it is an important mental and historic geography, even if it is no longer an official administrative one. I mention this to highlight the fact that the sense of identity tied to historic place names is powerful and significant. The way modern Greater London intersects with these historic counties is particularly interesting because it has mostly been forgotten, though not by many who live there or who are from there.

Yorkshire is as big as ever

Note the Middlesex/Hertfordshire 'wiggle'

The two examples above may not be ‘regions’ in a formal administrative sense, but they fit my definition here, which is basically bounded spaces that go beyond the local. That sounds a bit waffly, but we must remember that ‘region’ is a term we could apply to Uttar Pradesh, with over 200 million people, or to the old North East Government Office Region of England, with 2.6 million people. Talking of which, the first map below shows English Government Office regions before they were abolished in 2010. As you can see when I add in some of the earlier boundaries in the second map, these align with historic notions of regions in places, but not in others. It’s also worth remembering that ‘region’ itself is often an extremely loaded term. Try arguing that Scotland is just another region, for example, and you’ll see what I mean.

Ye Olde Regions of England

Once again, with meaning

These GOR boundaries weren’t loved; and most administrative geographies aren’t. They were at times a useful governing device but they are not the kind of regions we could get excited about, or angry about; unlike, say, the 1974 reorganisation of local government in England. One issue that strikes me here is how London is too ‘small’ from an economic geography point of view - for more on this, read this excellent piece by Barney Stringer. That is, it’s ‘underbounded’. Greater London contains about 8.6 million people but its economic power and orbit stretches much further beyond the boundary, into and beyond the metropolitan green belt.

I mention this here not to annoy everyone in Berkshire or Surrey, but because it features as one of the guiding principles in the Brown-Pearce regionalisation below. I also agree that if there were to be some kind of Federal England with new regions, then London should be bigger. But how big? 13.42 million people big, that's how big. Read on.

After watching the ‘regions of England’ Twitter exchange from the sidelines, Philip Brown got in touch to say that he and Nathan Pearce (a proud Janner) had something up their sleeves on this subject. Here’s a little extract of what Philip wrote.

"Basically Nathan Pearce and I, when we probably should have been studying for our coursework or something, once had the exact same discussion you, Chris Cook, Duncan Weldon and John Elledge have just had on Twitter."

"Except we were in a University computer room so spent far too long on it as we tried to solve the question of English devolution on cultural ties, geography, and also whilst largely respecting history too, lol! Hopefully these factors combined would make it tolerable to most in modern day England too."

The Brown-Pearce plan splits England into eight ‘Kingdoms’ and promises to ‘solve UK wide devolution forever’… (I like their confidence, albeit tongue-in-cheek). Thankfully, Philip still had the original Word document from when he did this in about 2010, so he sent me it and I republish it below for everyone to marvel at. Please don’t send him hate mail. He’s a very nice chap and a proud Yorkshireman.

"Each Kingdom to have an Assembly of similar powers to those presently in Greater London or Wales, but Parliament shall remain sovereign.

Dumnonia: The counties of Cornwall & Devon.

Wessex: All of the South West England Government Region bar Dumnonia.

Thames & Solent: The counties of Berkshire, Buckinghamshire, Oxfordshire and Hampshire.

Southern: The counties of Surrey, Sussex & Kent.

East Anglia: The East of England Government Region.

Northumbria: Everything from the Scottish border south to the southern borders of Cheshire, Greater Manchester, South Yorkshire & North Lincolnshire with the Dark Peak of the Peak District National Park also included within Northumbria (see Peak District Core Strategy for map of the Park's three regions, namely the Dark Peak, White Peak and South West Peak).

Mercia: Largely the Government Regions of the East Midlands and West Midlands, minus the Dark Peak. Notes on Mercia: the former coalfields of North Derbyshire/North Nottinghamshire are welcome to join Northumbria if they wish. Similarly Lincolnshire is welcome to join East Anglia if they wish - perhaps via a referendum.

County of London: The present county of Greater London plus approximately 10 miles in all directions, pending appropriate boundaries, but with the Mayor of London given strategy over the entire Metropolitan Green Belt. Any modifications outside of the County of London must be made in conjunction with the appropriate local authority and Kingdom. If existing District Councils fall approximately within this 10 mile extension, they shall become new London Boroughs by right, other areas shall either be subsumed within existing or newly created Boroughs, pending appropriate boundaries.

Reflecting on this 5 years on it still largely works so feel free to create it as a map for Messrs Cook, Elledge and Weldon; if you genuinely are so inclined."

Unless you have an encyclopaedic knowledge of English administrative geography, this is quite hard to visualise, so here’s a second map of the Brown-Pearce regions, this time with population figures for each region, which I calculated using 2015 mid-year estimates from the ONS, plus some towns and cities. A brief word or two on methods follows this.

I've added in a few towns and cities this time

Methods

This is not a serious policy proposal for the re-establishment and re-organisation of English regions, but I am serious about provoking discussion on the subject. So, for practical reasons, I used existing local authority boundaries as the basis for creating the regions you see above. There are 326 in total across England and in the map legend you can see how many local authorities fall into each region.

For Dumnonia (from the Brythonic kingdom in Sub-Roman Britain), Wessex and Thames & Solent I followed the spirit of the Brown-Pearce plan to the letter. Dumnonia covers the same area as the present counties of Devon and Cornwall, for example. Their Southern region and East Anglia are as described but minus those local authorities which are within 10 miles of the Greater London boundary. This may seem like an arbitrary cut-off, but when you look at where the metropolitan green belt is and the wider commuting patterns of the area, it makes a lot more sense. This can be seen reasonably well in the County of London zoomed-in map below.

The bright lines are commuting flows. The dark green is green belt.

Mercia and Northumbria are more straightforward - I kept these as described but made sure to put High Peak in the Northumbria region, rather than in Mercia - as I think this makes sense on a number of fronts, including the geographic. Note Philip’s generosity when he writes ‘the former coalfields of North Derbyshire/North Nottinghamshire are welcome to join Northumbria if they wish’. I’m not massively keen on some of the names, so I’ve had a go at re-branding them at the end.

Following their instructions and using precise measurements with my GIS toolbox, I ended up with a pleasingly round shape for the County of London, apart from a couple of areas which I thought shouldn't be in the ‘County of London’ (for fear of civil war, but also functional economic reasons). One example was Royal Tunbridge Wells, so I removed this and put it back into the Southern region. Once I’d sorted this little tweak I was left with something I was happy with. I also calculated the land area of each region, as you can see in the third iteration of the map, below.

This time with area metrics, in square kilometres

At first glance, this map might seem somewhat odd, but I think it is actually a pretty good representation of what a Federal England could look like. For example, we have three very large regions in terms of population, with the County of London (13.42m), Mercia (10.34m) and Northumbria (15.28m) the big hitters all with more than 10 million people. Northumbria has the most local authorities, at 73, one more than the County of London which could be thought of as some kind of symbolic gesture, but it’s just a coincidence. The Brown-Pearce plan therefore increases the number of ‘London Boroughs’ from 33 to 72. I'm sure that won't cause any political problems.

The County of London is quite a nice shape but of course it kind of tramples over some important historic and ceremonial boundaries as you could see above. But I would argue - and have before - that on a functional economic basis London is actually much bigger than Greater London. Things like commuting, housing and economic growth need appropriate economic geographies if they are to be governed and planned properly. Too many important issues are stifled by inappropriate boundaries - a topic not unrelated to the wider political turmoil we find ourselves in here and in the United States.

The way spaces are divided has very significant implications in the real world. In the same way that political gerrymandering can skew the balance of power in elections, economic underbounding can limit growth and opportunities for development. This is not a new argument, of course, and I’m making it simplistic here, but there is a need to engage with this question more seriously, particularly for London.

Set in contrast to the three big regions are five others with varying characters, histories and identities. The large variations in population I don’t consider to be a problem. So long as the structures are right, this can work well. The examples of US states or German Länder may offer useful comparators here even if they are far from perfect.

But, like I said above, this is really just food for thought for now because I was intrigued and impressed with the regions set out by Philip and Nathan. One thing I wasn’t so keen on was the names, so I’ll end by addressing that.

Summing up

The slightly experimental, playful nature of this piece got lost somewhere above when I began talking about gerrymandering and economic underbounding, so to bring it back down to earth and in the slightly frivolous pre-Christmas spirit that I approached this in, I decided to take the liberty of re-naming and/or re-branding some of the regions. I’ve therefore posted the map below, followed by some further explanation.

Definitely not entirely serious

Northumbria becomes The North of England, because I think it fits better. I know that the historic area of Northumbria covers a much wider region than what we think of today, but it would be a stretch for me or anyone else in South Yorkshire or Merseyside to accept this. I think most people could live with this new name, particularly if they still have a local authority or city region to cling to.

Mercia suffers slightly from the same issues as Northumbria so I have re-named this rather grandly as The Heart of England. There is some historical basis for this and I think it sounds nice, so that’s that.

Dumnonia? Well, I give Philip and Nathan credit for knowing this but it’s just too obscure for me. So, I’m going to go with The Sunshine Coast. Hmmm. What about the inland bits? That is a good point, but I couldn’t think of anything better so I’m happy to receive suggestions but I like the positive vibes generated by the word and the image. And, also, as a native Highlander it's always seemed a very sunny place to me.

Wessex is a difficult one because it’s so fixed in the popular imagination without people really having a precise idea of where it is. Somewhere near Bristol? Something to do with Thomas Hardy? That’s why I’ve changed it to Greater Wessex. Including Great or Greater in things always seems to work as people like to be Great. But, slightly more seriously, I think it adds a nice bit of fuzziness that helps soften the blow for non-Wessexian Wessexers.

Thames and Solent works for me and I can’t think of an improvement so I’m leaving it at that. I may just change the ‘and’ to ‘&’. This may actually have been part of the original plan anyway.

For East Anglia, I’m going with Greater again because I think this helps highlight its size and scale. It means that we have Greater Anglia the region and also the train operating company but that’s unavoidable. Greater Anglia it is. This 'Greater' thing really is a winner.

Southern is just a bit vague for my liking as it could just refer to the south of England so since it draws from the existing historic counties there I have just added that to the end to make it Southern Counties. I like the way this sounds and we need one that pays homage to historic counties in this way by actually having the word 'counties' in the name.

Finally, we had County of London. If London is going to trample over existing or historic county boundaries that people know and love, I think we at least have to respect that and not use the county designation for the name of a new region. For that reason, and drawing upon international examples, I simply re-named this Metro London. This partly gets to the fact that these areas are definitely not ‘London’ proper but part of its wider economic sphere of influence. They may still be in (e.g.) historic Buckinghamshire but they retain very powerful, important ties to the London metropolitan area. We also have the Metropolitan Green Belt here, which made me think the name was a good fit. Here are those regions one final time, then, with the old Government Office Regions overlaid on top.

New ones are better than the old, I feel

So, that’s that. As I’ve said, this is entirely experimental and is intended to provoke discussion. I’d be very keen to see what other people come up with, either in relation to drawing different boundaries or re-naming the ones presented here. The reason I used the year 2020 in the maps and title of the blog post is that, post-Brexit, we might actually need some ideas on all this. Maybe, if it happens.

Also, in case you were wondering, Wales remains Wales, Scotland remains Scotland and Northern Ireland remains Northern Ireland. This was deliberately just about England.

If you made it this far, thanks for reading.

Saturday, 19 November 2016

Great Britain: Population 7.4 billion

My recent post on the EU's new global population density datasets got me thinking once again about the issue of population density across the world - and how it varies hugely. Some people think England is particularly crowded and some would probably say that Great Britain as a whole is quite a tightly packed little island. But of course this is all relative. I was reminded of this recently when I discovered that the Philippines is now on Google Street View. Since I had a few spare moments and because my brother lives in Manila I went for a little tour around the city, and was struck by the sheer density of it. As it turns out, Wikipedia and other sources say the City of Manila is the most densely populated city on earth, with over 41,000 people per square kilometre. This is followed by another Metro Manila area (Pateros), at over 30,000, and then Dhaka in Bangladesh at over 28,000. Where am I going with this? Using these figures as a reference point, I decided to see whether the entire population of the world - currently about 7.4 billion - could fit in the island of Great Britain. The answer is yes. Some maps and a few words below to help explain...

I've cut out a chunk of Manila and tiled it over GB - somewhat bigly

If you scale things a little more closely to the real world, you begin to get a sense of what this kind of density would look like on the ground - and remember that in some parts of the world people do live at these densities. Just not in the South West of England any time soon, thankfully.

I believe getting planning permission for this might be an issue

To the other end of the country now, around the far north east corner of Scotland, including Wick (current population about 7,000). Not much room to breathe here. There isn't much room left for roads or train lines or parks or anything else, so day to day life might be just a little complicated.

Transport, waste, communications and a few other things would be a bit tricky

There are about 7,400,000,000 people in the world now, according to current best estimates, and the land area of the island of Great Britain is about 210,000 square kilometres. The maps here don't have lochs and lakes cut out but my calculations do take this into account. So, this gives us a population density of 35,238 people per square kilometre if we had to accommodate the whole world in Great Britain. Remember that is a lower density than the City of Manila (i.e. the inner part of Manila with a population of 1.7 million, rather than the whole of Metro Manila - with 13 million). Let's look at a few more maps now.
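For anyone who wants to check the arithmetic, here is a minimal Python sketch using the rounded figures quoted above. The variable names are just for illustration and both inputs are approximations, so treat the result as a ballpark rather than a precise figure.

```python
# Rounded figures quoted in the post; both are approximations.
world_population = 7_400_000_000
gb_land_area_km2 = 210_000

# People per square kilometre if everyone lived on the island of Great Britain.
density_per_km2 = world_population / gb_land_area_km2

# City of Manila figure quoted earlier ("over 41,000"), used here as a lower bound.
manila_density_per_km2 = 41_000

print(f"Required density: {density_per_km2:,.0f} people per sq km")
print(f"Lower than the City of Manila: {density_per_km2 < manila_density_per_km2}")
```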

Merseyside and surrounding area

Central London, with a slightly wonky looking Thames

For reference, there are about 300 people per square kilometre in Great Britain at present. There are about 5,500 people per square kilometre in London and about 6,300 in Tokyo. New York City has a population density of about 11,000 and Paris is quite tightly packed, at about 21,000 per square kilometre (for the 20 arrondissements). Manhattan has about 26,000 people per square kilometre.

There is loads of stuff on the internet about this general topic, including the excellent Per Square Mile by Tim De Chant. The most densely populated country or territory is Macau, at just over 21,000 people per square kilometre. If all this metric stuff is confusing, then I can tell you that in imperial units the density needed to accommodate the world in Great Britain is about 91,000 people per square mile. No matter how you measure it, that's a lot. Even Manhattan only has about 67,000 people per square mile.
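The metric-to-imperial conversion can be checked in the same way. This is a small sketch that assumes the rounded inputs used in this post (7.4 billion people, 210,000 square kilometres, and about 26,000 people per square kilometre for Manhattan); the only hard fact in it is that a mile is exactly 1.609344 kilometres.

```python
# One mile is exactly 1.609344 km, so one square mile is that value squared.
KM2_PER_SQ_MILE = 1.609344 ** 2

def per_km2_to_per_sq_mile(density_per_km2):
    """Convert a density in people per sq km to people per sq mile."""
    return density_per_km2 * KM2_PER_SQ_MILE

# Rounded inputs from the post (approximations, not official statistics).
world_in_gb_per_km2 = 7_400_000_000 / 210_000
manhattan_per_km2 = 26_000

world_in_gb_per_sq_mile = per_km2_to_per_sq_mile(world_in_gb_per_km2)
manhattan_per_sq_mile = per_km2_to_per_sq_mile(manhattan_per_km2)

print(f"World in GB: {world_in_gb_per_sq_mile:,.0f} people per sq mile")
print(f"Manhattan:   {manhattan_per_sq_mile:,.0f} people per sq mile")
```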

The obvious question now of course is what we should do with the rest of the world. Turn it into a park? Nature reserve? Museum? I'm joking of course - there is also a more serious point here. I'm just trying to put some perspective on the issue of population density across the globe and how we measure it. It's tempting to look out the window or use our day to day lives to assess what's 'normal' - and of course this is natural. But when I've been looking more closely at the GHSL global population datasets recently I have been amazed at just how densely populated some cities are - as you can see a little bit from my previous blog post on the topic.

London and the surrounding area - not actually all that dense

Notes: okay, this was mainly just a little bit of fun but I did want to try to put it into perspective, not least because I'd also read a myth-buster piece by Simon Rogers and James Ball about the claim that everyone in the world could fit on the Isle of Wight. Here's the bit of Manila I used for the urban area in the maps above. I'm not seriously suggesting we concrete over Great Britain and attempt this so please don't send me hate mail about the green belt! Also, here's a piece from 2013 that does the same kind of thing, though not with Great Britain. Finally, and somewhat ironically since this was partly a procrastination thing, the 2013 piece was originally by Tim Urban of Wait But Why. His very amusing procrastination TED talk on YouTube now has over 4 million views.

Friday, 4 November 2016

The 435 Congressional Districts of the United States in one giant poster

In his last State of the Union Address, President Barack Obama called for an end to gerrymandering - the process of drawing political boundaries in a way that favours one party over another. I knew a bit about the topic from my time in the US, but I wanted to see what all 435 Congressional Districts looked like, so I ran off a set of maps. This was later picked up by WIRED* and shared quite widely. This week, I saw US elections guru Stephen Wolf had published a new shapefile of Congressional Districts which included the revised Florida, North Carolina and Virginia boundaries. I had a few extra moments due to something being cancelled at the last minute, so I ran off a set of maps and turned them into a gif and a poster. I shared the gif on Twitter but am posting more material here in case anyone is interested. First of all, here's the massive poster with all 435 Congressional Districts, arranged alphabetically by state and District number.

Not to scale - it's about comparing shapes - bigger version

You should be able to click on the above image to see the labels more clearly. If you want to download a really gigantic version, have a look here for the 13MB, 16,527x16,841 pixel monstrosity. Shapes alone can't necessarily tell us whether an area is clearly 'gerrymandered' or not but it's fair to say that in some cases it's a pretty good sign!

I also created two gifs which animate through all 435 Districts at different speeds. The first one below is the one I previously shared on Twitter and the other one is a slightly slower version. The intention of the first was to leave just enough dwell time on each frame so that you can perceive the variety of shapes but also see all 435 in under a minute. In the second, I'm trying to allow more cognitive processing time.

It's supposed to be somewhat hypnotic

The first one is pretty fast, with only a tenth of a second for each Congressional District. The version below shows each District for half a second, so might be a little bit more useful but then again it takes longer to run through.
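To put numbers on the trade-off between the two gifs, here is a quick sketch of the total loop time at each frame duration. It assumes one frame per District and no extra pause at the end of the loop, which may not exactly match the files themselves.

```python
DISTRICTS = 435  # one frame per Congressional District (assumed)

# Frame durations in seconds for the fast and slow gifs.
frame_durations = (0.1, 0.5)

# Total time for one full loop at each frame duration.
loop_seconds = {d: DISTRICTS * d for d in frame_durations}

for duration, total in loop_seconds.items():
    minutes, seconds = divmod(total, 60)
    print(f"{duration}s per frame: {int(minutes)} min {seconds:.1f} s per loop")
```

So the fast version gets through everything in under a minute, while the slower one needs a few minutes of your attention.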

The definitive gerrymandering gif? Maybe not, but it's a quick summary

Here I have been experimenting with display techniques, partly as a way of educating myself on an important subject and partly to figure out the best way of representing the data in an easily digestible form. I also posted each of the individual frames from the gifs to a Google Drive folder in case anyone wants to use them.

I like the small multiple approach of the poster better in many ways, because you don't have to wait for the gif to scroll through the images and you can make visual comparisons between, say, frame 2 and frame 200 straight away. You also don't have to interact with it in the same way, so I find it more accessible. I actually made it this way so that it could be printed out as a large poster and used as a focal point for discussion and debate - which I think is one of the things maps can be very useful for.

Finally, here's the poster in two different colours. There's also a folder with all three versions in different sizes.

Does it look better or worse in green? Bigger version

A classier colour, I feel - bigger version

Here's the Obama 'gerrymandering' part - source

*The person who wrote that WIRED piece has since been 'moved on' owing to some sub-optimal journalistic practices but the piece remains online under an editorial note.

Sunday, 23 October 2016

The Global Human Settlement Layer: an amazing new global population dataset

At the recent Habitat III conference on housing and sustainable development in Quito, Ecuador (17-20 October 2016) the European Commission launched a new Global Human Settlement (GHS) dataset. That's what this post is about. Before that, here are some basic data details - Landsat data from 1975, 1990, 2000 and 2014 were processed and analysed in order to produce three different GHS products: one on population (GHS-POP), one on built-up areas (GHS-BUILT) and one city model dataset (GHS-SMOD). But this is already getting too technical, so let's look at some maps - I've just created a few in 3D for fun in order to give you an idea of what the population dataset looks like. Each image below uses the 250 metre resolution one (there is also a 1km cell version for population).

I've added a few place names and extruded cells by population

The London example is pretty interesting and I think provides a nice overview of settlement patterns both in terms of distribution and density. As you can tell, this is just a small chunk of the data - it covers the whole world so is a pretty big file - but more on that below. For that reason, I took a smaller extract to explore it further and for this I exported the United States as a separate file and looked at four metro areas I thought would be interesting from a population density point of view: San Francisco, Los Angeles, Houston and New York. The highest population value in any single 250m cell in the London example above was 1,595, so I also thought it would be interesting to compare them to the US cities. Let's take a look, starting with the Bay Area around San Francisco.

That big spike in the north? That's San Quentin State Prison

This was also just a little extract, but again it gives an interesting view of population density. The population spike in the north of the Bay Area surprised me but then I looked at it more closely. The data showed a figure of 4,856 people in that 250m cell, which seemed pretty high so I dug a bit deeper. The Wikipedia page for San Quentin State Prison tells us there are 4,223 prisoners (137% of capacity) and another quick search tells me there is employee housing there too, so this figure stacks up. The next highest value is in San Francisco, so this makes sense too. But what about Los Angeles - how did that compare?

This is just a part of the wider Los Angeles metro area

The highest population value in any of the 250 metre cells in Los Angeles was 2,285, which I was a little surprised at because I didn't think it would be much higher than London. This was just a quick and dirty extract, so no labels here (or scale bars, sorry) but you do get a sense of the urban density and distribution of settlements here. Somewhere I did think would show much less density was Houston, and I was proved right here, as you can see below.

The sprawling metropolis of Houston

The highest population figure in any one 250 metre cell in Houston was 813, according to the GHS dataset. This is of course not surprising but I found it quite interesting to see it like this. Finally, I wanted to see what New York and its wider metro area looked like. I thought it would beat San Francisco for density, and it did.

An obvious spike in population density in most New York City Boroughs

The highest population figure for New York City per 250 metre cell was 6,189. This makes sense when you think that a tall residential apartment building can easily fit within one such cell - and in fact multiple buildings can. Mind you, it's still a pretty high figure.
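It can help to turn these per-cell counts into the more familiar people per square kilometre. The sketch below assumes each cell really does cover 250 m by 250 m on the ground (so 16 cells per square kilometre) and uses the peak single-cell values mentioned in this post. These are peaks for one small square, not city-wide averages, so they are not directly comparable with the headline density figures you see quoted for whole cities.

```python
CELL_SIDE_KM = 0.25                  # 250 metre cells
CELL_AREA_KM2 = CELL_SIDE_KM ** 2    # 0.0625 sq km per cell

# Highest single-cell population values quoted in this post.
peak_cell_population = {
    "London": 1_595,
    "San Quentin (Bay Area)": 4_856,
    "Los Angeles": 2_285,
    "Houston": 813,
    "New York": 6_189,
}

# Equivalent density if a whole square kilometre were packed like that one cell.
peak_density_per_km2 = {
    place: count / CELL_AREA_KM2 for place, count in peak_cell_population.items()
}

for place, density in sorted(peak_density_per_km2.items(), key=lambda kv: -kv[1]):
    print(f"{place:<24}{density:>10,.0f} people per sq km (peak cell)")
```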

These examples are from the 2014 dataset but there is so much else to see, if you have the time and skills to explore it. I'm at risk of becoming addicted to it, so I'll have to restrain myself. For now, I recommend that you check out the European Commission web pages on the data.

The rest of this post includes more technical information, possibly of interest to only a few data/GIS nerds with nothing better to do with their lives.

About the data (and yes, it's open and free)

The most important thing is to know where to get the data (once you've read about what it is) but this can involve endless clicking so here's an FTP link to the downloads. I've focused on the population part of the dataset here and it comes in TIF format. The 250 metre resolution one is about 626MB in size, so you need to have a decent machine to work with it. In terms of map projection, it comes as World Mollweide (EPSG:54009).

Quite a big file, but not too bad considering it's global

Here's the Copyright text file for the datasets

You can then open the file in your chosen GIS - I've shown a couple of examples of this below; one with ArcGIS and one with QGIS (the dataset notes file specifically mentions both of these). I have found it easier to work with in QGIS so far. When you first open the file you won't be very impressed - some further styling is needed. Also, in ArcGIS the high value suggests an impossible figure and in QGIS the values go from zero to zero - again, this just needs some tweaking in order to display something meaningful.

Notice the strange high values - that's not right!

Yep, nobody lives on earth (0 to 0 in the values on the left of the image)

Once you get the data on screen, you can start to style it and get something meaningful in front of you. Here's an example from ArcGIS, where you can see that there is some 'blockiness' in the data in some areas - it's not perfect at 250m resolution so at times the 1km product may be better on this front.

The high value of 7368 seems more reasonable here

Now let's take a look at a more cleanly styled view, this time for England. As you can see below, this now gives us quite a nice overview of the settlement pattern for the country.

This is just the original raster dataset, zoomed in

Since I extracted the data for just the United States, I also have a nice separate 250m cell version of that. I actually converted this to a vector layer in QGIS (and it's about 850MB) so here's what that looks like for the lower 48 states. I think this is quite pleasing to the eye. Click to make it bigger - it's a good approximation of the settlement pattern of the United States.

This is a vector version of the 250 metre population dataset for the US

One thing I haven't yet got to the bottom of is what the maximum population of any single 250m cell is. In both ArcGIS and QGIS, the maximum seems to be 634,492 - which isn't right. You definitely can't fit that many people in a 250 metre square! Hopefully someone will get to the bottom of this. I think this figure might come from aggregated blocks of cells in the data but so far I haven't had time to figure it out.

How to work with the data

Working with the data is quite tricky so here are a few tips for how I dealt with it in QGIS. This last part describes how I extracted a subset of the original massive 626MB TIF so that I could work with smaller chunks and then convert it to vector format for doing some 3D maps. All I did was load up the full 250 metre resolution population dataset and then go through the steps you can see in the screenshots below.

This is the original dataset, zoomed to Liverpool and Manchester

You can then just select an area of the TIF to extract by clicking and dragging

Using the new TIF, I then converted it to a vector layer

This is the new vector layer, zoomed and symbolised
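If it helps to see the idea behind those two steps (clip a window out of the raster, then turn cells into polygons), here is a toy sketch in plain Python. It does not call QGIS or any GIS library, the grid values are made up, and the function names `clip` and `vectorise` are hypothetical. Real tools may also merge neighbouring cells with equal values, which this sketch does not do.

```python
CELL_SIZE = 250  # metres per cell side

# Toy grid standing in for the population raster (values are made up).
grid = [
    [0, 0, 5, 9],
    [0, 12, 30, 0],
    [3, 0, 0, 7],
]

def clip(grid, row_start, row_stop, col_start, col_stop):
    """Return the sub-grid inside a half-open row/column window."""
    return [row[col_start:col_stop] for row in grid[row_start:row_stop]]

def vectorise(grid, origin_x=0, origin_y=0, cell_size=CELL_SIZE):
    """Turn each populated cell into a square polygon with a population value.

    (origin_x, origin_y) is the top-left corner of the grid; y decreases
    as the row index increases, as in a north-up raster.
    """
    features = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value <= 0:
                continue  # skip empty cells so they produce no polygon
            x_min = origin_x + c * cell_size
            y_max = origin_y - r * cell_size
            x_max = x_min + cell_size
            y_min = y_max - cell_size
            ring = [(x_min, y_max), (x_max, y_max), (x_max, y_min),
                    (x_min, y_min), (x_min, y_max)]  # closed ring
            features.append({"population": value, "polygon": ring})
    return features

subset = clip(grid, 0, 2, 1, 4)   # rows 0-1, columns 1-3
features = vectorise(subset)
print(f"{len(features)} populated cells became polygons")
```

Each populated cell becomes its own polygon with five coordinate pairs, which gives a feel for why vector versions of big rasters get so large so quickly.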

Finally, I decided to do a little bit of experimenting with the 2.5D symbology options in QGIS (available from version 2.14 onwards). The images at the top of the post were done in ArcScene (part of ArcGIS). Ideally I'd have done this in Blender instead, but that would have taken too much time. Also, I'm waiting to see what Steve Bernard and others might do with this dataset - there are so many possibilities and so far my Blender skills are really limited.

This might break your computer if you try too big an extract (e.g. an entire country)

Finally, a zoomed in version of the above viz

There's so much that you could do with this data for research purposes, or just for fun, but the first hurdle is getting your head round the data and how to work with it. This post is just intended as a small contribution in that vein. I hope some find it helpful.

Notes: the GHS population dataset is a giant raster (TIF format) of 626MB when compressed. I created an uncompressed version (by mistake) and it was 33GB! There are 141,969 columns and 60,829 rows in the full raster - this comes to about 8.6 billion cells, so I don't recommend trying to convert the whole thing to a vector layer because it won't work and is not a good idea anyway. The creation of the dataset was supported by the Joint Research Centre (JRC) and the DG for Regional Development (DG REGIO) of the European Commission, together with the international partnership GEO Human Planet Initiative. Lots of very clever individuals contributed to the project, and you can find out more about the team on the GHSL people pages.
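As a rough sanity check on those numbers, the sketch below multiplies out the raster dimensions and estimates the uncompressed size. The 4 bytes per cell is purely an assumption on my part (a 32-bit value per cell), not something taken from the dataset documentation, though it does land in the same region as the 33GB figure.

```python
columns = 141_969
rows = 60_829

# Total number of cells in the full global raster.
cells = columns * rows

# ASSUMPTION: each cell is stored as a 32-bit (4 byte) value.
BYTES_PER_CELL = 4
uncompressed_bytes = cells * BYTES_PER_CELL

size_gb = uncompressed_bytes / 1e9       # decimal gigabytes
size_gib = uncompressed_bytes / 2**30    # binary gibibytes

print(f"Cells: {cells:,}")
print(f"Estimated uncompressed size: {size_gb:.1f} GB ({size_gib:.1f} GiB)")
```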