Wednesday 17 August 2016

Research with QGIS, R and speaking to people

I recently led a piece of research for the Joseph Rowntree Foundation on disconnected neighbourhoods - basically, it looked at the UK's most deprived areas and how connected or disconnected they are to their wider cities in relation to jobs and housing. You can read the brief Findings or Full Report here. This post is just about the methods we used and some of the outputs. We used open source software QGIS and R for the analysis (led by Ruth Hamilton) and we also spoke to policymakers across the country (Rich Crisp and Ryan Powell).

You can see the full report here

We looked at those areas in England, Northern Ireland, Scotland and Wales that fell within the 20% most deprived on the national deprivation indices in each nation and then explored data relating to household moves and commuting (plus lots more). We updated and developed two area typologies to help us make sense of the data - and to see how things changed we produced riverplots (Sankey diagrams) in R (this was done by the brilliant Ruth Hamilton).

Created with the Riverplot package in R

I also did a little bit myself with open source software, including updating a 'divided cities' type graphic I produced in the past - looking at the spatial split between most and least deprived parts of 13 cities across the UK, as you can see below. See Sheffield in particular for a very stark divide.

Red = most deprived, blue = least deprived

Our colleagues Rich Crisp and Ryan Powell then spoke with more than 140 policymakers in cities across the UK - another nice 'open source' method. You can read more about this in Chapter 6 of the report but the bottom line is that if we want things to change for the better then we need to take a different approach to urban policy - a more inclusive approach.

As I said, there are two typologies, and we then combined these into a matrix in an attempt to understand area types a bit better and then suggest possible policy responses that might make sense. What each category means is explained in the report but in the figure below you can probably make sense of what 'Gentrifier' areas are and what those labelled 'Disconnected' are.

The shaded areas might be a good policy focus to begin with

The point of this blog post, however, was just to highlight how useful and effective open source software now is, and can be, in real-world research. Advocates will already know this but many more have yet to make the leap so hopefully this will provide just a little bit of inspiration or motivation to do so.

We produced hundreds of maps for the project (too many for the report) so you can probably find one for your area in the online folders. Two examples - one for each typology - are shown below.

Residential typology map of Birmingham

Travel to work typology map for Glasgow

For more on the methods used to develop the typology, see the Annexes in the Full report.

Monday 1 August 2016

How long is the coastline of Great Britain?

This is a bit of a long read, so if you really want to know the answer to the question in the title of this post, it's very simple: it depends upon how you measure it. Or, you could say that the coastline of the island of Great Britain is infinitely long. But this doesn't really help anyone who wants to walk or kayak or swim round this island, so I'll attempt to answer the question here. Take a look at the image below and you'll see that I've calculated the distance of the coastline round the island of Great Britain as 11,023 miles. 

Quite a lot of coastline for a small island

But hold on a minute, I also calculated it again and got an answer of 3,876 miles, as you can see below. What's going on here? Well, the first image is an extremely detailed digitised representation of the coastline of Great Britain and surrounding islands (bearing in mind 'detailed' is a relative concept). This first map is represented by 2,282,000 individual vertices which create the polygons you see in the image above. 

In the second map, only 0.1% of these vertices are retained, so the geographical features you see below are represented by 2,282 individual vertices. You can't see much difference between the two at the scale you view them at here but if you were trying to navigate your way into a harbour or sea loch on the west coast of Scotland, for example, it would make a big difference. Click the first image to enlarge it and then compare it to the next one and you will see some differences, but nothing too drastic.

The coastline length is a function of how you measure it
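If you want to see the effect for yourself without downloading any data, here is a minimal Python sketch. It is not what I ran for the maps: the 'island' is a made-up wiggly circle in arbitrary units, and the names `ring_length`, `keep_every` and `wiggly_island` are just ones I've invented for illustration. Keeping every n-th vertex is also a much cruder kind of generalisation than the one used for the maps, but it shows the same thing: the fewer vertices you keep, the shorter the measured coastline.

```python
import math


def ring_length(vertices):
    """Length of a closed ring: sum of straight-line distances
    between consecutive vertices, including last back to first."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def keep_every(vertices, step):
    """Crude generalisation (an assumption for this sketch):
    keep only every `step`-th vertex."""
    return vertices[::step]


def wiggly_island(n_vertices=20000, radius=100.0):
    """Hypothetical test shape: a circle with large, medium and
    small wiggles added to the radius. Units are arbitrary."""
    pts = []
    for i in range(n_vertices):
        t = 2 * math.pi * i / n_vertices
        r = (radius
             + 8.0 * math.sin(9 * t)
             + 2.0 * math.sin(140 * t)
             + 0.5 * math.sin(1500 * t))
        pts.append((r * math.cos(t), r * math.sin(t)))
    return pts


island = wiggly_island()
for step in (1, 10, 100, 1000):
    kept = keep_every(island, step)
    print(f"{len(kept):>6} vertices -> length {ring_length(kept):.1f}")
```

Because each straight segment in the reduced shape cuts the corner that the dropped vertices used to trace, the length can only stay the same or go down as vertices are removed.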

At this point, you might be thinking 'hasn't this got something to do with fractals and Benoit Mandelbrot?' - and you'd be right. He wrote a very famous paper in Science in 1967 on exactly this topic, entitled 'How long is the coast of Britain?'. The answer is that there really is no definitive answer - it's all about how you measure it. But let's say you want to swim or kayak around the coastline of Great Britain and nearby islands. How far would you have to travel? I tried to calculate this based on a 1km distance from the shoreline and concluded that it could be done by covering fewer than 2,000 miles - even though the coastline seems to be a lot longer. After all, you wouldn't want to go in and out of every little cove and estuary.

Be my guest

I created a little gif based on different ways of measuring the British coastline, starting off with a file that included 100% of the vertices from my original Ordnance Survey map layer (see notes below for more on this). I then created files with fewer and fewer vertices retained, all the way down to a nonsensical shape which retained hardly any of the original points. This is what I got - at 2 seconds per frame (note '% of vertices retained' figure in each image):

Coastline length at different measurement scales

It's a bit difficult to see the difference between some of these images at this scale, so I also zoomed in to the west coast of Scotland to produce another little animation. This time, you can really see more of the difference between the layers I produced. The figures on the graphics indicate what percentage of the original vertices were retained in each case. Below this, I have also provided a still image with different versions of the coast overlaid on top of each other, just to demonstrate the impact of reducing the number of vertices on the representation of the coastline, and hence its length.

This shows Morar, Mallaig and Loch Nevis 

Each line represents a different level of generalisation
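As the notes at the end explain, the generalisation was done with the Visvalingam algorithm in mapshaper. For anyone curious about the basic idea, here is a rough Python sketch of it: repeatedly drop the vertex whose triangle with its two neighbours has the smallest area, until only the number you want remain. This is my simplified illustration and not mapshaper's code; the function names are made up, and it recalculates every area on each pass, which is fine for a toy example but far too slow for millions of vertices (real implementations use a priority queue and only update the neighbours).

```python
def triangle_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def visvalingam_ring(vertices, keep):
    """Simplify a closed ring by removing, one at a time, the vertex
    with the smallest 'effective area' until `keep` vertices remain.
    Never goes below 3 vertices, so the result is still a polygon."""
    pts = list(vertices)
    keep = max(keep, 3)
    while len(pts) > keep:
        n = len(pts)
        # Effective area of each vertex with its two neighbours;
        # index -1 and the modulo make the ring wrap around.
        areas = [triangle_area(pts[i - 1], pts[i], pts[(i + 1) % n])
                 for i in range(n)]
        pts.pop(areas.index(min(areas)))
    return pts


# Hypothetical example: a 10 x 10 square with a tiny bump on one side.
bumpy_square = [(0, 0), (4, 0), (5, 0.2), (6, 0),
                (10, 0), (10, 10), (0, 10)]
print(visvalingam_ring(bumpy_square, keep=4))
```

In the example the little bump goes first because it encloses the least area, which is exactly what happens to small coves and inlets when a coastline is generalised.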

I then decided to take a smaller island and extract the individual vertices (also known as nodes) that make up the shapes you see in the maps above. For this, I chose the Isle of Skye because it's one of the biggest British islands and the coast is highly irregular and indented. Using the version of the original shapefile where I retained 1% of the original vertices, Skye is represented by 772 individual nodes joined together to make a single polygon, as you can see below.

This produces a pretty good approximation of the coastline of Skye for most purposes. At this resolution, the coastline of Skye comes in at 330 miles (530km), compared to 456 miles (733km) at the original resolution. But of course we need to remember that if we had digitised around every single rock around the coastline the length would be nearly infinite. If you measured the coastline with a matchstick, for example, you'd get an extremely high value (and a sore back).

Skye represented with a polygon comprised of 772 vertices

Here's what this looks like when you show them one by one, in an animated gif - just to give you an idea of how it is plotted spatially. This is shown at 15ms per frame, so the dot fairly zooms around the coastline. All of this also gives you a little insight into how a GIS deals with geometry and what goes into the shapes that you see on your screen. It also helps explain why the very detailed, highly accurate spatial data files we can download from Ordnance Survey aren't always the most appropriate ones to use in small scale mapping. Or, maybe I just wanted to make another geogif, but either way I think I learned something.

A dot going round the Isle of Skye at 99,000 mph (forever)

So, how long is the coastline of Great Britain? Well, if you want to swim or kayak around all islands then you should think about training for a distance of around 2,000 miles and if you want to walk the coastline of Great Britain only then it's most likely going to be a bit more, or maybe a bit less - but that depends upon how you plan your route. Despite all the uncertainty, however, I think we can all agree that you'll need to go more than 1,024 miles.

Yes, this is Britain (kind of)

Last of all, I also did a little gif showing the 174 vertices of Great Britain when the file is massively reduced - so I'll end with this.

Another one, just for fun

Notes: I used the OS OpenData Boundary Line product for the coastline. This was a polyline file so I converted it to a polygon and then generalised it several times using the Visvalingam algorithm in mapshaper. Contains OS data © Crown copyright and database right 2015. You'll see if you search online that my measurements are close to those of others - so I'm at least as right or wrong as some people. If you're interested, you might want to look up the coastline paradox as well and, of course, Lewis Fry Richardson. Other big British islands? After the island of Great Britain, it's Lewis and Harris at 741 miles of coastline (1,193km), the mainland of Shetland at 692 miles (1,113km), Skye at 456 miles (733km) and North Uist at 334 miles (537km). Remember that this refers to coastline length and not land area.
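One more for anyone who wants to play with the coastline paradox itself. This is a small Python sketch using the Koch curve, a standard textbook fractal rather than anything from the Ordnance Survey data, and the function names are my own inventions. Each round of refinement replaces every straight segment with four segments a third as long, so the total length grows by a factor of 4/3 every time, without limit. That is the sense in which a coastline can be called 'infinitely long'.

```python
import math


def koch_refine(points):
    """Replace every segment with the four-segment Koch motif."""
    cos60, sin60 = 0.5, math.sqrt(3) / 2
    out = [points[0]]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        dx, dy = (x2 - x1) / 3.0, (y2 - y1) / 3.0
        a = (x1 + dx, y1 + dy)            # one third of the way along
        b = (x1 + 2 * dx, y1 + 2 * dy)    # two thirds of the way along
        # Peak of the bump: the middle third rotated by 60 degrees about a.
        peak = (a[0] + dx * cos60 - dy * sin60,
                a[1] + dx * sin60 + dy * cos60)
        out.extend([a, peak, b, (x2, y2)])
    return out


def line_length(points):
    """Total length of an open line through the points, in order."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


curve = [(0.0, 0.0), (1.0, 0.0)]
for level in range(6):
    print(f"level {level}: {len(curve)} vertices, "
          f"length {line_length(curve):.4f}")
    curve = koch_refine(curve)
```

A real coastline is not a neat mathematical curve like this, but the pattern is the same as in the maps above: the more detail you capture, the longer the answer gets.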