Following on from my last blog about open data from the Ordnance Survey, I've been thinking about what we can use all this free stuff for. So how about trying to estimate coastal flood risk for all of England, Wales and Scotland?
As any fule kno, you need highly accurate elevation data, locations and types of residential and non-residential properties, effective flood propagation models and information on location and type of flood defences, as well as information on flood water levels from rivers and the sea. National agencies like the Environment Agency and SEPA spend a lot of money gathering this kind of data, and use it to estimate risk at the national scale.
Most of this data isn't available free - but how far can we get with the free data that is available?
Here's a list of free stuff we can use for estimating flood risk in Great Britain (I can't seem to find equivalent open data for Northern Ireland):
| Resource | Notes |
| --- | --- |
| Quantum GIS | Excellent open source GIS software. |
| Spatialite | Open source geodatabase for analysis of big data sets. |
| Postcode Open | Point data for postcodes - gives a point for each unit postcode, but not boundary polygons. |
| Landform Panorama | Elevation on a 50m grid. Accuracy is probably around 2-3m. |
| Estimates of Extreme Sea Conditions - Spatial Analyses for the UK Coast | Estimates of extreme water levels - doesn't actually cover NI despite the title. |
| The cost of the summer 2007 floods | Estimates of the damages per property from the 2007 floods. |
| Flooding in England - a national assessment of flood risk | The "official" estimate of national flood risk from the Environment Agency. |
| Flooding in Wales - a national assessment of flood risk | The "official" estimate of national flood risk from the Environment Agency, also available in Welsh. |
| National Flood Risk Assessment for Scotland | Some more "official" figures for Scotland from SEPA. |
(I've used "official" in quotes because I don't want to place more trust in these figures just because they come from the appropriate government agency.)
And here's the recipe:
- Sample elevation from the Panorama data at each postcode point in QGIS - there's a nifty plugin to do this called "Point Sampling Tool".
- Load the postcode data into Spatialite. Make life easier by deleting every postcode more than 20m above sea level - we'll assume these will not be at risk from coastal flooding.
- Load the Estimates of Extreme Sea Conditions data into Spatialite - we get a series of points round the GB coastline, each with water levels for a range of extreme event probabilities.
- Use Spatialite to join each postcode to the nearest sea conditions estimate point.
- We now have a set of extreme water levels, and a ground elevation for each postcode. Use these to estimate the probability of flooding, using a log-probability model.
- Count the number of properties in appropriate probability bands, by assuming there are roughly 16 properties per unit postcode (the UK has 1.7 million unit postcodes, and 27 million addresses).
- Use the probability to estimate the annual average damages, assuming damages of £25k per property if it floods, taken from "The cost of the summer 2007 floods" report.
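For anyone who wants to play along, here's a rough Python sketch of the last four steps of the recipe. It is not the code behind the figures in this post: the coordinates, water levels and elevations are made-up toy values, and the function names are hypothetical. It also assumes that the "log-probability model" means interpolating log10 of the annual probability linearly against water level, with anything above the rarest tabulated level treated as safe.

```python
import math

PROPERTIES_PER_POSTCODE = 16          # 27 million addresses / 1.7 million unit postcodes
DAMAGE_PER_FLOODED_PROPERTY = 25_000  # pounds, from the 2007 floods report


def nearest_sea_point(easting, northing, sea_points):
    """Return the sea conditions point closest to a postcode (brute force)."""
    return min(
        sea_points,
        key=lambda pt: (pt["easting"] - easting) ** 2 + (pt["northing"] - northing) ** 2,
    )


def annual_flood_probability(ground_m, levels_by_return_period):
    """Interpolate log10(probability) linearly against water level.

    levels_by_return_period maps a return period in years to a level in metres.
    Annual probability is taken as 1 / return period (an assumption).
    """
    curve = sorted(
        (level, -math.log10(years)) for years, level in levels_by_return_period.items()
    )
    if ground_m <= curve[0][0]:
        return 10 ** curve[0][1]  # at or below the most frequent level
    if ground_m > curve[-1][0]:
        return 0.0  # assumption: above the rarest tabulated level is safe
    for (lo_level, lo_logp), (hi_level, hi_logp) in zip(curve, curve[1:]):
        if ground_m <= hi_level:
            frac = (ground_m - lo_level) / (hi_level - lo_level)
            return 10 ** (lo_logp + frac * (hi_logp - lo_logp))


def summarise(postcodes, sea_points, threshold=0.005):
    """Count properties above the probability threshold and sum annual average damages."""
    properties_at_risk = 0
    annual_damages = 0.0
    for pc in postcodes:
        sea = nearest_sea_point(pc["easting"], pc["northing"], sea_points)
        p = annual_flood_probability(pc["elevation_m"], sea["levels"])
        if p > threshold:
            properties_at_risk += PROPERTIES_PER_POSTCODE
        annual_damages += p * PROPERTIES_PER_POSTCODE * DAMAGE_PER_FLOODED_PROPERTY
    return properties_at_risk, annual_damages


# Toy data only: two coastal points and two postcodes.
sea_points = [
    {"easting": 0.0, "northing": 0.0, "levels": {1: 3.0, 10: 3.5, 100: 4.0, 1000: 4.5}},
    {"easting": 10000.0, "northing": 0.0, "levels": {1: 2.0, 10: 2.5, 100: 3.0, 1000: 3.5}},
]
postcodes = [
    {"easting": 1000.0, "northing": 500.0, "elevation_m": 3.75},
    {"easting": 9000.0, "northing": 0.0, "elevation_m": 5.0},
]

properties_at_risk, annual_damages = summarise(postcodes, sea_points)
print(properties_at_risk, round(annual_damages))
```

In the toy data the first postcode sits halfway between the 1 in 10 and 1 in 100 levels, so it lands at a probability of about 0.03. The second is above the rarest level and contributes nothing. For 1.7 million postcodes the brute-force nearest-point search would be slow, which is where a spatial database earns its keep.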
| P > 0.005 (1 in 200) | Number of properties from this analysis | "Official" estimate |
| --- | --- | --- |
| England | 1.4 million | 1.2 million |
| Wales | 100 000 | 140 000 |
| Scotland | 90 000 | 26 000 |
If we look at the number of properties at risk, we may have a reasonably good answer (or at least one that roughly agrees with the "official" figures), even using data that's not well suited to flood modelling. But using this technique to estimate annual average damages (the damages from flooding measured over many years, averaged per year) gives around £30 billion for England and £2 billion each for Wales and Scotland. The "official" estimate is around £1 billion for England and £200 million for Wales (there are no figures for Scotland). So the method does much worse once we start to think about the cost of flooding rather than the number of properties affected.
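To put numbers on "roughly agrees", here's a small sketch that divides the figures from this analysis by the "official" ones. The values are the ones quoted above. The variable names are just for illustration.

```python
# Properties with annual flood probability > 0.005, from the table above.
analysis_properties = {"England": 1_400_000, "Wales": 100_000, "Scotland": 90_000}
official_properties = {"England": 1_200_000, "Wales": 140_000, "Scotland": 26_000}

# Annual average damages in pounds (no official figure for Scotland).
analysis_damages = {"England": 30_000_000_000, "Wales": 2_000_000_000}
official_damages = {"England": 1_000_000_000, "Wales": 200_000_000}

property_ratio = {
    region: analysis_properties[region] / official_properties[region]
    for region in analysis_properties
}
damage_ratio = {
    region: analysis_damages[region] / official_damages[region]
    for region in analysis_damages
}

for region, ratio in property_ratio.items():
    print(f"{region}: properties at {ratio:.2f}x the official figure")
for region, ratio in damage_ratio.items():
    print(f"{region}: damages at {ratio:.0f}x the official figure")
```

The property counts for England and Wales come out within about 30% of the "official" figures and Scotland at roughly three and a half times. The damages are 30 times too high for England and 10 times too high for Wales.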
[Figure: postcode locations in colour-coded flood risk bands.] A lot of central London has an estimated flood probability of >1 in 75. Fortunately we have the Thames Barrier.
There are big contributions to the risk from business premises. They can be "hot spots" of risk - a bus garage in Carlisle in 2005 and a caravan showroom in Tewkesbury in 2007 made significant contributions to damages. The estimates from the 2007 summer floods indicate that if non-residential damages were accounted for, this would roughly double the risk.
The error in elevation data is also too big for this kind of application. The difference in water levels between 1 in 10 000 and 1 in 1 (i.e. we see a water level greater than this on average once a year) is as low as 1m in some places, so an error of 2-3m equates to a big error in flood probability.
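A back-of-envelope sketch shows how bad this is. It assumes that log10 of the probability falls in a straight line with height, and that the same slope holds beyond the tabulated range, which is a simplification. The function name and defaults are hypothetical. The default values are the worst case quoted above: 1m between the 1 in 1 and 1 in 10 000 levels.

```python
import math


def probability_shift_factor(elevation_error_m, level_range_m=1.0,
                             frequent_return_period=1, rare_return_period=10_000):
    """Factor by which the estimated probability changes for a given elevation error.

    Assumes log10(probability) is linear in water level (an assumption).
    """
    decades_per_metre = (
        math.log10(rare_return_period / frequent_return_period) / level_range_m
    )
    return 10 ** (decades_per_metre * elevation_error_m)


for error_m in (0.25, 1.0, 2.0, 3.0):
    print(f"{error_m} m error -> probability off by a factor of "
          f"{probability_shift_factor(error_m):.3g}")
```

With four decades of probability packed into 1m, even a 25cm error moves the estimate by a factor of ten. A 2m error moves it by a factor of 10^8, which pushes a postcode right off one end of the scale or the other.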
But despite the holes, I quite like this result - close enough to the "official" estimate to be interesting, but not suspiciously so. I wonder what would happen if we could get access to flood defence and better elevation data? I also want to look at the social aspects of this - are poorer communities more or less at risk? There's a load of UK Census data available that I want to use to try to answer this question, but first I need a holiday...