OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Mapping Primary Care Trust (PCT) Data, Part 1

The launch or official opening or whatever it was of the Open Data Institute this week provided another chance to grab a snapshot of notable folk in the community, as for example demonstrated by people commonly followed by users of the #ODIlaunch hashtag on Twitter. The PR campaign also resulted in the appearance of some open data related use cases, such as a report in the Economist about an analysis by MastodonC and Prescribing Analytics mapping prescription charges (R code available), with a view to highlighting where prescriptions for branded, as opposed to the recommended generic, drugs are being issued at wasteful expense to the NHS. (See Exploring GP Practice Level Prescribing Data for some of my entry level doodlings with prescription data.)

Quite by chance, I’ve been looking at some other health data recently (Quick Shiny Demo – Exploring NHS Winter Sit Rep Data), which has been a real bundle of laughs. Across a range of health related datasets, data seems to be published at a variety of aggregation levels – individual practices and hospitals, Primary Care Trusts (PCTs), Strategic Health Authorities (SHAs) and the new Clinical Commissioning Groups (CCGs). Some of these map on to geographical regions, which can then be coloured according to a particular measure value associated with that area.

I’ve previously experimented with rendering shapefiles and choropleth maps (Amateur Mapmaking: Getting Started With Shapefiles), so I know R provides one possible environment for generating these maps. With that in mind, I thought I’d try to pull together a recipe or two for supporting the creation of thematic maps based on health related geographical regions.

A quick trawl for PCT shapefiles turned up nothing useful. @jenit suggested @mastodonc, and @paulbradshaw pointed me to a dataset on Google Fusion Tables, discovered through the Fusion Tables search engine, that included PCT geometry data. So no shapefiles, but there is exportable KML data from Fusion Tables.

At this point I should have followed Paul Bradshaw’s advice, and just uploaded my own data (I was going to start out with mapping per capita uptake of dental services by PCT) to Fusion Tables, merging with the other data set, and generating my thematic maps that way.

But that wasn’t quite the point, which was actually an exercise in pulling together an R based recipe for generating these maps…

Anyway, I’ve made a start, and here’s the code I have to date:

##Example KML: https://dl.dropbox.com/u/1156404/nhs_pct.kml
##Example data: https://dl.dropbox.com/u/1156404/nhs_dent_stat_pct.csv

#Only install rgdal if it isn't already available
if (!requireNamespace("rgdal", quietly=TRUE)) install.packages("rgdal")
library(rgdal)
library(ggplot2)

#The KML data downloaded from Google Fusion Tables
fn='nhs_pct.kml'

#Look up the list of layers
ogrListLayers(fn)

#The KML file was originally grabbed from Google Fusion Tables
#There's only one layer...but we still need to identify it
kml=readOGR(fn,layer='Fusiontables folder')

#This seems to work for plotting boundaries:
plot(kml)

#And this:
kk=fortify(kml)
ggplot(kk, aes(x=long, y=lat,group=group))+ geom_polygon()

#Add some data into the mix
#I had to grab a specific sheet from the original spreadsheet and then tidy the data a little...
nhs <- read.csv("nhs_dent_stat_pct.csv")

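#merge() re-sorts (and can drop) rows, which breaks the row-to-polygon correspondence,
#so use match() to pull in the data rows in polygon order instead
kml@data=data.frame(kml@data, nhs[match(kml@data$Name, nhs$PCT.ONS.CODE),])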

#I think I can plot against this data using plot()?
#(gray() wants levels between 0 and 1, so this assumes the column holds percentages)
plot(kml,col=gray(kml@data$A.30.Sep.2012/100))
#But is that actually doing what I think it's doing?!
#And if so, how can I experiment using other colour palettes?

#But the real question is: HOW DO I DO COLOUR PLOTS USING ggplot?
ggplot(kk, aes(x=long, y=lat,group=group)) #+ ????
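Before trusting any plot built on the joined data, it may be worth checking that the codes used to join the two tables actually line up. Here's a minimal sketch of that check, using made-up code vectors (`kml_codes` and `nhs_codes` are hypothetical stand-ins for `kml@data$Name` and `nhs$PCT.ONS.CODE`, and the values are invented for illustration):

```r
#Made-up stand-ins for kml@data$Name and nhs$PCT.ONS.CODE
kml_codes <- c("5A3", "5A4", "5A5", "5A7")
nhs_codes <- c("5A3", "5A5", "5A7", "5A8")

#Codes that have a boundary but no data row
no_data <- setdiff(kml_codes, nhs_codes)

#Codes that have a data row but no boundary
no_boundary <- setdiff(nhs_codes, kml_codes)

print(no_data)
print(no_boundary)
```

If either vector comes back non-empty, some areas will be left uncoloured or some data rows will be silently ignored.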

Here’s what an example of the raw plot looks like:

plot_pct

And the greyscale plot, using one of the dental services uptake columns:

thematicPlot_pct
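On the question of other colour palettes, one approach that should work with the base plot() route is to bin the values and then index into a vector of colours. This is only a sketch under stated assumptions: the `vals` vector is made-up data standing in for a column such as `kml@data$A.30.Sep.2012`, and the five equal-width bins and the yellow-to-red ramp are arbitrary choices.

```r
#Made-up stand-in for a column such as kml@data$A.30.Sep.2012
vals <- c(42.1, 55.3, 61.8, 48.9, 70.2, NA)

#Build a five colour ramp running from pale yellow to dark red
pal <- colorRampPalette(c("lightyellow", "darkred"))(5)

#Bin the values into five equal-width classes (bin numbers 1 to 5)
bins <- cut(vals, breaks=5, labels=FALSE)

#Look up a colour for each value; missing values are shown in grey
cols <- ifelse(is.na(bins), "grey80", pal[bins])

#With the real data, the colours would then be passed as: plot(kml, col=cols)
print(cols)
```

Because the colours are looked up by position, this only makes sense if the rows of the data are in the same order as the polygons.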

Here’s the base ggplot() view:

ggplot_pctMap

However, I don’t yet know how to plot the data into the different areas. (Oh – might this help? CRAN Task View: Analysis of Spatial Data.)
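For what it's worth, the usual ggplot2 pattern seems to be to join the attribute data onto the fortified outline points and then map the measure to the fill aesthetic. The following is a minimal sketch using toy data rather than the PCT boundaries: `kk` here is a hand-built stand-in for the output of fortify(), and `attrs` and its `uptake` column are hypothetical. I believe the id values produced by fortify() correspond to the row names of kml@data, but that's an assumption worth checking against the real object.

```r
library(ggplot2)

#Toy stand-in for fortify(kml): two unit squares, identified by id
kk <- data.frame(
  long  = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  order = 1:8,
  group = rep(c("0.1", "1.1"), each=4),
  id    = rep(c("0", "1"), each=4)
)

#Toy stand-in for kml@data, with an id column to join on
attrs <- data.frame(id=c("0", "1"), uptake=c(45, 60))

#Join the attributes onto the outline points, then restore the drawing order
kk2 <- merge(kk, attrs, by="id")
kk2 <- kk2[order(kk2$order), ]

#Mapping fill to the measure colours each polygon by its value
g <- ggplot(kk2, aes(x=long, y=lat, group=group, fill=uptake)) +
  geom_polygon(colour="white") +
  coord_equal()
print(g)
```

Re-sorting on the order column matters because merge() can shuffle the rows, and geom_polygon() draws the points in the order it finds them.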

If you know how to do the colouring, or ggplotting, please leave a comment, or alternatively, chip in an answer to a related question I posted on StackOverflow: Plotting Thematic Maps from KML Data Using ggplot2

Thanks:-)

PS The recent Chief Medical Officer’s Report makes widespread use of a whole range of graphical devices and charts, including cartograms:

CMO cartogram

Is there R support for cartograms yet, I wonder?! (Hmmm… maybe?)

PPS on the public facing national statistics front, I spotted this job ad yesterday – Head of Rich Content Development, ONS:

The postholder is responsible for inspiring and leading development of innovative rich content outputs for the ONS website and other channels, which anticipate and meet user needs and expectations, including those of the Citizen User. The role holder has an important part to play in helping ONS to realise its vision “for official statistics to achieve greater impact on key decisions affecting the UK and to encourage broader use across the country”.

Key Responsibilities:

1. Inspires, builds, leads and develops a multi-disciplinary team of designers, developers, data analysts and communications experts to produce innovative new outputs for the ONS website and other channels.
2. Keeps abreast of emerging trends and identifies new opportunities for the use of rich web content with ONS outputs.
3. Identifies new opportunities, proposes new directions and developments and gains buy in and commitment to these from Senior Executives and colleagues in other ONS business areas.
4. Works closely with business areas to identify, assess and commission new rich-content projects.
5. Provides, vision, guidance and editorial approval for new projects based on a continual understanding of user needs and expectations.
6. Develops and manages an ongoing portfolio of innovative content, maximising impact and value for money.
7. Builds effective partnerships with media to increase outreach and engagement with ONS content.
8. Establishes best practice in creation of rich content for the web and other channels, and works to improve practice and capability throughout ONS.

Interesting…

Written by Tony Hirst

December 7, 2012 at 2:46 pm

Posted in Data, Rstats

4 Responses

  1. The lead for the PCT shapefile/KML was great… but Google has it sadly wrong – Gateshead isn’t on the southwest peninsula and Hampshire isn’t in North London… still, good start!

    Gary Warner

    December 9, 2012 at 4:10 pm

    • Hi Gary – sigh… I haven’t yet looked (even generally, let alone closely) at how PCTs map to shapefiles in the KML feed. I was still at 101 in trying to render the things (there’s an answer to the Stack Overflow question that I still need to try out.) So some of the mappings are wrong, are they? Buggrit :-( That was one reason I wanted an “approved” KML/shapefile from a .gov.uk or .nhs.uk site…

      If you post, or find, a correct version anywhere, please let me know:-)

      Tony Hirst

      December 9, 2012 at 4:22 pm

  2. http://edina.ac.uk/ukborders/description/easyd_datasets.shtml

    (appears to have a PCT shapefile (post-2002 changes)) – I haven’t logged in and tried it though

    Pete Mitton

    December 11, 2012 at 1:59 pm

  3. Thank you. I’m currently struggling with learning R and extracting something meaningful from the GP prescribing data at the same time. This post is very helpful. I don’t have access to EDINA, but fortunately, some kind soul requested them from the ONS and posted them as open data. I found them here: http://www.sharegeo.ac.uk/handle/10672/33.

