What do you do when you scrape a data set, geocode it so you can plot it on a map, and find that the geocoding isn’t quite as good as you’d hoped?
I’d promised myself that I wasn’t going to keep on posting “yet another way of scraping data into Google spreadsheets then geocoding it with a Yahoo pipe” posts along the lines of Data Scraping Wikipedia with Google Spreadsheets, but a post on Google Maps Mania – Water Quality Google Map – sent me off on a train of thought that has sort of paid dividends…
So first up, the post got me thinking about whether there are maps of Blue Flag beaches in the UK, and where I could find them. A link on the UK page of blueflag.org lists them: UK Blue Flag beaches (though there is a key in the URL, so I’m not sure how persistent that URL is).
Pull it into a Google spreadsheet using:
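For anyone who would rather do the scrape outside the spreadsheet, here’s a rough Python sketch of the same idea: pull the rows out of an HTML table so they can be saved as CSV. It is not the spreadsheet route described in this post. The names `TableScraper` and `scrape_table` are made up for illustration, the sample markup is hypothetical (I’m not assuming anything about how the Blue Flag page is actually laid out), and it expects reasonably well-formed, non-nested tables.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the cell text of every table in an HTML page, row by row."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html, index=0):
    """Return the rows of the index-th table found in the HTML text."""
    scraper = TableScraper()
    scraper.feed(html)
    return scraper.tables[index]


# Hypothetical sample markup, standing in for the real page.
SAMPLE_HTML = """
<table>
  <tr><th>Beach</th><th>Region</th></tr>
  <tr><td>Example Sands</td><td>Example County</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
```

Each row comes back as a list of cell strings, so writing it out with Python’s `csv.writer` gives much the same thing as the published CSV in the next step.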
Publish the CSV:
Geocode the beaches using a Yahoo pipe – rather than using the Pipe location API, I’m making a call to the Yahoo GeoPlanet/Where API – I’ll post about that another day…
Grab the KML from the pipe:
Looking at the map, some of the markers appear to be mislocated, like the ones that turn up in the middle of the country, a long way from any coast. So it might be handy to treat the scraped data as a buggy, downloaded data set that needs cleaning. (This means that we are no longer going to treat the data as “live” data.)
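Before cleaning by hand, a quick sanity check on the KML can at least catch the grossest geocoding failures. Here’s a minimal Python sketch that reads the point placemarks out of a KML document and lists any that fall outside a rough UK bounding box. The bounding box values are my own approximation, the function names and sample placemarks are hypothetical, and the check won’t spot a beach that has been dropped inland, because an inland point is still inside the box. Those ones still need eyeballing on the map.

```python
import xml.etree.ElementTree as ET

# Rough UK bounding box in decimal degrees (an assumption, not a survey).
UK_BOUNDS = {"min_lat": 49.5, "max_lat": 61.0, "min_lon": -8.7, "max_lon": 2.0}


def local_name(tag):
    """Strip any XML namespace so different KML versions parse the same way."""
    return tag.rsplit("}", 1)[-1]


def read_placemarks(kml_text):
    """Return a (name, lat, lon) tuple for each point Placemark in the KML."""
    placemarks = []
    for element in ET.fromstring(kml_text).iter():
        if local_name(element.tag) != "Placemark":
            continue
        name, coords = "", None
        for child in element.iter():
            if local_name(child.tag) == "name" and child.text:
                name = child.text.strip()
            elif local_name(child.tag) == "coordinates" and child.text:
                coords = child.text.strip()
        if coords:
            # KML writes each coordinate as lon,lat[,alt]; take the first one.
            lon, lat = (float(v) for v in coords.split()[0].split(",")[:2])
            placemarks.append((name, lat, lon))
    return placemarks


def outside_bounds(placemarks, bounds=UK_BOUNDS):
    """Return the placemarks whose position falls outside the bounding box."""
    return [
        (name, lat, lon)
        for name, lat, lon in placemarks
        if not (bounds["min_lat"] <= lat <= bounds["max_lat"]
                and bounds["min_lon"] <= lon <= bounds["max_lon"])
    ]


# Hypothetical sample data: one plausible point, one failed geocode at 0,0.
SAMPLE_KML = """
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example Beach A</name>
      <Point><coordinates>-1.3,50.7,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Example Beach B</name>
      <Point><coordinates>0,0,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""

suspects = outside_bounds(read_placemarks(SAMPLE_KML))
```

Anything that ends up in `suspects` is a candidate for relocating by hand once the data is in a map editor.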
And here’s where the next step comes in… Google MyMaps lets you seed a map by importing a KML file:
The import can be from a desktop file, or a URL:
Import the KML from the Yahoo pipe, and we now have the data set in the Google MyMap.
So the data set in the map is now decoupled from the pipe, the spreadsheet and the original Blue Flag website. It exists as a geo data set within Google MyMaps. Which means that I can edit the markers, and relocate the ones that are in the wrong place:
And before the post-hegemonic tirade comes in (;-), here’s an attempt at capturing the source of the data on the Google MyMap.
So, to sum up – Google MyMaps can be used to import an approximately geocoded data set, tidy it up and republish it.
PS don’t forget you can also use Google Maps (i.e. MyMaps) for geoblogging