Wrangling Data With “Free” Tools – LASI13 Workshop Round-Up
I’m fortunate enough to be visiting LASI13, the Learning Analytics Summer Institute, at Stanford this week, and got to lead a workshop session yesterday on tools for tinkering and playing with data.
The presentation I prepped can be found on Slideshare – LASI13 datawrangling tools – though as ever I didn’t get through all the slides, and, as ever again, went slightly off-piste at various points. (The session was a 2hr 15 session, split 1h30 and 45 mins; I reckon the whole slidedeck would be a 4hr session; as it was, we got as far as grabbing data out of Facebook and into OpenRefine, with a v brief tease about starting to analyse the data in Gephi.)
I mentioned several tutorial posts and resource pages in the session – here are a few links to some of them:
- If search limits for use in Google searches are new to you, I like site: for searching sites or domains (eg site:open.ac.uk or site:edu); filetype: for searching by document type (eg filetype:xls or filetype:pdf); intitle: for limiting by document titles; and inurl: for limiting by terms that appear in a URL.
- Examples of charts built around an “accession axis” – @mediaczar’s How should Page Admins deal with Flame Wars?, an example of Visualising Activity Around a Twitter Hashtag or Search Term Using R and What Happened Then? Using Approximated Twitter Follower Accession to Identify Political Events.
- For an intro to using Google Spreadsheets as a database, see Asking Questions of Data – Garment Factories Data Expedition (and don’t forget the =importHtml and =importData spreadsheet formulae for scraping data into a spreadsheet from the web via a URL).
- If you need an intro to Google Fusion Tables, here’s the example I mentioned: Using Google Fusion Tables for a Quick Look at GCSE/A’Level Certificate Awards Market Share by Examination Board
- If you want to follow through on OpenRefine/LODRefine, there’s lots more to know and I’m still working through tutorials to cover some of that. Grabbing Facebook friends likes using OpenRefine varies slightly from the workshop recipe; I’ll post a new version over the next week or two. For now, if you want to parse the JSON data, the magic phrase is forEach(value.parseJson()['data'],v,[v.category,v.name,v.id].join('::')).join('||'). You then need to Edit cells – split multi-valued cells (by ||) then Edit column – split into several columns (using :: as the separator). I’ve posted a whole host of OpenRefine tutorials using the OpenRefine category on this blog.
- Gephi tutorials – “Drug Deal” Network Analysis with Gephi (Tutorial) (a reworking of a tutorial I found elsewhere on the web…); there’s also a multi-part tutorial for looking at Facebook friends networks starting here: http://blog.ouseful.info/2010/04/16/getting-started-with-gephi-network-visualisation-app-my-facebook-network-part-i/
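If it helps to see what the OpenRefine JSON-parsing phrase in the list above is doing, here is a rough Python sketch of the same steps: pull the records out of the 'data' list, join each record's category, name and id with ::, join the records with ||, then split back out into rows and columns. The function names and the sample data are made up for illustration, and the sketch assumes every record carries category, name and id as strings.

```python
import json

def flatten_likes(raw_json):
    """Roughly mirror the GREL phrase: fields joined by '::', records joined by '||'."""
    records = json.loads(raw_json)["data"]
    return "||".join(
        "::".join([rec["category"], rec["name"], rec["id"]]) for rec in records
    )

def split_cell(cell):
    """Mirror the two follow-up steps: split on '||' into rows, then on '::' into columns."""
    return [row.split("::") for row in cell.split("||")]

# Made-up sample shaped like the JSON the phrase expects (not real Facebook data)
sample = json.dumps({"data": [
    {"category": "Musician/band", "name": "Example Band", "id": "123"},
    {"category": "Book", "name": "Example Book", "id": "456"},
]})

cell = flatten_likes(sample)
rows = split_cell(cell)
print(cell)
print(rows)
```

The two-stage join and split is only there because a single OpenRefine cell has to hold everything before it gets fanned out; in Python you could of course go straight from the parsed records to rows.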
There’s a bundle of my recent tutorial style posts on data related activities on the School Of Data blog – Recently Elsewhere: Over On the School of Data Blog.
If anyone still here at LASI would like to chat further about data related skills development, let’s grab a table… similarly if you’d like to learn more about the School of Data data expedition model as a possible learning or training exercise. Finally, if you’re looking for folk to run data skills workshops, let’s talk;-)
PS if I’ve missed any links you think should be here, let me know and I’ll add them…