<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>OUseful.Info, the blog... &#187; EDINA</title>
	<atom:link href="http://blog.ouseful.info/tag/edina/feed/?withoutcomments=1" rel="self" type="application/rss+xml" />
	<link>http://blog.ouseful.info</link>
	<description>Trying to find useful things to do with emerging technologies in open education</description>
	<lastBuildDate>Thu, 23 May 2013 14:40:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.ouseful.info' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>OUseful.Info, the blog... &#187; EDINA</title>
		<link>http://blog.ouseful.info</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.ouseful.info/osd.xml" title="OUseful.Info, the blog..." />
	<atom:link rel='hub' href='http://blog.ouseful.info/?pushpress=hub'/>
		<item>
		<title>Another Blooming Look at Gource and the Edina OpenURL Data</title>
		<link>http://blog.ouseful.info/2011/06/08/another-blooming-look-at-gource-and-the-edina-openurl-data/</link>
		<comments>http://blog.ouseful.info/2011/06/08/another-blooming-look-at-gource-and-the-edina-openurl-data/#comments</comments>
		<pubDate>Wed, 08 Jun 2011 09:02:06 +0000</pubDate>
		<dc:creator>Tony Hirst</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[Infoskills]]></category>
		<category><![CDATA[Insight]]></category>
		<category><![CDATA[Tinkering]]></category>
		<category><![CDATA[Visualisation]]></category>
		<category><![CDATA[EDINA]]></category>
		<category><![CDATA[gource]]></category>
		<category><![CDATA[jiscad]]></category>
		<category><![CDATA[openurl]]></category>

		<guid isPermaLink="false">http://blog.ouseful.info/?p=5631</guid>
		<description><![CDATA[Having done a first demo of how to use Gource to visualise activity around the EDINA OpenURL data (Visualising OpenURL Referrals Using Gource), I thought I&#8217;d try something a little more artistic, and use the colour features to try to pull out a bit more detail from the data [video]. What this one shows is [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Having done a first demo of how to use Gource to visualise activity around the EDINA OpenURL data (<a href="http://blog.ouseful.info/2011/06/07/visualising-openurl-referrals-using-gource/">Visualising OpenURL Referrals Using Gource</a>), I thought I&#8217;d try something a little more artistic, and use the colour features to try to pull out a bit more detail from the data [<a href="http://www.youtube.com/watch?v=lbsqMYmQWog">video</a>].</p>
<span class='embed-youtube' style='text-align:center; display: block;'><iframe class='youtube-player' type='text/html' width='700' height='424' src='http://www.youtube.com/embed/lbsqMYmQWog?version=3&#038;rel=1&#038;fs=1&#038;showsearch=0&#038;showinfo=1&#038;iv_load_policy=1&#038;wmode=transparent' frameborder='0'></iframe></span>
<p>What this one shows is how the <em>mendeley</em> referrals glow brightly green, which &#8211; if I&#8217;ve got my code right &#8211; suggests a lot of e-issn lookups are going on (the red nodes correspond to an issn lookup, blue to an isbn lookup and yellow/orange to an unknown lookup). The regularity of activity around particular nodes also shows how a lot of the activity is actually driven by a few dominant services, at least during the time period I sampled to generate this video.</p>
<p>So how was this visualisation created?</p>
<p>Firstly, I pulled out a few more data columns, specifically the <em>issn</em>, <em>eissn</em>, <em>isbn</em> and <em>genre</em> data. I then opted to set node colour according to whether the <em>issn</em> (red), <em>eissn</em> (green) or <em>isbn</em> (blue) columns were populated using a default reasoning approach (if all three were blank, I coloured the node yellow). I then experimented with colouring the actors (I think?) according to whether the <em>genre</em> was article-like, book-like or unknown (mapping these on to add, modify or delete actions), before dropping the size of the actors altogether in favour of just highlighting referrers and asset type (i.e. issn, e-issn, book or unknown).</p>
<p><tt>cut -f 1,2,3,4,27,28,29,32,40 L2_2011-04.csv  &gt; openurlgource.csv</tt></p>
<p>When running the Python script, I got a &#8220;NULL Byte&#8221; error that stopped the script working (something obviously snuck in via one of the newly added columns), so I googled around and turned up a little command line cleanup routine for the cut data file:</p>
<p><tt>tr &lt; openurlgource.csv -d '&#092;000' &gt; openurlgourcenonulls.csv</tt></p>
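<p>The same cleanup could also be done from within Python. What follows is only a minimal sketch of that idea, not the routine used for the video: the function names <em>strip_nulls</em> and <em>clean_file</em> are made up for illustration, and it assumes the file is small enough to read into memory in one go.</p>

```python
def strip_nulls(data: bytes) -> bytes:
    """Return a copy of data with every NUL (0x00) byte removed."""
    return data.replace(b"\x00", b"")


def clean_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, dropping NUL bytes along the way."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(strip_nulls(src.read()))


if __name__ == "__main__":
    # Try it on an in-memory sample rather than the real log file.
    sample = b"2011-04-01\t00:00:01\x00\tmendeley\n"
    print(strip_nulls(sample))
```

<p>Calling <em>clean_file('openurlgource.csv', 'openurlgourcenonulls.csv')</em> would then stand in for the <em>tr</em> step.</p>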
<p>Here&#8217;s the new Python script too that shows the handling of the colour fields:</p>
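<pre class="brush: python; title: ; notranslate">import csv
from time import mktime, strptime

# Command line pre-processing step to handle NULL characters
#tr &lt; openurlgource.csv -d '&#092;&#048;00' &gt; openurlgourcenonulls.csv
#alternatively?: sed 's/\x0/ /g' openurlgource.csv &gt; openurlgourcenonulls.csv

# Python 3 version; errors='replace' is an assumption to get past any odd bytes
fin = open('openurlgourcenonulls.csv', newline='', errors='replace')
fout = open('openurlgource.txt', 'w', newline='')

reader = csv.reader(fin, delimiter='\t')
writer = csv.writer(fout, delimiter='|')
headerline = next(reader)  # skip the header row

for row in reader:
	# Only keep rows that have a referrer (sid, the last of the cut columns)
	if row[8].strip() !='':
		t=int(mktime(strptime(row[0]+&quot; &quot;+row[1], &quot;%Y-%m-%d %H:%M:%S&quot;)))
		# Node colour: first populated of issn, eissn, isbn
		if row[4]!='':
			col='FF0000'
		elif row[5]!='':
			col='00FF00'
		elif row[6]!='':
			col='0000FF'
		else:
			col='666600'
		# Genre mapped on to the Gource add/modify/delete actions
		if row[7]=='article' or row[7]=='journal':
			typ='A'
		elif row[7]=='book' or row[7]=='bookitem':
			typ='M'
		else:
			typ='D'
		# Turn the sid into a path, e.g. EBSCO:CINAHL becomes EBSCO/CINAHL
		agent=row[8].rstrip(':').replace(':','/')
		# Gource custom log row: timestamp|user|action|path|colour
		writer.writerow([t,row[3],typ,agent,col])

fin.close()
fout.close()</pre>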
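<p>The two decisions buried in the loop (which colour, which action) can be pulled out into small functions so they can be checked on their own. This is a sketch only: the names <em>pick_colour</em> and <em>pick_action</em> are invented for illustration, and the colour codes and genre values are simply the ones used in the script above.</p>

```python
def pick_colour(issn: str, eissn: str, isbn: str) -> str:
    """First populated identifier wins: issn, then eissn, then isbn."""
    if issn != "":
        return "FF0000"  # red: issn present
    if eissn != "":
        return "00FF00"  # green: eissn present
    if isbn != "":
        return "0000FF"  # blue: isbn present
    return "666600"  # fallback used when all three are blank


def pick_action(genre: str) -> str:
    """Map the genre column on to the A/M/D codes written to the Gource log."""
    if genre in ("article", "journal"):
        return "A"
    if genre in ("book", "bookitem"):
        return "M"
    return "D"


if __name__ == "__main__":
    # An eissn-only journal article: expected to be green and an 'A' action.
    print(pick_colour("", "1234-5678", ""), pick_action("article"))
```

<p>Note that the order of the tests matters: a row with both an issn and an eissn comes out red, because the issn check runs first.</p>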
<p>The new gource command is:</p>
<p><tt>gource -s 1 --hide usernames --start-position 0.8 --stop-position 0.82 --user-scale 0.1 openurlgource.txt</tt></p>
<p>and the command to generate the video:</p>
<p><tt>gource -s 1 --hide usernames --start-position 0.8 --stop-position 0.82 --user-scale 0.1 -o - openurlgource.txt |  ffmpeg -y -b 3000K -r 60 -f image2pipe -vcodec ppm -i - -vcodec libx264 -vpre slow -threads 0 gource.mp4</tt></p>
<p>If you&#8217;ve been tempted to try Gource out yourself on some of your own data, please post a link in the comments below:-) (I wonder just how many different sorts of data we can force into the shape that Gource expects?!;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ouseful.info/2011/06/08/another-blooming-look-at-gource-and-the-edina-openurl-data/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/abbd9f90565ce9ae4d065d93a81d8c03?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Tony Hirst</media:title>
		</media:content>
	</item>
		<item>
		<title>Playing With Large (ish) CSV Files, and Using Them as a Database from the Command Line: EDINA OpenURL Logs</title>
		<link>http://blog.ouseful.info/2011/06/04/playing-with-large-ish-csv-files-and-using-them-as-a-database-edina-openurl-logs/</link>
		<comments>http://blog.ouseful.info/2011/06/04/playing-with-large-ish-csv-files-and-using-them-as-a-database-edina-openurl-logs/#comments</comments>
		<pubDate>Sat, 04 Jun 2011 22:23:58 +0000</pubDate>
		<dc:creator>Tony Hirst</dc:creator>
				<category><![CDATA[Infoskills]]></category>
		<category><![CDATA[Library]]></category>
		<category><![CDATA[awk]]></category>
		<category><![CDATA[EDINA]]></category>
		<category><![CDATA[openurl]]></category>
		<category><![CDATA[sed]]></category>

		<guid isPermaLink="false">http://blog.ouseful.info/?p=5598</guid>
		<description><![CDATA[You know those files that are too large for you to work with, or even open? Maybe they&#8217;re not&#8230;. Picking up on Postcards from a Text Processing Excursion where I started dabbling with Unix command line text processing tools (it sounds scarier than it is&#8230; err&#8230; maybe?!;-), I thought it would make sense to have [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>You know those files that are too large for you to work with, or even open? Maybe they&#8217;re not&#8230;.</p>
<p>Picking up on <a href="http://blog.ouseful.info/2011/06/03/postcards-from-a-text-processing-excursion/">Postcards from a Text Processing Excursion</a> where I started dabbling with Unix command line text processing tools (it sounds scarier than it is&#8230; err&#8230; maybe?!;-), I thought it would make sense to have a quick play with them in the context of some &#8220;real&#8221; data files.</p>
<p>The files I&#8217;ve picked are intended to be intimidating (maybe?) at first glance because of their size: in this post I&#8217;ll look at a set of <a href="http://openurl.ac.uk/doc/data/data.html">OpenURL activity data from Edina</a> (24.6MB download, unpacking to 76MB), and for a future post, I thought it might be interesting to see whether this approach would work with a dump of local council spending data from OpenlyLocal (73.2MB download, unzipping to 1,011.9MB).</p>
<p>To start with, let&#8217;s have a quick play with the OpenURL data, which you can download from here: <a href="http://openurl.ac.uk/doc/data/thedata.html">OpenURL activity data (April 2011)</a></p>
<p>What I intend to do in this post is track my own preliminary exploration of the file using what I learned in the &#8220;Postcards&#8221; post. I may also need to pick up a few new tricks along the way&#8230; One thing I think I want to look for as I start this exploration is an idea of how many referrals are coming in from particular institutions and particular sources&#8230;</p>
<p>Let&#8217;s start at the beginning though by seeing how many lines/rows there are in the file, which I downloaded as <em>L2_2011-04.csv</em>:</p>
<p><tt>wc -l L2_2011-04.csv</tt></p>
<p>I get the result <em>289,691</em>; older versions of Excel used to only support 65,536 rows per sheet, though I believe more recent versions (Excel 2007, and Excel 2010) can support over a million; Google Apps currently limits sheet sizes to up to 200,000 cells (max 256 columns), so even if the file was only one column wide, it would still be too big to upload into a single Google spreadsheet. Google Fusion Tables can accept CSV files up to 100MB, so that would work, if we could actually get the file to upload&#8230; Spanish accent characters seemed to break things when I tried&#8230; The workaround I found was to split the original file, then separately upload and resave the parts using Google Refine, before uploading the files to Google Fusion Tables (upload one to a new table, then import and append the other files into the same table).</p>
<p>..which is to say: the almost 300,000 rows in the downloaded CSV file are probably too many for many people to know what to do with, unless they know how to drive a database&#8230; which is why I thought it might be interesting to see how far we can get with just the Unix command line text processing tools.</p>
<p>To see what&#8217;s in the file, let&#8217;s look at the first few rows (we might also look to the <a href="http://openurl.ac.uk/doc/data/whatare.html">documentation</a>):</p>
<p><tt>head L2_2011-04.csv</tt></p>
<p>Column 40 looks interesting to me: <em>sid</em> (service ID); in the data, there&#8217;s a reference in there to <em>mendeley</em>, as well as some other providers I recognise (EBSCO, Elsevier and so on), so I <em>think</em> this refers to the source of the referral to the EDINA openurl resolver (@ostephens and @lucask suggested they thought so too). Also, @lucask suggested &#8220;OpenUrl from Endnote has ISIWOS as default SID too!&#8221;, so we may find that some sources either mask their true origin to hide low referral numbers (maybe very few people ever go from Endnote to the EDINA openurl resolver?), or to inflate other numbers (Endnote inflating apparent referrals from ISIWOS).</p>
<p>Rather than attack the rather large original file, let&#8217;s start by creating a smaller sample file with a couple of hundred rows that we can use as a test file for our text processing attempts:</p>
<p><tt>head -n 200 L2_2011-04.csv &gt; samp_L2_2011-04.csv</tt></p>
<p>Let&#8217;s pull out column 40, sort, and then look for unique entries in the sample file we created:</p>
<p><tt>cut -f 40 samp_L2_2011-04.csv | sort | uniq -c</tt></p>
<p>I get a response that starts:</p>
<p><tt>12<br />
1 CAS:MEDLINE<br />
1 EBSCO:Academic Search Premier<br />
7 EBSCO:Business Source Premier<br />
1 EBSCO:CINAHL<br />
...</tt></p>
<p>so in the sample file there were 12 blank entries, 1 from <em>CAS:MEDLINE</em>, 7 from <em>EBSCO:Business Source Premier</em> and so on, so this appears to work okay. Let&#8217;s try it on the big file (it may take a few seconds&#8230;) and save the result into a file (<em>uniqueSID.csv</em>):</p>
<p><tt>cut -f 40 L2_2011-04.csv | sort | uniq -c &gt; uniqueSID.csv</tt></p>
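<p>For comparison, the cut, sort and uniq pipeline can be approximated in a few lines of Python. This is a rough sketch rather than a drop-in replacement: <em>count_column</em> is a made-up name, the sample rows are invented, and rows with too few columns are simply counted as blank here.</p>

```python
from collections import Counter


def count_column(lines, column, delimiter="\t"):
    """Count the values found in a 1-based column of delimited text lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        # column is 1-based, as with cut -f; short rows count as blank here.
        value = fields[column - 1] if len(fields) >= column else ""
        counts[value] += 1
    return counts


if __name__ == "__main__":
    # Invented two-column sample; the real file would use column 40.
    sample = [
        "a\tEBSCO:CINAHL\n",
        "b\tOVID:medline\n",
        "c\tEBSCO:CINAHL\n",
        "d\t\n",
    ]
    for value, n in count_column(sample, 2).most_common():
        print(n, value)
```

<p>Passing an open file handle as <em>lines</em> reads the file a row at a time, so the whole log never needs to sit in memory at once.</p>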
<p>The counts will be listed in SID order rather than by size of count, so we can add a second <em>sort</em> to the pipeline to order the entries by the number of referrals. The column we want to sort on is column 1, so we set the <em>sort</em> <tt>-k</tt> key to <em>1</em>; and because <em>sort</em> sorts into increasing order by default, we reverse the order (<tt>-r</tt>) to get the most referenced entries at the top (the following is NOT RECOMMENDED&#8230; read on to see why&#8230;):</p>
<p><tt>cut -f 40 L2_2011-04.csv | sort | uniq -c | sort -k 1 -r &gt; uniqueSID.csv</tt></p>
<p>We can now view the <em>uniqueSID.csv</em> file using the <em>more</em> command (<em>more uniqueSID.csv</em>), or look at the top 5 rows using the <em>head</em> command:</p>
<p><tt>head -n 5 uniqueSID.csv</tt></p>
<p>Here&#8217;s what I get as the result (treat this with suspicion&#8230;):</p>
<p><tt>9181 OVID:medline<br />
9006 Elsevier:Scopus<br />
7929 EBSCO:CINAHL<br />
74740 <a href="http://www.isinet.com:WoK:UA" rel="nofollow">http://www.isinet.com:WoK:UA</a><br />
6720 EBSCO:jlh</tt></p>
<p>If we look through the file, we actually see:</p>
<p><tt>1817 OVID:embase<br />
1720 EBSCO:CINAHL with Full Text<br />
17119 mendeley.com:mendeley<br />
16885 mendeley.com/mendeley:<br />
1529 EBSCO:cmedm<br />
1505 OVID:ovftdb</tt></p>
<p><em>I actually was alerted to this oops when looking to see how many referrals were from <em>mendeley</em>, by using <tt>grep</tt> on the counts file (if <tt>grep</tt> complains about a &#8220;Binary file&#8221;, just use the <tt>-a</tt> switch&#8230;):</p>
<p><tt>grep mendeley uniqueSID.csv</tt></p>
<p><tt>17119 mendeley.com:mendeley<br />
16885 mendeley.com/mendeley:</tt></p>
<p>17119 beat the &#8220;top count&#8221; 9181 from OVID:medline &#8211; obviously I&#8217;d done something wrong!</em></p>
<p>Specifically, the sort had sorted <em>by character</em> <strong>not</strong> by numerical value&#8230; (17119 and 16885 are numerically greater than 1720, but 171 and 168 are less (in string sorting terms) than 172. The reasoning is the same as why we&#8217;d index aardman before aardvark).</p>
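<p>The same trap is easy to reproduce in Python, which makes for a quick way of seeing what is going on. The sketch below uses a handful of the counts quoted above, held as strings, and sorts them first as text and then by their integer value.</p>

```python
counts = ["9181", "9006", "7929", "74740", "6720", "17119", "1720"]

# Sorting the strings compares them character by character.
as_text = sorted(counts, reverse=True)

# Sorting on the integer value compares them as numbers.
as_numbers = sorted(counts, key=int, reverse=True)

print(as_text)
print(as_numbers)
```

<p>In the text sort, 1720 lands ahead of 17119 and 9181 ends up at the top; in the numeric sort, 74740 leads and 1720 drops to the bottom.</p>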
<p>To force <tt>sort</tt> to sort using numerical values, rather than string values, we need to use the <tt>-n</tt> switch (so now I know!):</p>
<p><tt>cut -f 40 L2_2011-04.csv | sort | uniq -c | sort -k 1 -r -n  &gt; uniqueSID.csv</tt></p>
<p>Here&#8217;s what we get now:</p>
<p><tt>74740 <a href="http://www.isinet.com:WoK:UA" rel="nofollow">http://www.isinet.com:WoK:UA</a><br />
34186<br />
20900 <a href="http://www.isinet.com:WoK:WOS" rel="nofollow">http://www.isinet.com:WoK:WOS</a><br />
17119 mendeley.com:mendeley<br />
16885 mendeley.com/mendeley:<br />
9181 OVID:medline<br />
9006 Elsevier:Scopus<br />
7929 EBSCO:CINAHL<br />
6720 EBSCO:jlh<br />
...</tt></p>
<p>To compare the referrals from the actual sources (e.g. the aggregated EBSCO sources, rather than EBSCO:CINAHL, EBSCO:jlh and so on), we can split on the &#8220;:&#8221; character, to create two columns from one: the first containing the bit before the &#8216;:&#8217;, the second column containing the bit after:</p>
<p><tt>sed s/:/'<em>ctrl-v&lt;TAB&gt;</em>'/ uniqueSID.csv | sort -k 2 &gt; uniquerootSID.csv</tt></p>
<p>(Some versions of sed may let you identify the tab character as \t; I had to explicitly put in a tab by using ctrl-V then tab.)</p>
<p>What this does is retain the number of lines, but sort the file so all the EBSCO referrals are next to each other, all the Elsevier referrals are next to each other, and so on.</p>
<p>Via an <a href="http://stackoverflow.com/questions/3934423/awk-conditional-sum-from-a-csv-file/3934863#3934863">answer on Stack Overflow</a>, I found this bit of voodoo that would then sum the contributions from the same root referrers:</p>
<p><tt>cat uniquerootSID.csv | awk '{a[$2]+=$1}END{for(i in a ) print i,a[i] }' | sort -k 2 -r -n &gt; uniquerootsumSID.csv</tt></p>
<p>Using data from the file <em>uniquerootSID.csv</em>, the <em>awk</em> command sets up an array (<em>a</em>) that has indices corresponding to the different sources (EBSCO, Elsevier, and so on). It then runs an accumulator that sums the contributions from each unique source. After processing all the rows (<em>END</em>), the routine then loops through all the unique sources in the <em>a</em> array, and emits the source and the total. The <em>sort</em> command then sorts the output by total for each source and puts the list into the file <em>uniquerootsumSID.csv</em>.</p>
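<p>The accumulator idea is perhaps easier to see spelled out in Python. The following is a sketch under some assumptions, not a line-for-line port of the sed and awk steps: <em>sum_by_root</em> is an invented name, it works directly on lines in the <em>count SID</em> shape produced by <tt>uniq -c</tt>, and it treats everything before the first &#8216;:&#8217; as the root (so a SID containing spaces but no colon is kept whole, which awk would not do).</p>

```python
from collections import defaultdict


def sum_by_root(count_lines):
    """Sum counts per root referrer from lines shaped like '  7929 EBSCO:CINAHL'."""
    totals = defaultdict(int)
    for line in count_lines:
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip empty lines
        count = int(parts[0])
        # A line holding only a count is the tally for blank SIDs.
        sid = parts[1] if len(parts) == 2 else ""
        root = sid.split(":", 1)[0]
        totals[root] += count
    # Largest total first, like the final sort -k 2 -r -n.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = [
        "   7929 EBSCO:CINAHL",
        "   6720 EBSCO:jlh",
        "   9181 OVID:medline",
        "  34186",
    ]
    for root, total in sum_by_root(sample):
        print(root, total)
```

<p>The dictionary plays the part of the awk array <em>a</em>: one slot per root referrer, incremented as each row goes by, then emitted in one go at the end.</p>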
<p>Here are the top 15:</p>
<p><tt><a href="http://www.isinet.com" rel="nofollow">http://www.isinet.com</a> 99453<br />
EBSCO 44870<br />
OVID 27545<br />
mendeley.com 17119<br />
mendeley.com/mendeley 16885<br />
Elsevier 9446<br />
CSA 6938<br />
EI 6180<br />
Ovid 4353<br />
wiley.com 3399<br />
jstor 2558<br />
mimas.ac.uk 2553<br />
summon.serialssolutions.com 2175<br />
Dialog 2070<br />
Refworks 1034</tt></p>
<p>If we add the two Mendeley referral counts that gives ~34,000 referrals. How much are the referrals from commercial databases costing, I wonder, by comparison? Of course, it may be that the distribution of referrals from different institutions is different. Some institutions may see all their traffic through EBSCO, or Ovid, or indeed Mendeley&#8230; If nothing else though, this report suggests that Mendeley is generating a fair amount of EDINA openurl traffic&#8230;</p>
<p>Let&#8217;s use the <em>cut</em> command again to see how much traffic is coming from each unique institution (not that I know how to decode these identifiers&#8230;); column 4 is the one we want (remember, we use the <em>uniq</em> command to count the occurrences of each identifier):</p>
<p><tt>cut -f <strong>4</strong> L2_2011-04.csv | sort | uniq -c | sort -k 1 -r -n  &gt; uniqueInstID.csv</tt></p>
<p>Here are the top 10 referrer institutions (columns are: no. of referrals, institution ID):</p>
<p><tt>41268 553329<br />
31999 592498<br />
31168 687369<br />
29442 117143<br />
24144 290257<br />
23645 502487<br />
18912 305037<br />
18450 570035<br />
11138 446861<br />
10318 400091</tt></p>
<p>How about column 5, the <em>routerRedirectIdentifier</em>:</p>
<p><tt>195499 athens<br />
39381 wayf<br />
29904 ukfed<br />
24766 no identifier<br />
 140 ip</tt></p>
<p>How about the publication year of requests (column 17):</p>
<p><tt>45867<br />
26400 2010<br />
16284 2009<br />
13425 2011<br />
13134 2008<br />
10731 2007<br />
8922 2006<br />
8088 2005<br />
7288 2004</tt></p>
<p>It seems to roughly follow year?!</p>
<p>How about unique journal title (column 15):</p>
<p><tt>258740<br />
 277 Journal of World Business<br />
 263 Journal of Financial Economics<br />
 263 Annual Review of Nuclear and Particle Science<br />
 252 Communications in Computer and Information Science<br />
 212 Journal of the Medical Association of Thailand Chotmaihet thangphaet<br />
 208 Scandinavian Journal of Medicine &amp; Science in Sports<br />
 204 Paleomammalia<br />
 194 Astronomy &amp; Astrophysics<br />
 193 American Family Physician</tt></p>
<p>How about books (column 29 gives ISBN):</p>
<p><tt>278817<br />
1695 9780470669303<br />
 750 9780470102497<br />
 151 0761901515<br />
 102 9781874400394</tt></p>
<p>And so it goes..</p>
<p>What&#8217;s maybe worth remembering is that I haven&#8217;t had to use any tools other than command line tools to start exploring this data, notwithstanding the fact that the source file may be too large to open in some everyday applications&#8230;</p>
<p>The quick investigation I was able to carry out on the EDINA openurl data also built directly on what I&#8217;d learned in doing the Postcards post (except for the voodoo awk script to sum similarly headed rows, and the sort switches to reverse the order of the sort, and force a numerical rather than string based sort). Also bear in mind that three days ago, I didn&#8217;t know how to do any of this&#8230;</p>
<p>&#8230;but what I do suspect is that it&#8217;s the sort of thing that Unix sys admins play around with all the time, e.g. in the context of log file hacking&#8230;</p>
<p>PS so what else can we do&#8230;? It strikes me that by using the date and timestamp, as well as the institutional ID and referrer ID, we can probably identify searches that are taking place: a) within a particular session (e.g. someone coming in from the same place within a short window of time, say 1-2 hours); b) maybe by the same person over several days (e.g. around about the same time on the same day of the week, from the same IDs and searching around a similar topic).</p>
]]></content:encoded>
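<p>As a very rough sketch of the session idea, the grouping might look something like the following in Python. Everything here is an assumption made for illustration: the function name <em>sessions</em>, the event shape <em>(timestamp, institution ID, referrer ID)</em>, and the two hour cut-off, which is just the upper end of the 1-2 hour window. It has not been run against the real data.</p>

```python
from datetime import datetime, timedelta
from itertools import groupby


def sessions(events, gap=timedelta(hours=2)):
    """Group (timestamp, institution, sid) events into sessions.

    A new session starts whenever the same institution/sid pair has been
    quiet for longer than gap.
    """
    result = []
    # Sort so that each institution/sid pair is contiguous and in time order.
    ordered = sorted(events, key=lambda e: (e[1], e[2], e[0]))
    for pair, group in groupby(ordered, key=lambda e: (e[1], e[2])):
        current = []
        for event in group:
            if current and event[0] - current[-1][0] > gap:
                result.append((pair, current))
                current = []
            current.append(event)
        result.append((pair, current))
    return result


if __name__ == "__main__":
    start = datetime(2011, 4, 1, 9, 0, 0)
    # Invented events: IDs and times are placeholders, not real log rows.
    sample = [
        (start, "inst-A", "mendeley"),
        (start + timedelta(minutes=30), "inst-A", "mendeley"),
        (start + timedelta(hours=5), "inst-A", "mendeley"),
        (start, "inst-B", "EBSCO"),
    ]
    for pair, evs in sessions(sample):
        print(pair, len(evs))
```

<p>Spotting the same person across several days would need more than this, for example comparing the time of day and the titles requested across sessions.</p>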
			<wfw:commentRss>http://blog.ouseful.info/2011/06/04/playing-with-large-ish-csv-files-and-using-them-as-a-database-edina-openurl-logs/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/abbd9f90565ce9ae4d065d93a81d8c03?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Tony Hirst</media:title>
		</media:content>
	</item>
	</channel>
</rss>
