OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for December 2011

Over on F1DataJunkie, 2011 Season Review Doodles…

Things have been a little quiet, post wise here, of late, in part because of the holiday season… but I have been posting notes on a couple of charts in progress over on the F1DataJunkie blog. Here are links to the posts in chronological order – they capture the evolution of the chart design(s) to date:

You can find a copy of the data I used to create the charts here: F1 2011 Year in Review spreadsheet.

I used R to generate the charts (scripts are provided and/or linked to from the posts, or included in the comments – I’ll tidy them and pop them into a proper Github repository if/when I get a chance), loading the data into RStudio using this sort of call:

require(RCurl)

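gsqAPI = function(key,query,gid=0){
    #Build the Google spreadsheet query URL, asking for the results as CSV
    qurl = paste( sep="",'http://spreadsheets.google.com/tq?', 'tqx=out:csv','&tq=', curlEscape(query), '&key=', key, '&gid=', curlEscape(gid) )
    #Read the CSV straight into a data frame, treating the string "null" as NA
    return( read.csv( qurl, na.strings = "null" ) )
}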

key='0AmbQbL4Lrd61dEd0S1FqN2tDbTlnX0o4STFkNkc0NGc'
sheet=4

qualiResults2011=gsqAPI(key,'select *',sheet)
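As a rough sketch of what that call is doing (this is not one of the original scripts): the function just glues a query statement onto a spreadsheet URL, so a narrower query than 'select *' can be passed in the same way. The helper name gsqURL, the placeholder key and the column letters A, B and C are all made up for illustration, and base R's URLencode stands in for RCurl's curlEscape so the snippet runs without extra packages or a network connection.

```r
# Hypothetical helper: builds the query URL only, so it can be inspected offline.
# URLencode() is used here as a stand-in for RCurl's curlEscape().
gsqURL <- function(key, query, gid = 0) {
  paste0(
    "http://spreadsheets.google.com/tq?tqx=out:csv",
    "&tq=", URLencode(query, reserved = TRUE),
    "&key=", key,
    "&gid=", gid
  )
}

# Made-up query: the column letters are assumptions, not columns from the F1 spreadsheet
u <- gsqURL("KEY", "select A,B where C > 10", 4)
print(u)
```

Passing the resulting URL to read.csv(), as gsqAPI does, should then return only the selected columns and matching rows, assuming the sheet is publicly readable.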

If any other folk out there are interested in using R to wrangle with F1 data, either from 2011 or looking forward to 2012, let me know and maybe we could get a script collection going on Github:-)

Written by Tony Hirst

December 30, 2011 at 3:59 pm

Posted in Data, digital storytelling, Rstats

News, Courses and Scrutiny

I think I may have confused Stephen Downes yesterday with my notes around consultation based courses, so here are some more loosely connected thoughts that will probably only serve to muddle the situation further, at least for now…;-)

Take the forthcoming UK Parliamentary Communications Green Paper that will lead to a revision of the legislation surrounding communications in the UK. In part, this will draw on the DCMS Communications review carried out earlier this year according to the following process: “An open letter was published on 16 May 2011 asking a broad range of questions about the communications sector. All non-confidential responses to the letter were published on 7 December 2011. Submissions received will be used to inform the Green Paper.” (The public submissions are available as individual documents in either RTF or PDF format.)

The open letter [PDF] included a series of questions relating to communications policy. For example:


Q6. What are the competing demands for spectrum, how is the market changing and how can a regulatory framework best accommodate any rapidly changing demands on spectrum and market development?

Q12. What barriers are there to innovation in new digital media sectors, including video games, telemedicine, local television and education?

In a consultation-framed course, the consultation questions may be thought of as part of the assessment model. One of the aims of the course is to provide “students” taking the course with the knowledge, skills and understanding required to provide a considered response to some or all of these questions.

Note that we may wish to qualify the reading of a question, or wrap it with additional criteria; for example, we might tune Q6 above along the lines of: “What particular issues are likely to arise in the 300MHz to 3GHz band?”, or something like that!

In the Related Information section of the Communication Review, links were provided to a Research report [on] the Contribution of the digital communications sector to economic growth and productivity in the UK and the Government’s broadband strategy among other things. In a sense, we have been gifted some “course readings”. There are also opportunities to dip into research that maybe doesn’t get read (or scrutinised) as widely as it might, in the form of Parliamentary Library Research Briefing papers.

So that’s part of the jigsaw: reviews, consultations, calls for evidence all involve policy makers soliciting evidence and opinion around a topic area that may include technical considerations. Where questions are asked, these may form part of the reflection/self-assessment/course assessment framework. The original call may itself be viewed as a high level syllabus of the topics to be addressed in the course. The course can then address these issues with reference to teaching material (for example, if we’re considering innovation, we might call out some introductory OpenLearn materials on “Characteristics of consumers and the market”).

Whilst the aim of the review, consultation or piece of proposed legislation may not in itself go too deeply into technical areas, it can be used to provide the SPEL (social, political, ethical, legal) context around a technology area and provide a jumping-off point for a technical lesson in that subject area (for example, we may want to consider the similarities and differences between wired networks and wireless networks; or we may need to get up to speed on what optical fibre networks are good for).

Part of the story then, is to try to take the lazy route to curriculum development, and reuse someone else’s, which in this case also amounts to a repurposing of a document or process that wasn’t intended as a course to provide some of the content, topic, cohort discovering and pacing components of a course.

This repurposing lends an element of authenticity and relevance to the course of study (though as mentioned in my previous post, we must be wary that the course is not used as a vehicle for delivering propaganda).

What the approach may also do is increase the amount of scrutiny around a review or route to legislation. In the post No Minister: Any chance for the Communications Act?, Guardian Professional writer Dick Vinegar notes:

Last time around, in 2003, Lord Puttnam, a film director with the right blend of artistic and technical expertise, carried out a pre-legislative scrutiny. I believe that this knocked the heads of broadcasters (fluffies) and comms engineers (techies) together to produce a good bill. From what I have heard so far, I am not sure whether this time around we will get such a mature, ‘two cultures’ approach.

By providing a view over a consultation, or review that is course-like, we can maybe increase the amount of scrutiny involved in the process and also (maybe) deepen people’s understanding of the issues.

The course view thus provides a structured pathway through the relevant issues at a deeper level than provided by the typical supporting documentation, or perhaps just in a more reflective way. The course also provides a way in to citizen engagement from individuals who just want to explore the topic.

The consultation-framed course also provides a way of straddling news and academia, an area that has also interested me in a lifelong learning context for some time.

This could manifest itself in a couple of ways. For example, long form news articles could feature “academic” breakout boxes using OERs sourced from the course, or course discussions could be positioned around issues raised in recent news articles; in a wider context, entry routes to the course may be provided through the news media, from readers who want to know a little more about the issues involved within a particular consultation area (c.f. News, Analysis, Academia and Demand Education or Educative Media?).

Another interesting feature that arises out of the consultation based course learning journey is that “authentic assessment opportunities” present themselves: for example, a student may submit an actual response to the consultation, or, if they entered via the news route, write a letter to the editor. Writing responses in the form of research briefing papers also provides another format for producing work that may be used to demonstrate understanding and knowledge in a meaningful and potentially useful way, as well as an assessable way.

The tone with which reviews or consultations are presented is also interesting from an educational perspective, in that the questions that are asked may be open and may not have a single right answer. (On the other hand, in calls for technical expert evidence, there may well be “correct” answers which the evidentiary call is intended to discover.) This frames the learning activity in the context of “we don’t know what the right answer is, but we need to find out/learn more”. That is, the consultation is in some sense modeling part of the lifelong learning behaviour we want to inculcate in our students (learning is not just for school or university, right?!;-)

Is there a demand for such an exercise though? Again referring to the Guardian Professional article:

In the run up to the green paper, Westminster has been awash with conferences and seminars with titles like ‘What should be in the new Communications Bill?’ and ‘Dear Jeremy…’ (Hunt). Most of the speakers at these portentous events have been full of patriotic hyperbole and statements of the obvious. “The next Comms Act should focus on ensuring that the UK’s communications sector remained one of the most competitive in the world.” “A level playing field is needed in the internet ecosystem with global issues considered carefully.” “Regulation must not chill innovation.” “The limits of online privacy must be defined.” “Children must be protected.”

PS I mentioned in the previous post how at least one of the forums around the forthcoming Communications Green Paper was “CPD certified”. A little digging turned up The CPD Certification Service, which is presumably what that referred to. Anyway, I’ve added it to my watchlist to see if Pearson, or other companies of that ilk, start sniffing around it as a gateway to one possible new credentials market…

PPS Are there any emerging leaders in the qualification verification arena yet?

Written by Tony Hirst

December 22, 2011 at 2:01 pm

A Quick Peek at Three Content Analysis Services

A long, long time ago, I tinkered with a hack called Serendipitwitterous (long since rotted, I suspect), that would look through a Twitter stream (personal feed, or hashtagged tweets), use the Yahoo term extraction service to try to identify concepts or key words/phrases in each tweet, and then use these as a search term on Slideshare, Youtube and so on to find content that may or may not be loosely related to each tweet.

The Yahoo Term Extraction is still hanging in there – just – but I think it finally gets deprecated early next year. From my feeds today, however, it seems there may be a replacement in the form of a new content analysis service via YQL – Yahoo! Opens Content Analysis Technology to all Developers:

[The Y! Content Analysis service will] extract key terms from the content, and, more importantly, rank them based on their overall importance to the content. The output you receive contains the keywords and their ranks along with other actionable metadata.
On top of entity extraction and ranking, developers need to know whether key terms correspond to objects with existing rich metadata. Having this entity/object connection allows for the creation of highly engaging user experiences. The Y! Content Analysis output provides related Wikipedia IDs for key terms when they can be confidently identified. This enables interoperability with linked data on the semantic Web.

What this means is that you can push a content feed through the service, and get an annotated version out that includes identifier based hooks into other domains (i.e. little-l, little-d linked data). You can find the documentation here: Content Analysis Documentation for Yahoo! Search

So how does it fare? As I’ve previously explored using the Reuters Open Calais service to annotate OU/BBC programme listings (e.g. Augmenting OU/BBC Co-Pro Programme Data With Semantic Tags), I thought I’d use a programme feed from The Bottom Line again…

To start, we need to open the YQL developer console: http://developer.yahoo.com/yql/console/

We can then pull in an example programme description from the BBC using a YQL query of the form:

select long_synopsis from xml where url='http://www.bbc.co.uk/programmes/b00vy3l1.xml'

Grabbing a BBC programme feed into YQL

For reference, the text looks like this:

The view from the top of business. Presented by Evan Davis, The Bottom Line cuts through confusion, statistics and spin to present a clearer view of the business world, through discussion with people running leading and emerging companies.
In the week that Facebook launched its own new messaging service, Evan and his panel of top business guests discuss the role of email at work, amid the many different ways of messaging and communicating.
And location, location, location. It’s a cliche that location can make or break a business, but how true is it really? And what are the advantages of being next door to the competition?
Evan is joined in the studio by Chris Grigg, chief executive of property company British Land; Andrew Horton, chief executive of insurance company Beazley; Raghav Bahl, founder of Indian television news group Network 18.
Producer: Ben Crighton
Last in the series. The Bottom Line returns in January 2011.

The content analysis query example provided looks like this:

select * from contentanalysis.analyze where text="Italian sculptors and painters of the renaissance favored the Virgin Mary for inspiration"

but we can nest queries in order to pass the long_synopsis from the BBC programme feed through the service:

select * from contentanalysis.analyze where text in (select long_synopsis from xml where url='http://www.bbc.co.uk/programmes/b00vy3l1.xml')

Here’s the result:

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
    yahoo:count="2" yahoo:created="2011-12-22T11:03:51Z" yahoo:lang="en-US">
    <diagnostics>
        <publiclyCallable>true</publiclyCallable>
        <url execution-start-time="2" execution-stop-time="370"
            execution-time="368" proxy="DEFAULT"><![CDATA[http://www.bbc.co.uk/programmes/b00vy3l1.xml]]></url>
        <user-time>572</user-time>
        <service-time>565</service-time>
        <build-version>24402</build-version>
    </diagnostics> 
    <results>
        <categories xmlns="urn:yahoo:cap">
            <yct_categories>
                <yct_category score="0.536">Business &amp; Economy</yct_category>
                <yct_category score="0.421652">Finance</yct_category>
                <yct_category score="0.418182">Finance/Investment &amp; Company Information</yct_category>
            </yct_categories>
        </categories>
        <entities xmlns="urn:yahoo:cap">
            <entity score="0.979564">
                <text end="57" endchar="57" start="48" startchar="48">Evan Davis</text>
                <wiki_url>http://en.wikipedia.com/wiki/Evan_Davis</wiki_url>
                <types>
                    <type region="us">/person</type>
                    <type region="us">/place/place_of_interest</type>
                    <type region="us">/place/us/town</type>
                </types>
                <related_entities>
                    <wikipedia>
                        <wiki_url>http://en.wikipedia.com/wiki/Don%27t_Tell_Mama</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Lenny_Dykstra</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Los_Angeles_Police_Department</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Today_%28BBC_Radio_4%29</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Chrisman,_Illinois</wiki_url>
                    </wikipedia>
                </related_entities>
            </entity>
            <entity score="0.734099">
                <text end="265" endchar="265" start="258" startchar="258">Facebook</text>
                <wiki_url>http://en.wikipedia.com/wiki/Facebook</wiki_url>
                <types>
                    <type region="us">/organization</type>
                    <type region="us">/organization/domain</type>
                </types>
                <related_entities>
                    <wikipedia>
                        <wiki_url>http://en.wikipedia.com/wiki/Mark_Zuckerberg</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Social_network_service</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Twitter</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Social_network</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Digital_Sky_Technologies</wiki_url>
                    </wikipedia>
                </related_entities>
            </entity>
            <entity score="0.674621">
                <text end="477" endchar="477" start="450" startchar="450">location, location, location</text>
            </entity>
            <entity score="0.651227">
                <text end="79" endchar="79" start="60" startchar="60">The Bottom Line cuts</text>
                <types>
                    <type region="us">/other/movie/movie_name</type>
                </types>
            </entity>
            <entity score="0.646818">
                <text end="799" endchar="799" start="789" startchar="789">Raghav Bahl</text>
                <wiki_url>http://en.wikipedia.com/wiki/Raghav_Bahl</wiki_url>
                <types>
                    <type region="us">/person</type>
                </types>
                <related_entities>
                    <wikipedia>
                        <wiki_url>http://en.wikipedia.com/wiki/Network_18</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Superpower</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Deng_Xiaoping</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/The_Amazing_Race</wiki_url>
                        <wiki_url>http://en.wikipedia.com/wiki/Hare</wiki_url>
                    </wikipedia>
                </related_entities>
            </entity>
            <entity score="0.644349">
                <text end="144" endchar="144" start="133" startchar="133">clearer view</text>
            </entity>
            <entity score="0.54609">
                <text end="675" endchar="675" start="665" startchar="665">Chris Grigg</text>
                <types>
                    <type region="us">/person</type>
                </types>
            </entity>
        </entities>
    </results>
</query>

So, some success in pulling out person names, and limited success on company names. The subject categories look reasonably appropriate too.
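To do anything useful with a response like that, it probably needs flattening into something tabular. Here's a minimal sketch of one way that might be done in R using the XML package; it is an illustration rather than anything I ran against the live service. The sample string is a cut-down copy of the response above (two entities only), and the names resp, ns and entities.df are made up for the example. Note that the entity elements sit in the urn:yahoo:cap default namespace, so the XPath expressions need a prefix mapped onto it.

```r
require(XML)

# A trimmed-down copy of the response above (two entities only), so this runs offline
resp <- '<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="2">
  <results>
    <entities xmlns="urn:yahoo:cap">
      <entity score="0.979564">
        <text end="57" start="48">Evan Davis</text>
        <wiki_url>http://en.wikipedia.com/wiki/Evan_Davis</wiki_url>
      </entity>
      <entity score="0.674621">
        <text end="477" start="450">location, location, location</text>
      </entity>
    </entities>
  </results>
</query>'

doc <- xmlParse(resp, asText = TRUE)

# Map a prefix (c) onto the default namespace used by the entities element
ns <- c(c = "urn:yahoo:cap")
entities <- getNodeSet(doc, "//c:entity", namespaces = ns)

# One row per entity: the matched text, its score, and a Wikipedia URL if one was given
entities.df <- data.frame(
  text = sapply(entities, function(e) xpathSApply(e, "./c:text", xmlValue, namespaces = ns)),
  score = as.numeric(sapply(entities, xmlGetAttr, "score")),
  wiki_url = sapply(entities, function(e) {
    u <- xpathSApply(e, "./c:wiki_url", xmlValue, namespaces = ns)
    if (length(u) == 0) NA else u[1]
  }),
  stringsAsFactors = FALSE
)
print(entities.df)
```

Entities without a wiki_url (such as “location, location, location”) end up with NA in that column, which makes it easy to filter down to just the terms the service could confidently identify.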

[UPDATE: I should have run the desc contentanalysis.analyze query before publishing this post to pull up the docs/examples... As well as the where text= argument, there is a where url= argument that will pull back semantic information about a URL. Running the query over the OU homepage, for example, using select * from contentanalysis.analyze where url="http://www.open.ac.uk" identifies the OU as an organisation, with links out to Wikipedia, as well as geo-information and a Yahoo woe_id.]

Another related service in this area that I haven’t really explored yet is TSO’s Data Enrichment Service (API).

Here’s how it copes with the same programme synopsis:

TSO Data Enrichment Service

Pretty good… and links in to dbpedia (better for machine readability) compared to the Wikipedia links that the Yahoo service offers.

For completeness, here’s what the Reuters Open Calais service comes up with:

Open Calais - content analysis

The best of the bunch on this sample of one, I think, albeit admittedly in the domain the Reuters focus on?

But so what…? What are these services good for? Automatic metadata generation/extraction is one thing, as I’ve demonstrated in Visualising OU Academic Participation with the BBC’s “In Our Time”, where I generated a quick visualisation that showed the sorts of topics that OU academics had talked about as guests on Melvyn Bragg’s In Our Time, along with the topics that other universities had been engaged with on that programme.

Written by Tony Hirst

December 22, 2011 at 11:23 am

A Tool Chain for Plotting Twitter Archive Retweet Graphs – Py, R, Gephi

Another set of stepping stones that provide a clunky route to a solution that @mhawksey has been working on a far more elegant expression of (eg Free the tweets! Export TwapperKeeper archives using Google Spreadsheet and Twitter: How to archive event hashtags and create an interactive visualization of the conversation)…

The recipe is as follows:

- download a Twapperkeeper archive to a CSV file using a Python script as described in Python Script for Exporting (Large) Twapperkeeper Archives By User; the CSV file should contain a single column with one row per archive entry; each row includes the sender, the tweet, the tweet ID and a timestamp; **REMEMBER – TWAPPERKEEPER ARCHIVES WILL BE DISABLED ON JAN 6TH, 2012**

- in an R environment (I use RStudio), reuse code from Rescuing Twapperkeeper Archives Before They Vanish and Cornelius Puschmann’s post Generating graphs of retweets and @-messages on Twitter using R and Gephi:

require(stringr)

#A helper function to remove @ symbols from user names...
trim <- function (x) sub('@','',x)

twapperkeeperCSVParse=function(fp){
    df = read.csv(fp, header=F)
    df$from=sapply(df$V1,function(tweet) str_extract(tweet,"^([[:alnum:]_]*)"))
    df$id=sapply(df$V1,function(tweet) str_extract(tweet,"[[:digit:]/s]*$"))
    df$txt=sapply(df$V1,function(tweet) str_trim(str_replace(str_sub(str_replace(tweet,'- tweet id [[:digit:]/s]*$',''),end=-35),"^([[:alnum:]_]*:)",'')))
    df$to=sapply(df$txt,function(tweet) trim(str_extract(tweet,"^(@[[:alnum:]_]*)")))
    df$rt=sapply(df$txt,function(tweet) trim(str_match(tweet,"^RT (@[[:alnum:]_]*)")[2]))
    return(df)
}
#usage: 
#twarchive.df=twapperkeeperCSVParse("PATH_TO_YOUR_FILE")
#For example:
df=twapperkeeperCSVParse("~/code/twapps/reports/twArchive_online11.txt")

ats.df <- data.frame(df$from,df$to)
rts.df <- data.frame(df$from,df$rt)

#Cribbing http://blog.ynada.com/339
require(igraph)
ats.g <- graph.data.frame(ats.df, directed=T)
rts.g <- graph.data.frame(rts.df, directed=T)

write.graph(ats.g, file="ats.graphml", format="graphml")
write.graph(rts.g, file="rts.graphml", format="graphml")

- Cornelius’ code uses the igraph library to construct a graph and export graphml files that describe graphs of at behaviour (tweets in the archive sent from one user to another) and RT behaviour (tweets from one person retweeting another using the RT @name convention).

- visualise the graphml files in Gephi. Note a couple of things – empty nodes aren’t handled properly in my version of the code, so the graph includes a dummy node that all non-at or non-RT row tweet senders point to; when you visualise the graph, this node will be obvious, so just delete it ;-)

- the Gephi visualisation by default uses the Label attribute for labeling nodes – we need to change this:

Gephi - setting node label choice

You should now be able to view graphs that illustrate RT or @ behaviour as captured in a Twapperkeeper archive in Gephi.
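On the dummy node issue mentioned in the recipe: one possible workaround, sketched here under a few assumptions, is to drop the rows that have no RT (or @) target before building the graph, so that there are no empty values to be turned into a node in the first place. The rows below are made-up stand-ins for df$from and df$rt, and graph_from_data_frame is the name more recent versions of igraph use for graph.data.frame, so this assumes a reasonably current igraph install.

```r
require(igraph)

# Made-up rows standing in for df$from and df$rt; NA marks a tweet that wasn't a retweet
rts.df <- data.frame(
  from = c("alice", "bob", "carol"),
  rt = c("bob", NA, "bob"),
  stringsAsFactors = FALSE
)

# Drop rows with no RT target so that no NA node is created in the graph
rts.clean <- na.omit(rts.df)

rts.g <- graph_from_data_frame(rts.clean, directed = TRUE)
print(V(rts.g)$name)
```

The trade-off is that anyone who neither retweeted nor was retweeted disappears from the graph altogether, which may or may not be what you want.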

ILI2011 RT behaviour

Just by the by, we can also generate stats’n graphs of the contents of the archive. For example, via Getting Started With Twitter Analysis in R, we can generate a bar plot to show who was retweeted most:

require(ggplot2)

ggplot()+geom_bar(aes(x=na.omit(df$rt)))+opts(axis.text.x=theme_text(angle=-90,size=6))+xlab(NULL)

We can also do some counting to find out who was RT’d the most, for example:

#count the occurrences of each name in the rt column
rt.count=data.frame(table(df$rt))
#sort the results in descending order and display the top 5 results
head(rt.count[order(-rt.count$Freq),],5)
#There are probably better ways of doing that! If so, let me know via comments
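One slightly shorter way of doing the same count, offered as a sketch rather than a definitive answer, is to sort the table directly and skip the intermediate data frame. The rt vector below is made up to stand in for df$rt; table() drops the NA values by default.

```r
# Made-up values standing in for df$rt; NA marks tweets that weren't retweets
rt <- c("bob", "alice", "bob", NA, "carol", "bob", "alice")

# Count each name, sort in descending order, and keep the top 2
rt.top <- head(sort(table(rt), decreasing = TRUE), 2)
print(rt.top)
```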

Next on the to do list is:
- automate the production of archive reports
- work in the time component so we can view behaviour over time in Gephi… (here’s a starting point maybe, again from Cornelius Puschmann’s blog: Dynamic Twitter graphs with R and Gephi (clip and code))

As things stand though, I may not be able to get round to either of those for a while…

Written by Tony Hirst

December 21, 2011 at 4:55 pm

Learning Problems and Consultation Based Curricula

(This is a jumble of snippets in no particular order for a post that I’m not going to get round to writing…)

Ewan McIntosh’s presentation at TEDxLondon: The Problem Finders [VIDEO]

All our students, their parents and the people teaching them, have been indoctrinated that it is teachers who sift through all the things we can learn, find the areas worth exploring, and make up theoretical problems for students to solve. On top of this, most educators believe that it is their job to invent problems at just the right level of difficulty to appeal to every one of the 30 children in front of them.

A couple of days after I saw this, John Naughton raised a similar issue at the Arcadia project review workshop – that experts and professionals are good at creating (or identifying) problems out of mess or muddle, the trick being that the problems are cast into a standard form to which known solutions/problem solving strategies can then be applied.

Is this related? In science education, I guess it’s all too easy to fall into the trap of framing practical lessons as the running of experiments that try to replicate “the right answer”, as opposed to being activities that set out to explore a particular claim and see if they can replicate it. (I seem to remember that I typically wrote up my own school chemistry experiments but there also being a phase of the practical lessons where everyone would shout out their own results of the particular experiment that had been carried out, thus giving us multiple pieces of evidence related to the investigation being made or claim under test.)

How about problems to which the answer is unknown? New problems will often fall into this class?

Where the problem fits a particular pattern but the answer is not known, finding the answer may constitute “real” work. This is in part what I’m struggling to frame as authentic educational activities.

Stack Overflow is a great place for finding problems that people have identified, but don’t necessarily know how to answer?

Courses are organisational principles: they provide a curriculum; they often provide a linear model to work through. They provide pacing, a cohort, an end point in the form of assessment and feedback.

A quick approach to course design is often to focus on the syllabus.

How about using a consultation document, or a Green or White paper, as providing the core syllabus, the issues to be addressed in the course of study?

Consultation aims to solicit views of people who have a stake in, and/or knowledge about, a particular policy matter or implementation proposal. (Caveat: a cynic might say that consultations provide an opportunity to develop a PR strategy against likely responses to the decisions that follow the consultation period).

Assessment recast as a consultation exercise where the aim is to solicit the opinion and knowledge of the candidate about a particular topic. The aim of consultation based curriculum is to develop the knowledge and critical skills of the candidate such that they can provide a meaningful and considered response to the consultation. The fact that the response of the candidate might have a consequence if submitted to a real consultation exercise means the candidate has a stake in clarifying their personal views and then expressing them clearly.

Consultation based education provides:
- re-occurring, lifelong learning/updating opportunities
- fixed time scales over weeks to months
- a contextualised curriculum: for example, the forthcoming “Communications Green Paper” [need a better link] provides a frame that can be used to contextualise the sort of content that appears in the ICT courses produced by members of my department (Communication and Systems). Could we run an open, online course based around the Communications Green Paper, in part drawing on content deposited in OpenLearn as well as other OER repositories around the web, aimed at helping people understand better the issues raised in the Green Paper as well as the technologies referred to?

There is a risk that consultation/report based courses might be viewed as propaganda. I think we’d need to make sure they were properly framed as scrutiny.

MIT’s recent MITx proposal which will offer credentials around MIT open online courses has attracted a lot of commentary over the last few days. As @alyp responded to a tweet I made about this being an innovation, “The OU was issuing certificates of course completion 20 yrs ago. The innovation may be crowd-sourced and tech driven assessment”. I should have clarified: the innovation is maybe a systemic one, with employers being willing to trust the new form of credential that MITx appears to be offering (where ‘new’ is maybe new only in the sense of rebranding an old idea/marketing it in a new way…)

I notice that the Westminster Forum on the forthcoming Communications Green Paper is “CPD Certified”… Could an updating course around the Green Paper offer similar “credentialling”?

There are opportunities for repeat business (education is a business now, right?): sign up for a 5 year updating package that will include courses around major reports and bills in a given subject area published by BIS, for example. Or a technical package that will review technologies and guidance, in a course-like way, built around COI guidance docs. The courses would be critical, academically minded, and serve both to educate and to provide scrutiny.

Consultation/report based courses would as a side-effect deepen citizen engagement with policy and legislation development, as well as raising awareness of changes in legislation and policy amongst professionals.

A couple of years ago, a New Year’s resolution led to the WriteToReply experiment I ran with Joss Winn for a while. I’m still working on my resolutions for next year… how does LearnToEngage sound?

[See also the follow on to this post, looking in (slightly) more detail at how a consultation framed course might work: News, Courses and Scrutiny]

PS maybe related? Steve Wheeler on Content as curriculum? Also some of my other doodlings relating to lifelong learning – rather than degree based qualification – relationships between universities and students: Graduate With Who (Whom?!;-), Exactly…?, Subscription Models for Lifelong Students, Subscriptions Not Courses? Idling Around Lifelong Learning, Education, Training and Lifelong Learning.

PPS Just spotted this, possibly worth critiquing: HESA: What is a Course?

Written by Tony Hirst

December 21, 2011 at 11:19 am

Posted in Anything you want

Tune Your Feeds…

I’m so glad we’re at year’s end: I’m completely bored of the web, my feeds contain little of interest, I’m drastically in need of a personal reboot, and I’m starting to find myself stuck in a “seen-it-all-before” rut…

Take the “new” Google Circle’s volume slider, for example… Ooh.. shiny… ooh, new feature…

Yawn… Slider widgets have been around for ages, of course (e.g. Slider Widgets Around the Web) and didn’t Facebook allow you to do the volume control thing on your Facebook news feeds way back when, when Facebook’s feeds were themselves news (Facebook News Mixing Desk)?

Facebook Mixing desk

Does Facebook still offer this service I wonder?

On the other hand, there is the new Google Zeitgeist Scrapbook… I’m still trying to decide whether this is interesting or not… The premise is a series of half completed straplines that you can fill in with subheadings that interest you, and reveal a short info paragraph as a result.

Google scrapbook

Google scrapbook

The finished thing is part scrapbook, part sticker book.

Google scrapbook

The reason why I’m not sure whether this is interesting or not is because I can’t decide whether it may actually hint at a mechanic for customising your own newspaper out of content from your favoured news provider. For example, what would it look like if we tried to build something similar around content from the Guardian Platform API? Might different tag combinations be dragged into the story panels to hook up a feed from that tag or section of the “paper”? And once we’ve acted as editor of our own newspaper, might advanced users then make use of mixing desk sliders to tune the volume of content in each section?

This builds on the idea that newspapers provide you with content and story types you wouldn’t necessarily see, whilst still allowing some degree of control over how weighted the “paper” is to different news sections (something we always had some element of control over before, though at a different level of granularity, for example, by choosing to buy newspapers only on certain days because they came with a supplement you were interested in, though you were also happy to read the rest of the paper since you had it…)

(It also reminds me that I never could decide about Google’s Living Stories either…)

PS in other news, MIT hints at an innovation in the open educational field, in particular with respect to certification… It seems you may soon be able to claim some sort of academic credit, for a fee, if you’ve been tracked through an MITx open course (MIT’s new online courses target students worldwide). Here’s the original news release: MIT launches online learning initiative and FAQ.

So I wonder: a “proven” online strategy is to grab as big an audience as you can as quickly as you can, then worry about how to make the money back. Could MIT’s large online course offerings from earlier this year be seen in retrospect as MIT testing the waters to see whether or not they could grow an audience around online courses quickly?

I just wonder what would have happened if we’d managed to convert a Relevant Knowledge course to an open course accreditation container for a start date earlier this year, and used it to offer credit around the MIT courses ourselves?!;-) As to what other innovations might there be around open online education? I suspect the OU still has high hopes for SocialLearn… but I’m still of the mind that there’s far more interesting stuff to be done in the area of open course production.

Written by Tony Hirst

December 19, 2011 at 9:40 pm

Posted in Anything you want

Tagged with ,

Fragment – Outliers in Emergent Social Positioning Maps

leave a comment »

This is a placeholder as much as anything, something I want to try out but don’t have time to do right now… The context is the social media mapping approach I’ve been doodling with for a few weeks now, where I try to position social media users in terms of who their followers follow (for example, A Couple More Social Media Positioning Maps for UK HE Twitter Accounts).

One of the problems with the approach is that you often get some of the same-old, same-old accounts appearing again and again (@stephenfry for example). So I’ve been wondering whether it might be worth generating funnel plots that plot the rate at which followers of a target account follow the other accounts identified in the positioning maps generated around the target account? On the x we’d plot the total number of followers of each account, and on the y, the rate at which they are followed by the followers of the target account (i.e. their in-degree in the map divided by the target account follower sample size used to generate the map). We might then get useful signal from the presence of accounts that appear to be over-represented within the target account followers sample, signal that can be used to identify those accounts that are more highly associated with the target account than we might expect by chance?
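As a rough sketch of the points such a funnel plot would need, the following Python computes the x and y values described above. The account names and numbers are made up for illustration, and the function name is hypothetical; it says nothing about how control limits for the funnel should be drawn.

```python
def funnel_points(in_degree, total_followers, sample_size):
    """Return (account, total followers, follow rate) tuples, sorted by x value.

    in_degree: how many of the sampled followers of the target account
               follow each account (its in-degree in the positioning map).
    total_followers: the total number of followers of each account (x axis).
    sample_size: the number of target account followers that were sampled.
    """
    points = []
    for account, degree in in_degree.items():
        rate = degree / sample_size  # y axis: rate of following within the sample
        points.append((account, total_followers[account], rate))
    return sorted(points, key=lambda p: p[1])


# Made-up numbers, for illustration only
in_degree = {"@stephenfry": 120, "@nicheaccount": 90}
total_followers = {"@stephenfry": 3500000, "@nicheaccount": 4000}

points = funnel_points(in_degree, total_followers, sample_size=200)
for account, followers, rate in points:
    print(account, followers, rate)
```

An account with few followers overall but a high rate within the sample (like the invented @nicheaccount here) is the sort of thing the plot would hopefully surface.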

Another factor that I maybe need to take into account is the total number of accounts followed by the target account followers?

PS by the by, I notice that my map of folk “in the vicinity of the #gdslaunch hashtag” appears to have been posterised…:-)

In the vicinity of the #gdslaunch

(If anyone wants SVG or graphml based representations of any of the Gephi generated images I post either here or on my flickr account, it can probably be arranged;-)

Written by Tony Hirst

December 18, 2011 at 12:12 pm

Posted in Anything you want

Tagged with

JISC Project Blog Metrics – Making Use of WordPress Stats. Plus, An Aside…

with 2 comments

Brian has a post out on Beyond Blogging as an Open Practice, What About Associated Open Usage Data?, and proposes that “when adopting open practices, one should be willing to provide open accesses to usage data associated with the practices” (his emphasis).

What usage stats are relevant though? If you’re on a hosted WordPress blog, it’s easy enough to pull out in a machine readable way the stats that WordPress collects about your blog and makes available to you (albeit at the cost of revealing a blog specific API key in the URL. Which means that if this key provides access to anything other than stats, particularly if it provides write access to any part of your blog, it’s probably not something you’d really want to share in public… [Getting your WordPress.com Stats API Key])

That said, you can still hand craft your own calls to the WordPress Stats API, and extract your own usage data as data.

So for example, a URL of the form:
http://stats.wordpress.com/csv.php?api_key=YOURKEY&blog_uri=BLOG.EXAMPLE.COM&end=2011-11-30&table=views
will pull in a summary of November’s views data; or:
http://stats.wordpress.com/csv.php?api_key=KEY&blog_uri=YOURBLOG&end=2011-11-30&table=referrers_grouped
will pull in a list of referrers.
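For anyone who would rather build these URLs in code than by hand, here is a minimal Python sketch. The endpoint and the parameter names (api_key, blog_uri, end, table) are the ones shown in the URLs above; the function name stats_url is just something made up for this example, and nothing is actually fetched.

```python
from urllib.parse import urlencode

STATS_ENDPOINT = "http://stats.wordpress.com/csv.php"


def stats_url(api_key, blog_uri, end, table):
    """Build a WordPress stats CSV URL from the four parameters used above."""
    params = [
        ("api_key", api_key),
        ("blog_uri", blog_uri),
        ("end", end),      # a date string of the form 2011-11-30
        ("table", table),  # e.g. views or referrers_grouped
    ]
    return STATS_ENDPOINT + "?" + urlencode(params)


# Placeholder key and blog address, as in the example URLs
url = stats_url("YOURKEY", "BLOG.EXAMPLE.COM", "2011-11-30", "views")
print(url)
```

Using urlencode rather than plain string joins means any awkward characters in the parameter values get escaped for you.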

For what it’s worth, I’ve started cobbling together a spreadsheet that can pull in live data, or custom ranged reports, from WordPress: WordPress Stats into Google Spreadsheets (make your own personal copy of the spreadsheet if you want to give it a try). This may or may not become a work in progress… at the moment, it doesn’t even support the full range of URL parameters/report configurations (for the time being at least, that is left “as an exercise for the reader”;-)

The approach I took is very simplistic, simply based around crafting URLs that grab specified sets of CSV formatted data, and pop them into a spreadsheet using the =importData() formula (I’m sure Martin could come up with something far more elegant;-); that said, it does provide an example of how to get started with a bit of programmatic URL hacking… and if you want to get started with handcrafting your own URLs, it provides a few examples there too….:-)

The pattern I used was to define a parameter spreadsheet, and then CONCATENATE parameter values to create the URLs; for example:

=importdata(CONCATENATE("http://stats.wordpress.com/csv.php?", "api_key=", Config!B2, "&blog_uri=", Config!B3, "&end=", TEXT(Config!B6,"YYYY-MM-DD"), "&table=referrers_grouped"))

One trick to note is that I defined the end parameter setting in the configuration sheet as a date type, displayed in a particular format. When we grab this data value out of the configuration sheet we’re actually grabbing a date typed record, so we need to use the TEXT() formula to put it into the format that the WordPress API requires (arguments of the form 2011-11-30).
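The same pattern, a date typed value that has to be turned into text before it is joined into the URL, looks something like this in Python. The config dictionary is a hypothetical stand-in for the Config sheet cells, and the key and blog address are placeholders.

```python
from datetime import date

# Hypothetical stand-in for the Config sheet (cells B2, B3 and B6)
config = {
    "api_key": "YOURKEY",
    "blog_uri": "BLOG.EXAMPLE.COM",
    "end": date(2011, 11, 30),  # stored as a date, not as text
}

# Equivalent of TEXT(Config!B6,"YYYY-MM-DD"): format the date as the API expects
end_param = config["end"].strftime("%Y-%m-%d")

# Equivalent of the CONCATENATE() call: join the fixed parts and the parameter values
url = "".join([
    "http://stats.wordpress.com/csv.php?",
    "api_key=", config["api_key"],
    "&blog_uri=", config["blog_uri"],
    "&end=", end_param,
    "&table=referrers_grouped",
])
print(url)
```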

If you want to use the spreadsheet to publish your own data, I guess one way would be to keep the privacy settings private, but publish the sheets you are happy for people to see. Just make sure you don’t reveal your API key;-) [If you know of a good link/resource describing best practice around publishing public sheets from spreadsheets that also contain, and draw on, private data, such as API keys, please post a link in the comments below;-)]

[A note on the stats: the WordPress stats made available via the API seem to relate to page views/visits to the website. Looking at my own stats, views from RSS feeds seem to be reported separately, and (I think) this data is not available via the WordPress stats API? If, as I do, you run your blog RSS feed through a service like Feedburner, to get a fuller picture of how widely the content on a blog is consumed, you'd need to report both the WordPress stats and the Feedburner stats, for example. Which leads to the next question, I guess: how can we (indeed, can we at all?) pull feed stats out of Feedburner?]

At this point, I need to come back to the question raised above: what usage stats are relevant, particularly in the case of a JISC project blog? To my mind, a JISC project blog can support a variety of functions:

- it serves as a diary for the project team allowing them to record micro-milestones and solutions to problems; if developers are allowed to post to the blog, this might include posts at the level of granularity of a Stack Overflow Q and A, compared to the 500 word end-of-project post that tries to summarise how a complete system works;
- it can provide a feed that others can subscribe to to keep up to date with the project without having to hassle the project team for updates;
- it can provide context for the work by linking out to related resources, an approach that also might alert other projects who watch for trackbacks and pingbacks to the project;
- it provides an opportunity to go fishing in a couple of ways: firstly, by acting as a resource others can link to (with the triple payoff that it contextualises the project further, it may suggest related work the project team are unaware of by means of trackbacks/pingbacks into the project blog, and it may turn up useful commentary around the project); secondly, by providing a place where other interested parties might engage in discussion, commentary or feedback around elements of the project, via blog comments.

Even if a blog only ever gets three views per post, they may be really valuable views. For me what’s important is how the blog can be used to document interesting things that might have been turned up in the course of doing the project that wouldn’t ordinarily get documented. Problems, gotchas, clever solutions, the sudden discovery of really useful related resources. The blog also provides an ongoing link-basis for the project, something that can bring it to life in a networked context (a context that may have a far longer life, and scope, than just the life or scope of the actual project).

For many projects that don’t go past a pilot, it may well be that the real value of the project is the blogged documentation of things turned up during the process, rather than any of the formal outputs… Maybe..?!;-)

PS in passing, Google Webmaster tools now lets you track search stats around articles Google associates you with as an author: Clicks and impressions for authors. It’s been some time since I looked at Google Webmaster tools, but as Ouseful.info is registered there, I thought I’d check my broken links…and realised just how many pages get logged by Google as containing broken links when a single post erroneously contains a relative link… (i.e. when the <a href=’ doesn’t start with http://)

PPS Related to the above is a nice example of why I think being able to read and write URLs is an important skill, something Jon Udell also picks up on in Forgotten knowledge. In the above case, I needed to unpick the WordPress Stats API documentation a little to work out how to put the URLs together (something that a knowledge of how to read and write URLs helped me with). Jon Udell’s case was an example of how a conference organiser was able to send a customised URL to the conference hotel that embedded the relevant booking dates.

But I wonder, in an age where folk use Google+search term (e.g. typing Facebook into Google) rather than URLs (e.g. typing facebook.com into a browser location bar), a behaviour that can surely only be compounded by the fusion of location and search bars in browsers such as Google Chrome, is “URL literacy” becoming even more of a niche skill, rather than becoming more widespread? Is there some corollary here to the world of phones and addressbooks? I don’t need to remember phone numbers any more (I don’t even necessarily recognise them) because my contacts list masks the number with the name of the person it corresponds to. How many kids are going to lose out on a basic education in map reading because there’s no longer a need to learn route planning or map-based navigation – GPS, SatNav and online journey planners now do that for us… And does this distancing from base skills and low level technologies extend further? Into the kitchen, maybe? Who needs ingredients when you have ready meals (and yes, frozen croissants and gourmet meals from the farm shop do count as ready meals;-), for example? Who needs to actually use a cookery book (or really engage with a lecture) when you can watch a TV chef (or TED Talks)..?

Written by Tony Hirst

December 15, 2011 at 3:04 pm

Posted in Anything you want, Infoskills

Tagged with

Information Literacy, Graphs, Hierarchies and the Structure of Information Networks

with 11 comments

Over dinner at Côte in Cambridge last week, during the Arcadia Project review event, I doodled a couple of data structures, one on either side of a scrap of paper, and asked my co-Arcadians what sort of thing the drawing might represent, or what the structures they described might be called in general terms.

The sketches were broadly along the lines of the following, though without the circular nodes and labels displayed, just a set of connecting lines:

Hierarchy

and:

Graph

So if I asked you the same question (what would you call these two different things?), how would you answer?

To my mind, the different organisational structures these represent, and how we can exploit and manipulate them, represents a whole host of issues in the reimagining of information literacy and the teaching of information skills. This ranges from an understanding of the structure of information spaces through the representation and analysis of those structures, to ways in which we can navigate and discover things in those spaces as well as how we can visualise and otherwise make sense of them.

So how would I describe the two different things shown above? The first image represents a hierarchy and is often referred to as a tree. Many library classification schemes, and many organisational management structures, are based around that sort of information structure.

The second image is a depiction of a more general network structure. Whenever I talk about graphs on the OUseful.info blog (in fact, pretty much whenever I talk about a graph anywhere), that’s the sort of thing I’m talking about. This mess of connections is the way the web is structured. (The tree structure is also a graph, but subject to particular constraints; can you work out what some of those constraints might be?)

Note: it’s maybe worth reiterating at this point that when I talk about graphs, I mean the messy network thing, not line charts like this:

Line chart

One of the terms I got to describe one of the graphs was “a matrix”. Matrices are in fact a very powerful way of describing the structure of a graph – if you fancy a treasure hunt, the terms adjacency matrix and incidence matrix should give you a head start…
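By way of a small head start on that treasure hunt, here is a sketch of how an adjacency matrix can be built from a list of connections. The four node graph is invented for the example and is not the graph pictured above; I’ve also assumed the connections are undirected, so the matrix comes out symmetric.

```python
def adjacency_matrix(nodes, edges):
    """Return a square matrix with a 1 wherever two nodes are connected."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        # Undirected graph: record the connection in both directions
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix


# An invented example graph
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

matrix = adjacency_matrix(nodes, edges)
for node, row in zip(nodes, matrix):
    print(node, row)
```

Each row (and column) stands for a node, and the row sums give the number of connections each node has.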

I’m not sure what the problem is, but I think there is a problem that arises from not appreciating how powerful graph structures are as a way of making sense of the world. And I’m not really sure what I wanted to say in this post… except maybe go on a little fishing expedition to see how widespread the lack of familiarity with the notion of a graph as something like this:

Graph

really is…? So, if I asked you to draw a graph: a) what would you draw? b) would you even remotely consider drawing something like the image directly above? If you answered “no” to (b), does it “say” anything to you at all?! Would you ever draw a diagram that had that flavour when explaining something (what?!) to someone else? (And the same question for the hierarchy…?)

PS a nice thing about graphs is you don’t have to draw them by hand – all you have to do is describe what connects to what, and then you can let a machine draw it for you. So for example:

- here is the “source code” for the tree
- here is the “source code” for the messy network graph
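To give a flavour of what “describing what connects to what” can look like, here is a minimal sketch that turns a list of connections into a text description in the Graphviz DOT language, which a layout tool can then draw. This assumes DOT as the target format, which may not be the format used by the linked source files, and the node names are invented.

```python
def to_dot(edges, name="G"):
    """Describe an undirected graph in the Graphviz DOT language."""
    lines = ["graph %s {" % name]
    for a, b in edges:
        # One line per connection: "what connects to what"
        lines.append('    "%s" -- "%s";' % (a, b))
    lines.append("}")
    return "\n".join(lines)


# An invented three edge hierarchy
tree_edges = [("root", "a"), ("root", "b"), ("a", "c")]

dot = to_dot(tree_edges, name="tree")
print(dot)
```

Saving the output to a file and passing it to a Graphviz layout program such as dot should then produce the drawing, without any hand sketching required.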

PPS when folk hear other folk wittering on about “the social graph”, what do they think it is? If asked to draw an indicative sketch of “the social graph”, what would they draw?!

Written by Tony Hirst

December 12, 2011 at 4:55 pm

Posted in Arcadia, Infoskills

Python Script for Exporting (Large) Twapperkeeper Archives By User

with 6 comments

FWIW, I started putting together a script that will grab individual hashtag archives, or all the hashtag archives created by a single user, from Twapperkeeper (which is shutting off its archives in early January).

The script should be capable of grabbing tweets from even large archives (hundreds of thousands/millions of tweets), though it’s probably not very efficient in the way it does it…

You can find the script here: Twapperkeeper archive rescue

If you have any problems with the script, or make any improvements to it, please let me know via the comments…

Written by Tony Hirst

December 12, 2011 at 12:44 pm

Posted in Anything you want

Tagged with
