OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for the ‘Analytics’ Category

MOOC Platforms and the A/B Testing of Course Materials

[The following is my *personal* opinion only. I know as much about FutureLearn as Google does. Much of the substance of this post was circulated internally within the OU prior to posting here.]

In common with other MOOC platforms, one of the possible ways of positioning FutureLearn is as a marketing platform for universities. Another might see it as a tool for delivering informal versions of courses to learners who are not currently registered with a particular institution. [A third might position it in some way around the notion of "learning analytics", eg as described in a post today by Simon Buckingham Shum: The emerging MOOC data/analytics ecosystem] If I understand it correctly, “quality of the learning experience” will be at the heart of the FutureLearn offering. But what of innovation? In the same way that there is often a “public benefit feelgood” effect for participants in medical trials, could FutureLearn provide a way of engaging, at least to a limited extent, in “learning trials”?

This need not be onerous, but could simply relate to trialling different exercises or wording or media use (video vs image vs interactive) in particular parts of a course. In the same way that Google may be running dozens of different experiments on its homepage in different combinations at any one time, could FutureLearn provide universities with a platform for trying out differing learning experiments whilst running their MOOCs?

The platform need not be too complex – at first. Google Analytics provides a mechanism for running A/B tests and “experiments” across users who have not disabled Google Analytics cookies, and as such may be appropriate for initial trialling of learning content A/B tests. The aim? Deciding on metrics is likely to prove a challenge, but we could start with simple things to try out – does the ordering or wording of resource lists affect click-through or download rates for linked resources, for example? (And what should we do about those links that never get clicked and those resources that are never downloaded?) Does offering a worked through exercise before an interactive quiz improve success rates on the quiz, and so on.
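As a rough sketch of how the click-through question might be analysed once the counts are in: the snippet below compares two variants of a resource list using a pooled two-proportion z-test. The function name and the counts are made up for illustration; this is one textbook way of doing the comparison, not something Google Analytics or FutureLearn is known to provide in this form.

```python
import math

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare click-through rates for variants A and B.

    Returns (rate_a, rate_b, z, two_sided_p) from a pooled
    two-proportion z-test (normal approximation, so it assumes
    reasonably large counts and a pooled rate strictly between 0 and 1).
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / se
    # Two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p

# Hypothetical counts: variant A lists resources alphabetically,
# variant B lists them in suggested reading order.
rate_a, rate_b, z, p = two_proportion_z_test(120, 1000, 150, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z={z:.2f}  p={p:.3f}")
```

With MOOC-sized cohorts even small differences can come out as statistically significant, so the effect size probably deserves as much attention as the p-value.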

The OU has traditionally been cautious when running learning experiments, delivering fee-waived pilots rather than testing innovations as part of A/B testing on live courses with large populations. In part this may be through a desire to be ‘equitable’ and not jeopardise the learning experience for any particular student by providing them with a lesser quality offering than we could*. (At the same time, the OU celebrates the diversity and range of skills and abilities of OU students, which makes treating them all in exactly the same way seem rather incongruous?)

* Medical trials face similar challenges. But it must be remembered that we wouldn’t trial a resource we thought stood a good chance of being /less/ effective than one we were already running… For a brief overview of the broken worlds of medical trials and medical academic publishing, as well as how they could operate, see Ben Goldacre’s Bad Pharma.

FutureLearn could start to change that, and open up a pathway for experimentally testing innovations in online learning as well as at a more micro-level, tuning images and text in order to optimise content for its anticipated use. By providing course publishers with a means of trialling slightly different versions of their course materials, FutureLearn could provide an effective environment for trialling e-learning innovations. Branding FutureLearn not only as a platform for quality learning, but also as a platform for “doing” innovation in learning, gives it a unique point of difference. Organisations trialling on the platform do not face the threat of challenges made about them delivering different learning experiences to students on formally offered courses, but participants in courses are made aware that they may be presented with slightly different variants of the course materials to each other. (Or they aren’t told… if an experiment is based on success in reading a diagram where the labels are presented in different fonts or slightly different positions, or with or without arrows, and so on, does that really matter if the students aren’t told?)

Consultancy opportunities are also likely to arise in the design and analysis of trials and new interventions. The OU is also provided with both an opportunity to act according to its beacon status as far as communicating innovative adult online learning/pedagogy goes, and access to large trial populations.

Note that what I’m proposing is not some sort of magical, shiny learning analytics dashboard; it’d be a procedural, could-have-been-doing-it-for-years application of web analytics that makes use of online learning cohorts that are at least an order of magnitude or two larger than is typical in a traditional university course setting. Numbers that are maybe big enough to spot patterns of behaviour in (either positive, or avoidant).

There are ethical challenges and educational challenges in following such a course of action, of course. But in the same way that doctors might randomly prescribe between two equally good (as far as they know) treatments, or who systematically use one particular treatment over another that is equally good, I know that folk who create learning materials also pick particular pedagogical treatments “just because”. So why shouldn’t we start trialling on a platform that is branded as such?

Once again, note that I am not part of the FutureLearn project team and my knowledge of it is largely limited to what I have found on Google.

See also: Treating MOOC Platforms as Websites to be Optimised, Pure and Simple…. For some very old “course analytics” ideas about using Google Analytics, see Online Course Analytics, which resulted in OUseful blogarchive: “course analytics”. Note that these experiments never got as far as content optimisation, A/B testing, search log analysis etc. The approach I started to follow with the Library Analytics series had a little more success, but still never really got past the starting post and into a useful analyse/adapt cycle. Google Analytics has moved on since then of course… If I were to start over, I’d probably focus on creating custom dashboards to illustrate very particular use cases, as well as REDACTED.

Written by Tony Hirst

January 31, 2013 at 4:53 pm

Posted in Analytics, Infoskills, OU2.0

Open Webstats from GovUK

In the #solo12eval session* on Monday organised by Shane McCracken and Karen Bultitude on the topic of evaluating impact (whatever that is) of online science engagement, I was reminded (yet again…) of the Culture24 report on Evaluating Impact online. The report included as one of its deliverables a set of example Google Analytics report templates (now rotted?) that provided a starting point for what could be a commonly-accepted-as-sensible reporting framework. (I keep wondering whether it would be useful to try to do the same for academic library websites/library website analytics?) One of the things I pondered afterwards was whether it would make sense to open up Google Analytics from a ‘typical’ website in that sector to all-comers, so that different parties could demonstrate what stories and information they could pull out of the stats using a common data basis. Something a bit like CSS Zen Garden, but around a common Google Analytics dataset, for example?

* From the session, I also learned of the JISC Impact Analysis Programme, which includes an interestingly titled project on Tracking Digital Impact (TDI). That project is presumably in stealth mode, because it was really hard to find out anything about it… (I thought JISC projects were all encouraged to do the blogging thing? Or is that just from certain enlightened parts of JISC…?)

Loosely related to the workshop, and from my feeds, I noticed a couple of announcements over the last couple of days relating to the publication of web/traffic stats on a couple of government web properties.

First up, the Government Digital Service/@gdsteam posted on their Updat[ed] GOV.UK Performance Dashboard, which you can find: Performance Platform Dashboard.

As you can see, this dashboard reports on a variety of Google Analytics stats – average unique visitors, weekly pageviews, and so on.

As well as the dashboard itself, the @gds_datashark team seem to be quite happy to show their working and presumably allow others to propose check-ins of their own bug fixes and code solutions to .. Gov github

To make it easy to play along, they’re publishing a set of raw data feeds (Headline narrative text, Yesterday’s hourly traffic and comparison average, Weekly visits to GOV.UK, Direct Gov and Businesslink, Weekly unique visitors to GOV.UK, Direct Gov and Businesslink, Format success metrics) although the blog post notes these are ‘internal’ URLs and hence are subject to change…

(Via tweets from @jukesie and @lseteph, I was also reminded that Steph experimented with publishing BIS’ departmental webstats way back when)

In the past, UKGov has posted a certain amount of costings related data around website provision (for example, So Where Do the Numbers in Government Reports Come From?), so if there are any armchair web analysts/auditors out there (unlikely, I know;-), it seems as if data is there for the taking, as well as the asking (the GDS folk seem to be quite open to ideas…)

The second announcement that caught my eye was the opening up of site usage stats on the data.gov.uk website.

Data is broken down into site-wide, publisher and datasets groupings, and reports on things like:

– browser type
– O/S type
– social network referrals
– language
– country

The data is also available via a CSV file.

So I wonder: could we use the GDS and data.gov.uk data/data feeds as the basis for a crude webstats Zen Garden? How would such a site best be architected? (One central github repo pulling in exemplar view requests from cloned repos?) And would it make sense to publish webstats data/analytics from a “typical” science engagement website (or library website, or course website), and allow the community to see what sorts of take on it folk can come up with in respect of different ways presenting the data and more importantly, identifying different ways of making sense of it/finding different ways of telling stories with it?

Written by Tony Hirst

November 15, 2012 at 1:45 pm

Do Retweeters Lack Commitment to a Hashtag?

I seem to be going down more ratholes than usual at the moment, in this case relating to activity round Twitter hashtags. Here’s a quick bit of reflection around a chart from Visualising Activity Around a Twitter Hashtag or Search Term Using R that shows activity around a hashtag that was minted for an event that took place before the sample period.

The y-axis is organised according to the time of first use (within the sample period) of the tag by a particular user. The x axis is time. The dots represent tweets containing the hashtag, coloured blue by default, red if they are an old-style RT (i.e. they begin RT @username:).

So what sorts of thing might we look for in this chart, and what are the problems with it? Several things jump out at me:

  • For many of the users, their first tweet (in this sample period at least) is an RT; that is, they are brought into the hashtag community through issuing an RT;
  • Many of the users whose first use is via an RT don’t use the hashtag again within the sample period. Is this typical? Does this signal represent amplification of the tag without any real sense of engagement with it?
  • A noticeable proportion of folk whose first use is not an RT go on to post further non-RT tweets. Does this represent an ongoing commitment to the tag? Note that this chart does not show whether tweets are replies, or “open” tweets. Replies (that is, tweets beginning @username) are likely to represent conversational threads within a tag context rather than “general” tag usage, so it would be worth using an additional colour to identify reply based conversational tweets as such.
  • “New style” retweets are displayed as retweets by colouring… I need to check whether or not new-style RT information is available that I could use to colour such tweets appropriately. (or alternatively, I’d have to do some sort of string matching to see whether or not a tweet was the same as a previously seen tweet, which is a bit of a pain:-(
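To make the colouring idea a little more concrete, here is a minimal Python sketch that labels a tweet as an old-style RT, a reply, or an “open” tweet from its text alone. The function name and the sample tweets are invented for illustration, and new-style retweets are not handled, because they cannot be reliably spotted from the text.

```python
import re

# Old-style RTs begin "RT @username"; replies begin "@username"
OLD_STYLE_RT = re.compile(r"^RT @(\w+)")
REPLY = re.compile(r"^@(\w+)")

def classify_tweet(text):
    """Return 'RT', 'reply' or 'open' based on how the tweet text starts."""
    if OLD_STYLE_RT.match(text):
        return "RT"
    if REPLY.match(text):
        return "reply"
    return "open"

# Made-up example tweets
examples = [
    "RT @someone: slides from the opening session #ukgc12",
    "@someone thanks, will take a look #ukgc12",
    "Heading to the data session now #ukgc12",
]
for tweet in examples:
    print(classify_tweet(tweet), "|", tweet)
```

The three labels could then be mapped onto three colours in the chart rather than the current two.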

(Note that when I started mapping hashtag communities, I used to generate tag user names based on a filtered list of tweets that excluded RTs. This meant that folk who only used the tag as part of an RT and did not originate tweets that contained the tag, either in general or as part of a conversation, would not be counted as a member of the hashtag community. More recently, I have added filters that include RTs but exclude users who used the tag only once, for example, thus retaining serial RTers, but not single use users.)
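The two filtering approaches described in that note might be sketched in Python along the following lines. This is an illustration of the idea under my own assumptions (tweets supplied as (screen name, text) pairs, hypothetical function names), not the code actually used to build the community maps.

```python
from collections import Counter

def originators(tweets):
    """Earlier approach: users who posted at least one non-RT tagged tweet."""
    return {user for user, text in tweets if not text.startswith("RT @")}

def repeat_users(tweets, min_uses=2):
    """Later approach: keep RTs, but drop users seen fewer than min_uses times."""
    counts = Counter(user for user, _ in tweets)
    return {user for user, n in counts.items() if n >= min_uses}

# Made-up sample: (screen_name, tweet_text) pairs that all contain the tag
tweets = [
    ("ann", "Looking forward to #tag"),
    ("bob", "RT @ann: Looking forward to #tag"),
    ("bob", "RT @cat: Great talk #tag"),
    ("cat", "Great talk #tag"),
    ("dan", "RT @cat: Great talk #tag"),
]
print(sorted(originators(tweets)))   # users who originated tagged tweets
print(sorted(repeat_users(tweets)))  # serial users, including serial RTers
```

In the sample, the serial RTer is dropped by the first filter but kept by the second, while single use originators go the other way.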

So what else might this chart tell us? Looking at vertical slices, it seems that new entrants to the tag community appear to come in waves, maybe as part of rapid fire RT bursts. This chart doesn’t tell us for sure that this is happening, but it does highlight areas of the timeline that might be worth investigating more closely if we are interested in what happened at those times when there does appear to be a spike in activity. (Are there any modifications we could make to this chart to make it more informative in this respect? The time resolution is very poor, for example, so being able to zoom in on a particular time might be handy. Or are there other charts that might provide a different lens that can help us see what was happening at those times?)

And as a final point – this stuff may be all very interesting, but is it useful? And if so, how? I also wonder how generalisable it is to other sorts of communication analysis. For example, I think we could use similar graphical techniques to explore engagement with an active comment thread on a blog, or Google+, or additions to an online forum thread. (For forums with multiple threads, we maybe need to rethink how this sort of chart would work, or how it might be coloured/what symbols we might use, to distinguish between starting a new thread, or adding to a pre-existing one, for example. I’m sure the literature is filled with dozens of examples for how we might visualise forum activity, so if you know of any good references/links…?! ;-) #lazyacademic)

Written by Tony Hirst

February 9, 2012 at 6:30 pm

Visualising Activity Around a Twitter Hashtag or Search Term Using R

I think one of the valid criticisms around a lot of the visualisations I post here and on my various #f1datajunkie blogs is that I often don’t post any explanatory context around the visualisations. This is partly a result of the way I use my blog posts in a selfish way to document the evolution of my own practice, but not necessarily the “so what” elements that represent any meaning or sense I take from the visualisations. In many cases, this is because the understanding I come to of a dataset is typically the result of an (inter)active exploration of the data set; what I blog are the pieces of the puzzle that show how I personally set about developing a conversation with a dataset, pieces that you can try out if you want to…;-)

An approach that might get me more readers would be to post commentary around what I’ve learned about a dataset from having a conversation with it. A good example of this can be seen in @mediaczar’s post on How should Page Admins deal with Flame Wars?, where this visualisation of activity around a Facebook post is analysed in terms of effective (or not!) strategies for moderating a flame war.

@mediaczar visualisation of engagement around facebook flamewars

The chart shows a sequential ordering of posts in the order they were made along the x-axis, and the unique individual responsible for each post, ordered by accession to the debate along the y-axis. For interpretation and commentary, see the original post: How should Page Admins deal with Flame Wars? ;-)

One take away of the chart for me is that it provides a great snapshot of new people entering into a conversation (vertical lines) as well as engagement by an individual (horizontal lines). If we use a time proportional axis on x, we can also see engagement over time.

In a Twitter context, it’s likely that a rapid increase in numbers of folk engaging with a hashtag, for example, might be the result of an RT related burst of activity. For folk who have already engaged in hashtag usage, for example as part of a live event backchannel, a large number of near co-occurring tweets that are not RTs might signal some notable happenstance within the event.

To explore this idea, here’s a quick bit of R tooling inspired by Mat’s post… It uses the twitteR library and sources tweets via a Twitter search.

require(twitteR)
#Pull in a search around a hashtag.
searchTerm='#ukgc12'
rdmTweets <- searchTwitter(searchTerm, n=500)
# Note that the Twitter search API only goes back 1500 tweets

#Plot of tweet behaviour by user over time
#Based on @mediaczar's http://blog.magicbeanlab.com/networkanalysis/how-should-page-admins-deal-with-flame-wars/
#Make use of a handy dataframe creating twitteR helper function
tw.df=twListToDF(rdmTweets)
#@mediaczar's plot uses a list of users ordered by accession to user list
## 1) find earliest tweet in searchlist for each user [ http://stackoverflow.com/a/4189904/454773 ]
require(plyr)
tw.dfx=ddply(tw.df, .var = "screenName", .fun = function(x) {return(subset(x, created %in% min(created),select=c(screenName,created)))})
## 2) arrange the users in accession order
tw.dfxa=arrange(tw.dfx,-desc(created))
## 3) Use the username accession order to order the screenName factors in the searchlist
tw.df$screenName=factor(tw.df$screenName, levels = tw.dfxa$screenName)
#ggplot seems to be able to cope with time typed values...
require(ggplot2)
ggplot(tw.df)+geom_point(aes(x=created,y=screenName))

We can get a feeling for which occurrences were old-style RTs by identifying tweets that start with a classic RT, and then colouring each tweet appropriately (note there may be some overplotting/masking of points…I’m not sure how big the x-axis time bins are…)

#Identify and colour the RTs...
library(stringr)
#A helper function to remove @ symbols from user names...
trim <- function (x) sub('@','',x)
#Identify classic style RTs
tw.df$rt=sapply(tw.df$text,function(tweet) trim(str_match(tweet,"^RT (@[[:alnum:]_]*)")[2]))
tw.df$rtt=sapply(tw.df$rt,function(rt) if (is.na(rt)) 'T' else 'RT')
ggplot(tw.df)+geom_point(aes(x=created,y=screenName,col=rtt))

So now we can see when folk entered into the hashtag community via a classic RT.

We can also start to explore who was classically retweeted when:

#Generate a plot showing how a person is RTd
tw.df$rtof=sapply(tw.df$text,function(tweet) trim(str_match(tweet,"^RT (@[[:alnum:]_]*)")[2]))
#Note that this doesn't show how many RTs each person got in a given time period if they got more than one...
ggplot(subset(tw.df,subset=(!is.na(rtof))))+geom_point(aes(x=created,y=rtof))

Another view might show who was classically RTd by whom (activity along a row indicating someone was retweeted a lot through one or more tweets, activity within a column identifying an individual who RTs a lot…):

#We can start to get a feel for who RTs whom...
require(gdata)
#We don't want to display screenNames of folk who tweeted but didn't RT
tw.df.rt=drop.levels(subset(tw.df,subset=(!is.na(rtof))))
#Order the screennames of folk who did RT by accession order (ie order in which they RTd)
tw.df.rta=arrange(ddply(tw.df.rt, .var = "screenName", .fun = function(x) {return(subset(x, created %in% min(created),select=c(screenName,created)))}),-desc(created))
tw.df.rt$screenName=factor(tw.df.rt$screenName, levels = tw.df.rta$screenName)
# Plot who RTd whom
ggplot(subset(tw.df.rt,subset=(!is.na(rtof))))+geom_point(aes(x=screenName,y=rtof))+theme(axis.text.x=element_text(angle=-90,size=6)) + xlab(NULL)
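The point plots above show that someone was RTd, but not how many times. As a sketch of how the counts might be tallied (in Python rather than R, and with made-up tweets; the function name is my own), each (retweeter, retweeted) pair can be counted using the same “starts with RT @username” pattern as the R code:

```python
import re
from collections import Counter

# Same idea as the R regular expression: an old-style RT at the start of the text
RT_PATTERN = re.compile(r"^RT @([A-Za-z0-9_]+)")

def rt_pair_counts(tweets):
    """Count (retweeter, retweeted) pairs from (screen_name, text) tuples."""
    counts = Counter()
    for screen_name, text in tweets:
        match = RT_PATTERN.match(text)
        if match:
            counts[(screen_name, match.group(1))] += 1
    return counts

# Made-up sample data
tweets = [
    ("bob", "RT @ann: first point #ukgc12"),
    ("bob", "RT @ann: second point #ukgc12"),
    ("cat", "RT @ann: first point #ukgc12"),
    ("ann", "An original tweet #ukgc12"),
]
for (retweeter, retweeted), n in rt_pair_counts(tweets).most_common():
    print(retweeter, "->", retweeted, n)
```

The counts could be used to size or shade the points in the who-RTd-whom plot.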

What sense you might make of all this, or where to take it next, is down to you of course… Err, erm…?! ;-)

PS see also: http://blog.ouseful.info/2012/01/21/a-quick-view-over-a-mashe-google-spreadsheet-twitter-archive-of-ukgc2012-tweets/

Written by Tony Hirst

February 6, 2012 at 1:14 pm

Social Interest Positioning – Visualising Facebook Friends’ Likes With Data Grabbed Using Google Refine

What do my Facebook friends have in common in terms of the things they have Liked, or in terms of their music or movie preferences? (And does this say anything about me?!) Here’s a recipe for visualising that data…

After discovering via Martin Hawksey that the recent (December, 2011) 2.5 release of Google Refine allows you to import JSON and XML feeds to bootstrap a new project, I wondered whether it would be able to pull in data from the Facebook API if I was logged in to Facebook (Google Refine does run in the browser after all…)

Looking through the Facebook API documentation whilst logged in to Facebook, it’s easy enough to find exemplar links to things like your friends list (https://graph.facebook.com/me/friends?access_token=A_LONG_JUMBLE_OF_LETTERS) or the list of likes someone has made (https://graph.facebook.com/me/likes?access_token=A_LONG_JUMBLE_OF_LETTERS); replacing me with the Facebook ID of one of your friends should pull down a list of their friends, or likes, etc.

(Note that validity of the access token is time limited, so you can’t grab a copy of the access token and hope to use the same one day after day.)

Grabbing the link to your friends on Facebook is simply a case of opening a new project, choosing to get the data from a Web Address, and then pasting in the friends list URL:

Google Refine - import Facebook friends list

Click on next, and Google Refine will download the data, which you can then parse as a JSON file, and from which you can identify individual record types:

Google Refine - import Facebook friends

If you click the highlighted selection, you should see the data that will be used to create your project:

Google Refine - click to view the data

You can now click on Create Project to start working on the data – the first thing I do is tidy up the column names:

Google Refine - rename columns

We can now work some magic – such as pulling in the Likes our friends have made. To do this, we need to create the URL for each friend’s Likes using their Facebook ID, and then pull the data down. We can use Google Refine to harvest this data for us by creating a new column containing the data pulled in from a URL built around the value of each cell in another column:

Google Refine - new column from URL

The Likes URL has the form https://graph.facebook.com/me/likes?access_token=A_LONG_JUMBLE_OF_LETTERS which we’ll tinker with as follows:

Google Refine - crafting URLs for new column creation

The throttle control tells Refine how often to make each call. I set this to 500ms (that is, half a second), so it takes a few minutes to pull in my couple of hundred or so friends (I don’t use Facebook a lot;-). I’m not sure what limit the Facebook API is happy with (if you hit it too fast (i.e. set the throttle time too low), you may find the Facebook API stops returning data to you for a cooling down period…)?
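For anyone doing the same thing outside Refine, the URL crafting and throttling steps might look something like the Python sketch below. It only builds the URLs, following the pattern described above (the friend's ID in place of me), and leaves the actual fetching to whatever HTTP library you prefer. The friend IDs and helper names are hypothetical, and the access token is the same placeholder used above.

```python
import time

GRAPH_BASE = "https://graph.facebook.com"

def likes_url(friend_id, access_token):
    """Build the Likes URL for one friend by swapping their ID in for 'me'."""
    return f"{GRAPH_BASE}/{friend_id}/likes?access_token={access_token}"

def throttled(items, delay_seconds=0.5, sleep=time.sleep):
    """Yield items one at a time, pausing delay_seconds between them."""
    for index, item in enumerate(items):
        if index:
            sleep(delay_seconds)
        yield item

# Hypothetical friend IDs; the token is a placeholder, not a real one
friend_ids = ["1001", "1002"]
for friend_id in throttled(friend_ids, delay_seconds=0.5):
    print(likes_url(friend_id, "A_LONG_JUMBLE_OF_LETTERS"))
```

Passing the sleep function in as a parameter makes the throttle easy to test without actually waiting.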

Having imported the data, you should find a new column:

Google Refine - new data imported

At this point, it is possible to generate a new column from each of the records/Likes in the imported data… in theory (or maybe not..). I found this caused Refine to hang though, so instead I exported the data using the default Templating… export format, which produces some sort of JSON output…

I then used this Python script to generate a two column data file where each row contained a (new) unique identifier for each friend and the name of one of their likes:

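import csv
import json

# The Templating export from Google Refine: JSON with a top-level 'rows' list
fn = 'my-fb-friends-likes.txt'

with open(fn, 'r', encoding='utf-8') as f:
    data = json.load(f)

with open('fbliketest.csv', 'w', newline='', encoding='utf-8') as out:
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    # Give each friend a new sequential identifier, starting from 1
    for friend_id, d in enumerate(data['rows'], start=1):
        # 'interests' is the column name containing the Likes data
        interests = json.loads(d['interests'])
        for i in interests['data']:
            print(friend_id, i['name'], i['category'])
            # Drop any non-ASCII characters from the name before writing it out
            name = i['name'].encode('ascii', 'ignore').decode('ascii')
            writer.writerow([str(friend_id), name])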

[I think this R script, in answer to a related @mhawksey Stack Overflow question, also does the trick: R: Building a list from matching values in a data.frame]

I could then import this data into Gephi and use it to generate a network diagram of what they commonly liked:

Sketching common likes amongst my facebook friends

Rather than returning Likes, I could equally have pulled back lists of the movies, music or books they like, their own friends lists (permissions settings allowing), etc etc, and then generated friends’ interest maps on that basis.

[See also: Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part I and how to visualise Google+ networks]

PS dropping out of Google Refine and into a Python script is a bit clunky, I have to admit. What would be nice would be to be able to do something like a “create new rows with new column from column” pattern that would let you set up an iterator through the contents of each of the cells in the column you want to generate the new column from, and for each pass of the iterator: 1) duplicate the original data row to create a new row; 2) add a new column; 3) populate the cell with the contents of the current iteration state. Or something like that…
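As a sketch of that “create new rows with new column from column” pattern, here is how it might look in plain Python. The function name, column names and sample rows are all made up for illustration; it just follows the three steps listed above.

```python
import json

def explode_rows(rows, source_column, new_column, split):
    """For each row, emit one copy per item that split() finds in source_column.

    1) duplicate the original row, 2) add the new column,
    3) populate it with the current item.
    """
    for row in rows:
        for item in split(row[source_column]):
            new_row = dict(row)         # 1) duplicate the original data row
            new_row[new_column] = item  # 2) and 3) add and fill the new column
            yield new_row

# Made-up rows in the shape of the Refine export: a JSON string per friend
rows = [
    {"name": "Friend A", "interests": '{"data": [{"name": "Gephi"}, {"name": "R"}]}'},
    {"name": "Friend B", "interests": '{"data": [{"name": "Python"}]}'},
]

def like_names(cell):
    """Pull the name of each Like out of one cell's JSON string."""
    return [like["name"] for like in json.loads(cell)["data"]]

for new_row in explode_rows(rows, "interests", "like", like_names):
    print(new_row["name"], "|", new_row["like"])
```

Rows whose cell yields no items simply disappear from the output, which may or may not be what you want.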

PPS Related to the PS request, there is a sort of related feature in the 2.5 release of Google Refine that lets you merge data from across rows with a common key into a newly shaped data set: Key/value Columnize. Seeing this, it got me wondering what a fusion of Google Refine and RStudio might be like (or even just R support within Google Refine?)

PPPS this could be interesting – looks like you can test to see if a friendship exists given two Facebook user IDs.

PPPPS This paper in PNAS – Private traits and attributes are predictable from digital records of human behavior – by Kosinski et al. suggests it’s possible to profile people based on their Likes. It would be interesting to compare how robust that profiling is, compared to profiles based on the common Likes of a person’s followers, or the common likes of folk in the same Facebook groups as an individual?

Written by Tony Hirst

January 4, 2012 at 11:06 am

So Where Am I Socially Situated on Google+?

I haven’t really entered into the spirit of Google Plus yet – I haven’t created any circles or started populating them, for example, and I post rarely – but if you look at my public profile page you’ll see a list of folk who have added me to their circles…

This is always a risky thing of course – because my personal research ethic means that for anyone who pops their head above the horizon in my social space by linking publicly to one of my public profiles, their public data is fair game for an experiment… (I’m also aware that via authenticated access I may well be able to grab even more data – but again, my personal research ethic is such that I try to make sure I don’t use data that requires any form of authentication in order to acquire it.)

So, here’s a starter for 10: a quick social positioning map generated around who folk who have added me to public circles on Google+ publicly follow… Note that for folk who follow more than 90 people, I’m selecting a random sample of 90 of their friends to plot the graph. The graph is further filtered to only show folk who are followed by 5 or more of the folk who have added me to their circles (bear in mind that this may miss people out because of the 90 sample size hack).
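The sampling and filtering steps might be sketched as follows. This is a simplified Python illustration rather than the script linked in the PS below: the function name and data are hypothetical, and the friends lists are assumed to have been collected already.

```python
import random
from collections import Counter

def commonly_followed(friends_of, sample_size=90, min_followers=5, seed=None):
    """Count how many users follow each account, sampling large friends lists.

    friends_of maps each user (someone who has circled me) to the list of
    accounts they follow. Lists longer than sample_size are randomly sampled.
    Returns {account: count} for accounts with at least min_followers.
    """
    rng = random.Random(seed)
    counts = Counter()
    for user, friends in friends_of.items():
        unique_friends = sorted(set(friends))
        if len(unique_friends) > sample_size:
            unique_friends = rng.sample(unique_friends, sample_size)
        counts.update(unique_friends)
    return {name: n for name, n in counts.items() if n >= min_followers}

# Made-up data: seven users, five of whom follow both 'alice' and 'bob'
friends_of = {
    f"user{i}": (["alice", "bob"] if i < 5 else ["bob"]) for i in range(7)
}
print(commonly_followed(friends_of))
```

Because of the sampling, an account followed by five or more heavy followers can still fall below the threshold on any given run, which is the caveat mentioned above.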

Who folk who put me in a g+ circle follow

Through familiarity with many of the names, I spot what I’d loosely label as an OU grouping, a JISC grouping, an ed-techie grouping and a misc/other grouping…

Given one of the major rules of communication is ‘know your audience’, I keep wondering why so many folk who “do” the social media thing have apparently no interest in who they’re bleating at or what those folk might be interested in… I guess it’s a belief in “if I shout, folk will listen…”?

PS if you want to grab your own graph and see how you’re socially positioned on Google Plus, the code is here (that script is broken… I’ve started an alternative version here). It’s a Python script that requires the networkx library. (The d3 library is also included but not used – so feel free to delete that import…)

Written by Tony Hirst

October 16, 2011 at 5:51 pm

Posted in Analytics

Charting the Social Landscape – Who’s Notable Amongst Followers of UK HE Twitter Accounts?

Over the last week or two, I’ve been playing around with a few ideas relating to where Twitter accounts are located in the social landscape. There are several components to this: who does a particular Twitter account follow, and who follows it; do the friends, or followers cluster in any ways that we can easily and automatically identify (for example, by term analysis applied to the biographies of folk in an individual cluster); who’s notable amongst the friends or followers of an individual that aren’t also a friend or follower of the individual, and so on…

Just to place a stepping stone in my thinking so far, here’s a handful of examples, showing who’s notable amongst the followers of a couple of official HE Twitter accounts but who doesn’t follow the corresponding followed_by account.

Firstly, here’s a snapshot of who followers of @OU_Community follow in significant numbers:

Positioning @ou_community

Hmmm – seems the audience are into their satire… Should the OU be making some humorous videos to tap into that interest?

Here’s how a random sample (I think!) of 250 of @UCLnews’ followers seem to follow at the 4% or more level (that is, at least 0.04 * 250 = 10 of @UCLnews followers follow them…)

positioning of @uclnews co-followed accounts

Seems to be quite a clustering of other university accounts being followed in there, but also “notable” figures and some evidence of a passing interest in serious affairs/commentators? That other UCL accounts are also being followed might suggest evidence that the @UCLnews account is being followed by current students?

How about the followers of @boltonuni? (Again, using a sample of 250 followers, though from a much smaller total follower population when compared to @UCLnews):

@boltonuni cofollowed

The dominance of other university accounts is noticeable here. A couple of possible reasons for this are that the sampled accounts skew towards other “professional” accounts from within the sector (or that otherwise follow it), or that the student and potential students have a less coherent (in the nicest possible sense of the word!) world view… Or that maybe there are lots of potential students out there following several university twitter accounts trying to get a feel for what the universities are offering.

If we actually look at friend connections between the @boltonuni 250 follower sample, 40% or so are not connected to other followers (either because they are private accounts or because they don’t follow any of the other followers – as we might expect from potential students, for example?)
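To show how figures like “40% not connected” or “75% in the giant component” can be worked out, here is a small self-contained Python sketch that treats friend links as undirected and finds connected components with a breadth-first search. The names and toy data are invented; in practice a library such as networkx would do the same job.

```python
from collections import deque

def components(nodes, edges):
    """Return connected components (as sets), treating links as undirected."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        if a in neighbours and b in neighbours:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for other in neighbours[node] - seen:
                seen.add(other)
                queue.append(other)
        result.append(component)
    return result

def connectivity_summary(nodes, edges):
    """Return (fraction of isolated nodes, fraction in the largest component)."""
    comps = components(nodes, edges)
    isolated = sum(1 for c in comps if len(c) == 1)
    giant = max(len(c) for c in comps)
    return isolated / len(nodes), giant / len(nodes)

# Toy follower sample: a, b and c are linked; d and e follow nobody in the sample
sample = ["a", "b", "c", "d", "e"]
links = [("a", "b"), ("b", "c")]
isolated_fraction, giant_fraction = connectivity_summary(sample, links)
print(f"isolated: {isolated_fraction:.0%}  giant component: {giant_fraction:.0%}")
```

Note that private accounts end up as isolated nodes here simply because their friend links can't be seen, just as in the charts.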

The connected followers split into two camps:

Tunnelling in on boltonuni follower sample

A gut reaction reading of these communities is that they represent sector and locale camps.

Finally, let’s take a look at 250 random followers of @buckssu (Buckinghamshire University student union); this time we get about 75% of followers in the giant connected component:

@buckssu follower sample

Again, we get a locale and ‘sector’ cluster. If we look at folk followed by 4% or more of the follower sample, we get this:

Folk followed by a sample of followers of buckssu

My reading of this is that the student union accounts are pretty tightly connected (I’m guessing we’d find some quite sizeable SU account cliques), there’s a cluster of “other student society” type accounts top left, and then a bunch of celebs…

So what does this tell us? Who knows…?! I’m working on that…;-)

Written by Tony Hirst

October 3, 2011 at 2:23 pm

Posted in Analytics, OU2.0
