OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for the ‘Analytics’ Category

Charting the Social Landscape – Who’s Notable Amongst Followers of UK HE Twitter Accounts?

Over the last week or two, I’ve been playing around with a few ideas relating to where Twitter accounts are located in the social landscape. There are several components to this: who does a particular Twitter account follow, and who follows it; do the friends, or followers cluster in any ways that we can easily and automatically identify (for example, by term analysis applied to the biographies of folk in an individual cluster); who’s notable amongst the friends or followers of an individual that aren’t also a friend or follower of the individual, and so on…

Just to place a stepping stone in my thinking so far, here’s a handful of examples, showing who’s notable amongst the followers of a couple of official HE Twitter accounts but who doesn’t follow the corresponding followed_by account.

Firstly, here’s a snapshot of who followers of @OU_Community follow in significant numbers:

Positioning @ou_community

Hmmm – seems the audience are into their satire… Should the OU be making some humorous videos to tap into that interest?

Here’s who a random sample (I think!) of 250 of @UCLnews’ followers seem to follow at the 4% or more level (that is, at least 0.04 * 250 = 10 of the sampled @UCLnews followers follow them…)

positioning of @uclnews co-followed accounts
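For anyone wanting to try the same sort of thresholding, here's a minimal sketch of the counting step. It is not the code used to generate the charts: the function name `notable_cofollowed` and the `friends_of` dictionary (sampled follower mapped to the set of accounts they follow) are assumptions made for illustration, and fetching that data from Twitter is left out entirely.

```python
from collections import Counter

def notable_cofollowed(friends_of, target, percent=4):
    """Return accounts followed by at least `percent`% of the sampled followers.

    friends_of: hypothetical dict mapping each sampled follower to the
                collection of accounts that follower follows.
    target:     the account whose followers were sampled (excluded from results).
    """
    sample_size = len(friends_of)
    counts = Counter()
    for friends in friends_of.values():
        # Count each followed account once per sampled follower.
        counts.update(set(friends) - {target})
    # Integer comparison avoids floating point surprises:
    # with percent=4 and a sample of 250, an account needs a count of 10 or more.
    return {acct: n for acct, n in counts.items()
            if n * 100 >= percent * sample_size}
```

With a sample of 250 followers and `percent=4`, an account is kept only if 10 or more of the sampled followers follow it, which matches the arithmetic above.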

Seems to be quite a clustering of other university accounts being followed in there, but also “notable” figures and some evidence of a passing interest in serious affairs/commentators? That other UCL accounts are also being followed might suggest evidence that the @UCLnews account is being followed by current students?

How about the followers of @boltonuni? (Again, using a sample of 250 followers, though from a much smaller total follower population when compared to @UCLnews):

@boltonuni cofollowed

The dominance of other university accounts is noticeable here. A couple of possible reasons for this are that the sampled accounts skew towards other “professional” accounts from within the sector (or that otherwise follow it), or that the students and potential students have a less coherent (in the nicest possible sense of the word!) world view… Or that maybe there are lots of potential students out there following several university Twitter accounts trying to get a feel for what the universities are offering.

If we actually look at friend connections between the @boltonuni 250 follower sample, 40% or so are not connected to other followers (either because they are private accounts or because they don’t follow any of the other followers – as we might expect from potential students, for example?)

The connected followers split into two camps:

Tunnelling in on boltonuni follower sample

A gut reaction reading of these communities is that they represent sector and locale camps.

Finally, let’s take a look at 250 random followers of @buckssu (Buckinghamshire University student union); this time we get about 75% of followers in the giant connected component:

@buckssu follower sample
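The "share of followers in the giant connected component" figure can be estimated along the following lines. This is a sketch under assumptions rather than the method actually used (the charts themselves came out of a graph tool): `nodes` is the follower sample, `edges` is a hypothetical list of (follower, friend) pairs between sampled accounts, and follow direction is ignored when working out connectivity.

```python
from collections import deque

def giant_component_fraction(nodes, edges):
    """Fraction of nodes that sit in the largest (weakly) connected component."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        # Ignore edges that leave the sample, and self-loops.
        if a in nodes and b in nodes and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen = set()
    largest = 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search to size the component containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            current = queue.popleft()
            size += 1
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        largest = max(largest, size)
    return largest / len(nodes)
```

Private accounts, and accounts that don't follow anyone else in the sample, show up as isolated nodes and so pull the fraction down.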

Again, we get a locale and ‘sector’ cluster. If we look at folk followed by 4% or more of the follower sample, we get this:

Folk followed by a sample of followers of buckssu

My reading of this is that the student union accounts are pretty tightly connected (I’m guessing we’d find some quite sizeable SU account cliques), there’s a cluster of “other student society” type accounts top left, and then a bunch of celebs…

So what does this tell us? Who knows…?! I’m working on that…;-)

Written by Tony Hirst

October 3, 2011 at 2:23 pm

Posted in Analytics, OU2.0

Early Peek at ALTC2011 Twitter Community…

A quick peek at the quick-off-the-mark users of the altc2011 hashtag on Twitter…

Social connections between folk using the hashtag:

altc2011 tweeps - colour follower count, node size betweenness centrality

(Image generated using gephi; node size: betweenness centrality, colour – follower count)

By looking at the Twitter profile of hashtag users, finding a user’s blog (or other affiliation) URL, and running RSS feed autodiscovery over the URLs, we can generate an OPML blogroll (after a fashion) from the list of hashtagging twitter users: altc2011 hashtaggers – discovered feeds OPML blogroll
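The feed autodiscovery step relies on pages advertising their feeds via `<link rel="alternate">` elements. Here's a minimal sketch of that step using only the Python standard library; the class and function names are made up for illustration, the page HTML is assumed to have been fetched already, and building the OPML file from the results is not shown.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate" ...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))

def discover_feeds(html, base_url):
    """Return the feed URLs declared in an already-downloaded HTML page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

Running `discover_feeds` over the page behind each profile URL gives the candidate feeds that an OPML blogroll could then be built from.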

List intelligence: I looked at the lists that hashtag users are on and ranked lists by number of subscribers as well as number of hashtag users appearing on the lists.

Lists containing N numbers of people using the altc2011 hashtag:

/ousefulAPI/altc2010 52
/helenwhd/e-learning 43
/kamyousaf/e-learning-uk 27
/suebecks/tech-enhanced-learning 27
/JonPowles/education 26
/sarahhorrigan/tel-people 25
/mhawksey/purposed 24
/traceymadden/education 22
/juliadesigns/education-uk-18 21
/ZoeEBreen/elearning-evangelists-uk 20
/mhawksey/lak11 20
/artfraud/education-2 20

Lists ordered by subscriber count (first number is number of people on list who’ve been an early user of altc2011 hashtag):

/kamyousaf/e-learning-uk 27 107
/kamyousaf/uk-ict-education 14 80
/mhawksey/purposed 24 42
/mhawksey/lak11 20 34
/helenwhd/e-learning 43 31
/suebecks/tech-enhanced-learning 27 27
/catherinecronin/education-elearning 17 26
/amcunningham/learning 17 26
/juliadesigns/education-uk-18 21 25
/JonPowles/education 26 19
/PatParslow/elearning-crew 15 18
/mhawksey/jiscel10 19 14
/ousefulAPI/altc2010 52 12
/ZoeEBreen/elearning-evangelists-uk 20 9
/ulcc/mootuk11-taggers 18 9
/HeyWayne/learning-tech-people 15 9

If we look at membership of lists containing altc2011 members, and then see who appears on those lists, we get an idea (maybe) of notable people in the community (number is number of lists each person appeared on):

'gconole', 17
'josiefraser', 15
'timbuckteeth', 15
'HallyMk1', 14
'mweller', 14
'jamesclay', 14
'mattlingard', 13
'francesbell', 13
'daveowhite', 12
'mhawksey', 12
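The "number of lists each person appeared on" count is easy to reproduce once the list memberships are to hand. A minimal sketch, assuming a hypothetical `list_members` dictionary that maps each list path to the usernames on it (grabbing the memberships from Twitter is not shown):

```python
from collections import Counter

def rank_by_list_appearances(list_members, top_n=10):
    """Rank users by how many of the given lists they appear on."""
    counts = Counter()
    for members in list_members.values():
        # A user counts once per list, even if a list somehow repeats them.
        counts.update(set(members))
    return counts.most_common(top_n)
```

The result is a list of `(username, list_count)` pairs, highest count first, in the same shape as the ranking above.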

Written by Tony Hirst

September 6, 2011 at 9:55 am

Posted in Analytics

Getting Library Catalogue Searches Out There…

As a long time fan of custom search engine offerings, I keep wondering why Google doesn’t seem to have much active interest in this area? Google Custom Search updates are few and far between, and typically go unreported by the tech blogs. Perhaps more surprisingly, Custom Search Engines don’t appear to have much, if any, recognition in the Google Apps for Education suite, although I think they are available with a Google Apps for education ID?

One of the things I’ve been mulling over for years is the role that automatically created course related search engines might have to play as part of a course’s VLE offering. The search engine would offer search results either over a set of web domains linked to from the actual course materials, or simply boost results from those domains in the context of a “normal” set of search results. I’ve recently started thinking that we could also make use of “promoted” results to highlight specific required or recommended readings when a particular topic is searched for (for example, Integrating Course Related Search and Bookmarking?).

During an informal “technical” meeting around three JISC funded resource discovery projects at Cambridge yesterday (Comet, Jerome, SALDA; disclaimer: I didn’t work on any of them, but I was in the area over the weekend…), there were a few brief mentions of how various university libraries were opening up their catalogues to the search engine crawlers. So for example, if you do a site: limited search on the following paths:

- sabre.sussex.ac.uk/vufindsmu/Record/
- jerome.library.lincoln.ac.uk/catalogue/
- webcat.hud.ac.uk/catlink/bib/
- search.lib.cam.ac.uk/

you can get (partial?) search results, with a greater or lesser degree of success, from the Sussex, Lincoln, Huddersfield and Cambridge catalogues respectively.
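As a quick illustration of what a site: limited search looks like as a URL, here's a small sketch that builds Google search links for the four paths above. The helper name `site_search_url` is made up for this example, and it only constructs URLs; it doesn't fetch or parse any results.

```python
from urllib.parse import urlencode

# Catalogue paths as listed above.
CATALOGUE_PATHS = [
    "sabre.sussex.ac.uk/vufindsmu/Record/",
    "jerome.library.lincoln.ac.uk/catalogue/",
    "webcat.hud.ac.uk/catlink/bib/",
    "search.lib.cam.ac.uk/",
]

def site_search_url(path, terms):
    """Build a Google search URL limited to `path` via the site: operator."""
    query = f"site:{path} {terms}"
    return "https://www.google.com/search?" + urlencode({"q": query})

if __name__ == "__main__":
    for catalogue_path in CATALOGUE_PATHS:
        print(site_search_url(catalogue_path, "open data"))
```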

In a Google custom search engine context, we can tunnel in a little deeper in an attempt to return results limited to actual records:

- sabre.sussex.ac.uk/vufindsmu/Record/*/Description
- jerome.library.lincoln.ac.uk/catalogue/*
- webcat.hud.ac.uk/catlink/bib/*
- search.lib.cam.ac.uk/?itemid=*

I’ve added these to a new Catalogues tab on my UK HE library website CSE (about), so we can start to search over these catalogues using Google.

I’m not sure how useful or interesting this is at the moment, except to the library systems developers maybe, who can compare how informatively their library catalogue content is indexed and displayed in Google search results compared to other libraries… (so for example, I noticed that Google appears to be indexing the “related items” that Huddersfield publishes on a record page, meaning that if a search term appears in a related work, you might get a record that at first glance appears to have little to do with your search term, in effect providing a “reverse related work” search (that is, search on related works and return items that have the search term as the related work)).

Searching UK HE library catalogues via a Google CSE

But it’s a start… and with the addition of customised rankings, might provide a jumping off point for experimenting with novel ways of searching across UK HE catalogues using Google indexed content. (For example, a version of the CSE on the cam.ac.uk domain might boost the Cambridge results; within an institution, works related to a particular course through mention on a reading list might get a boost if a student on that course runs a search… and so on…)

PS A couple of other things that may be worth pondering… could Google Apps for Education account holders be signed up to Subscribed Links offering customised search results in the main Google domain relating to a particular course? (That is, define subscribed link profiles for each course, and automatically add those subscriptions to an Apps for Edu user’s account based on the courses they’re taking?) Or I wonder if it would be possible to associate subscribed links to public access browsers in some way?

And how about finding some way of working with Google to open up “professional” search profiles, where for example students are provided with “read only” versions of the personalised search results of an expert in a particular area who has tuned, through personalisation, a search profile that is highly specialised in a particular subject area, e.g. as mentioned in Google Personal Custom Search Engines? (see also Could Librarians Be Influential Friends? And Who Owns Your Search Persona?).

If anyone out there is working on ways of using Google customised and personalised search as a way of delivering “improved” search results in an educational context, I’d love to hear more about what you’re getting up to…

Written by Tony Hirst

August 9, 2011 at 8:55 am

Posted in Analytics, OU2.0, Search, SEO

Surveying the Territory: Open Source, Open-Ed and Open Data Folk on Twitter

Over the last few weeks, I’ve been tinkering with various ways of using the Twitter API to discover Twitter lists relating to a particular topic area, whether discovered through a particular hashtag, search term, a list that already exists on a topic, or one or more people who may be associated with a particular topic area.

On my to do list is a map of the “open” community on Twitter – and the relationships between them – that will try to identify notable folk in different areas of openness (open government, open data, open licensing, open source software) and the communities around them, then aggregate all these open aficionados, plot the network connections between them, and remap the result (to see whether the distinct communities we started with fall out, as well as to discover who acts as the bridges between them, or alternatively discover whether new emergent groupings appear to crystallise out based on network connectivity).

As a step on the road to that, I had a quick peek around to find who was tweeting using the #oscon hashtag over the weekend. Through analysing people who were tweeting regularly around the topic, I identified several lists in the area: @realist/opensource, @twongkee/opensource, @lemasney/opensource, @suncao/open-linked-free, @jasebo/open-source

Pulling down the members of these lists, and then looking for connections between them, I came up with this map of the open source community on Twitter:

A peek at FOSS community on Twitter

Using a different technique not based on lists, I generated a map of the open data community based on the interconnections between people followed by @openlylocal:

How the people @countculture follows follow each other

and the open education community based on the people that follow @opencontent:

How followers of @Opencontent follow each other

(So that’s a different way of identifying the members of each community, right? One based on lists that mention users of a particular hashtag, one based on folk a particular individual follows, and one based on the folk that follow a particular individual.)

I’ve also toyed with looking at communities defined by members of lists that mention a particular individual, or people followed by a particular individual, as well as ones based on members of lists that contain folk listed on one or more trusted, curated lists in a particular topic area (got that?!;-).

Whilst the graphs based on mapping friends or followers of an individual give a good overview of that individual’s sphere of interest or influence, I think the community graphs derived from finding connections between people mentioned on “lists in the area” is a bit more robust in terms of mapping out communities in general, though I guess I’d need to do “proper” research to demonstrate that?

As mentioned at the start, the next thing on my list is a map across the aggregated “open” communities on Twitter. Of course, being digerati, many of these people will have decamped to GooglePlus. So maybe I shouldn’t bother, but instead wait for Google+ to mature a bit, an API to become available, blah, blah, blah…

Written by Tony Hirst

July 25, 2011 at 2:32 pm

A Couple of Notes on “List Intelligence”

Just so I don’t forget the development timeline such as it is, here are a few quick notes-to-self as much as anything about my “List Intelligence” tinkering to date:

  • List Intelligence uses (currently) Twitter lists to associate individuals with a particular topic area (the focus of the list; note that this may be ill-specified, e.g. “people I have met”, or topic focussed “OU employees”, etc)
  • List Intelligence is presented with a set of “candidate members” and then:
    1. looks up the lists those candidate members are on to provide a set of “candidate lists”;
    2. identifies the membership of those candidate lists (“candidate list members”) (this set may be subject to ranking or filtering, for example based on the number of list subscribers, or the number of original candidate members who are members of the current list);
    3. for the superset of members across lists (i.e. the set of candidate list members), rank each individual according to the number of lists they are on (this may be optionally weighted by the number of subscribers to each list they are on); these individuals are potentially “key” players in the subject area defined by the lists that the original candidate members are members of;
    4. identify which of the candidate lists contains most candidate members, and rank accordingly (possibly also according to subscriber numbers); the top ranked lists are lists trivially associated with the set of original candidate members;
    5. provide output files that allow the graphing of individuals who are co-members of the same sets, and use the corresponding network as the basis for network analysis;
    6. optionally generate graphs based on friendship connections between candidate list members, and use the resulting graph as the basis for network analysis. (Any clusters/communities detected based on friendship may then be compared with the co-membership graphs to see the extent to which list memberships reflect or correlate to community structures);
  • the original set of candidate members may be defined in a variety of ways. For example:
    1. one or more named individuals;
    2. the friends of a named individual;
    3. the recent users of a particular hashtag;
    4. the recent users of a particular searched for term;
    5. the members of a “seed” list.
  • List Intelligence attempts to identify “list clusters” in the candidate lists set by detecting significant overlaps in membership between different candidate lists.
  • Candidate lists may be used to identify potential “focus of interest” areas associated with the original set of candidate members.

I’ll try to post some pseudo-code, flow charts and formal algorithms to describe the above… but it may take a week or two…
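Pending that, here's a rough illustrative sketch of steps 1 to 4 of the recipe. It is a reading of the notes above rather than the actual List Intelligence code: the function name, its parameters and the plain dictionaries standing in for Twitter API lookups (`lists_for_user`, `members_of_list`, `subscribers`) are all assumptions made for the example.

```python
from collections import Counter

def list_intelligence(candidates, lists_for_user, members_of_list,
                      subscribers=None, min_candidates=1):
    """Rank people and lists associated with a set of candidate members.

    lists_for_user:  hypothetical dict, user -> lists that user appears on
    members_of_list: hypothetical dict, list -> users on that list
    subscribers:     optional dict, list -> subscriber count (used as a weight)
    """
    candidates = set(candidates)

    # Step 1: the lists the candidate members are on ("candidate lists").
    candidate_lists = set()
    for user in candidates:
        candidate_lists.update(lists_for_user.get(user, ()))

    # Step 4 (needed early for filtering): candidate members per list.
    coverage = {lst: len(candidates & set(members_of_list.get(lst, ())))
                for lst in candidate_lists}
    kept = [lst for lst in candidate_lists if coverage[lst] >= min_candidates]

    # Steps 2 and 3: pool the list members and count appearances,
    # optionally weighting each appearance by the list's subscriber count.
    appearances = Counter()
    for lst in kept:
        weight = 1 if subscribers is None else subscribers.get(lst, 0)
        for member in set(members_of_list.get(lst, ())):
            appearances[member] += weight

    ranked_people = sorted(appearances.items(), key=lambda kv: (-kv[1], kv[0]))
    ranked_lists = sorted(((lst, coverage[lst]) for lst in kept),
                          key=lambda kv: (-kv[1], kv[0]))
    return ranked_people, ranked_lists
```

Ties are broken alphabetically just to keep the output stable; steps 5 and 6 (writing out graph files for network analysis) are not covered here.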

Written by Tony Hirst

June 24, 2011 at 5:35 pm

Follower Networks and “List Intelligence” List Contexts for @JiscCetis

I’ve been tinkering with some of my “List Intelligence” code again, and thought it worth capturing some examples of the sort of network exploration recipes I’m messing around with at the moment.

Let’s take @jiscCetis as an example; this account follows no-one, is followed by a few, hasn’t much of a tweet history and is listed by a handful of others.

Here’s the follower network, based on how the followers of @jisccetis follow each other:

Friend connections between @Jisccetis followers

There are three (maybe four) clusters there, plus all the folk who don’t follow any of the @jisccetis’ followers…: do these follower clusters make any sort of sense I wonder? (How would we label them…?)

The next thing I thought to do was look at the people who were on the same lists as @jisccetis, and get an overview of the territory that @jisccetis inhabits by virtue of shared list membership.

Here’s a quick view over the folk on lists that @jisccetis is a member of. The nodes are users named on the lists that @jisccetis is named on, the edges are undirected and join individuals who are on the same list.

Distribution of users named on lists that jisccetis is a member of

Plotting “co-membership” edges is hugely expensive in terms of upping the edge count that has to be rendered, but we can use a directed bipartite graph to render the same information (and arguably even more information); here, there are two sorts of nodes: lists, and the members of lists. Edges go from members to listnames (I should swap this direction really to make more sense of authority/hub metrics…?)

jisccetis co-list membership
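To see why the co-membership view gets expensive, compare the two edge sets directly. The sketch below is illustrative only: the function names are made up, and `list_members` is a hypothetical dictionary mapping each list name to the users on it.

```python
from itertools import combinations

def comembership_edges(list_members):
    """Undirected user-user edges joining every pair of users that share a list."""
    edges = set()
    for members in list_members.values():
        # Sorting gives each pair a canonical order, so (a, b) is stored once.
        for a, b in combinations(sorted(set(members)), 2):
            edges.add((a, b))
    return edges

def bipartite_edges(list_members):
    """Directed member -> list edges, one per membership."""
    return {(member, name)
            for name, members in list_members.items()
            for member in set(members)}
```

A single list with 100 members generates 100 * 99 / 2 = 4950 co-membership edges, but only 100 member-to-list edges in the bipartite version.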

Another thing I thought I’d explore is the structure of the co-list membership community. That is, for all the people on the lists that @jisccetis is a member of, how do those users follow each other?

How folk on same lists as @jisccetis follow each other

It may be interesting to explore in a formal way the extent to which the community groups that appear to arise from the friending relationships are reflected (or not) by the make up of the lists?

It would probably also be worth trying to label the follower group – are there “meaningful” (to @jisccetis? to the @jisccetis community?) clusters in there? How would you label the different colour groupings? (Let me know in the comments…;-)

Written by Tony Hirst

June 18, 2011 at 7:55 pm

Identifying the Twitterati Using List Analysis

Given absolutely no-one picked up on List Intelligence – Finding Reliable, Trustworthy and Comprehensive Topic/Sector Based Twitter Lists, here’s an example of what the technique might be good for…

Seeing the tag #edusum11 in my feed today, and not being minded to follow it, I used the list intelligence hack to see:

- which lists might be related to the topic area covered by the tag, based on looking at which Twitter lists folk recently using the tag appear on;
- which folk on twitter might be influential in the area, based on their presence on lists identified as maybe relevant to the topic associated with the tag…

Here’s what I found…

Some lists that maybe relate to the topic area (username/list, number of folk who used the hashtag appearing on the list, number of list subscribers), sorted by number of people using the tag present on the list:

/joedale/ukedtech 6 6
/TWMarkChambers/edict 6 32
/stevebob79/education-and-ict 5 28
/mhawksey/purposed 5 38
/fosteronomo/chalkstars-combined 5 12
/kamyousaf/uk-ict-education 5 77
/ssat_lia/lia 5 5
/tlists/edtech-995 4 42
/ICTDani/teched 4 33
/NickSpeller/buzzingeducators 4 2
/SchoolDuggery/uk-ed-admin-consultancy 4 65
/briankotts/educatorsuk 4 38
/JordanSkole/jutechtlets 4 10
/nyzzi_ann/teacher-type-people 4 9
/Alexandragibson/education 4 3
/danielrolo/teachers 4 20
/cstatucki/educators 4 13
/helenwhd/e-learning 4 29
/TechSmithEDU/courosalets 4 2
/JordanSkole/chalkstars-14 4 25
/deerwood/edtech 4 144

Some lists that maybe relate to the topic area (username/list, number of folk who used the hashtag appearing on the list, number of list subscribers), sorted by number of people subscribing to the list (a possible ranking factor for the list):

/deerwood/edtech 4 144
/kamyousaf/uk-ict-education 5 77
/SchoolDuggery/uk-ed-admin-consultancy 4 65
/tlists/edtech-995 4 42
/mhawksey/purposed 5 38
/briankotts/educatorsuk 4 38
/ICTDani/teched 4 33
/TWMarkChambers/edict 6 32
/helenwhd/e-learning 4 29
/stevebob79/education-and-ict 5 28
/JordanSkole/chalkstars-14 4 25
/danielrolo/teachers 4 20
/cstatucki/educators 4 13
/fosteronomo/chalkstars-combined 5 12
/JordanSkole/jutechtlets 4 10
/nyzzi_ann/teacher-type-people 4 9
/joedale/ukedtech 6 6
/ssat_lia/lia 5 5
/Alexandragibson/education 4 3
/NickSpeller/buzzingeducators 4 2
/TechSmithEDU/courosalets 4 2
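The two orderings above are just the same rows sorted on different columns. A minimal sketch using a handful of the rows from the tables (the tuple layout and function names are my own choice for the example):

```python
# (list path, hashtag users on the list, list subscribers); rows from the tables above.
ROWS = [
    ("/joedale/ukedtech", 6, 6),
    ("/TWMarkChambers/edict", 6, 32),
    ("/kamyousaf/uk-ict-education", 5, 77),
    ("/deerwood/edtech", 4, 144),
]

def by_tag_users(rows):
    """Most hashtag users first; ties keep their original order."""
    return sorted(rows, key=lambda row: row[1], reverse=True)

def by_subscribers(rows):
    """Most subscribers first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Sorting by hashtag users puts /joedale/ukedtech at the top, whereas sorting by subscribers puts /deerwood/edtech first, which is why the choice of ranking factor matters.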

Other ranking factors might include the follower count, or factors from some sort of social network analysis, of the list maintainer.

Having got a set of lists, we can then look for people who appear on lots of those lists to see who might be influential in the area. Here’s the top 10 (user, number of lists they appear on, friend count, follower count, number of tweets, time of arrival on twitter):

['terryfreedman', 9, 4570, 4831, 6946, datetime.datetime(2007, 6, 21, 16, 41, 17)]
['theokk', 9, 1564, 1693, 12029, datetime.datetime(2007, 3, 16, 14, 36, 2)]
['dawnhallybone', 8, 1482, 1807, 18997, datetime.datetime(2008, 5, 19, 14, 40, 50)]
['josiefraser', 8, 1111, 7624, 17971, datetime.datetime(2007, 2, 2, 8, 58, 46)]
['tonyparkin', 8, 509, 1715, 13274, datetime.datetime(2007, 7, 18, 16, 22, 53)]
['dughall', 8, 2022, 2794, 16961, datetime.datetime(2009, 1, 7, 9, 5, 50)]
['jamesclay', 8, 453, 2552, 22243, datetime.datetime(2007, 3, 26, 8, 20)]
['timbuckteeth', 8, 1125, 7198, 26150, datetime.datetime(2007, 12, 22, 17, 17, 35)]
['tombarrett', 8, 10949, 13665, 19135, datetime.datetime(2007, 11, 3, 11, 45, 50)]
['daibarnes', 8, 1592, 2592, 7673, datetime.datetime(2008, 3, 13, 23, 20, 1)]

The algorithms I’m using have a handful of tuneable parameters, which means there’s all sorts of scope for running with this idea in a “research” context…

One possible issue that occurred to me was that identified lists might actually cover different topic areas – this is something I need to ponder…
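One possible way of probing that (an assumption on my part, not something the current hack does) would be to measure how much the memberships of the identified lists overlap, for example using Jaccard similarity: lists that share few or no members are probably covering different territory. The names below, and the 0.5 threshold, are purely illustrative.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlapping_list_pairs(list_members, min_overlap=0.5):
    """Return (list1, list2, score) for pairs of lists with substantial shared membership."""
    pairs = []
    for name1, name2 in combinations(sorted(list_members), 2):
        score = jaccard(list_members[name1], list_members[name2])
        if score >= min_overlap:
            pairs.append((name1, name2, score))
    # Highest overlap first.
    return sorted(pairs, key=lambda pair: -pair[2])
```

Groups of lists that overlap heavily with each other, but not with the rest, would then be candidates for separate topic areas.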

Written by Tony Hirst

June 9, 2011 at 6:55 pm
