Archive for the ‘Analytics’ Category
Over the last week or two, I’ve been playing around with a few ideas relating to where Twitter accounts are located in the social landscape. There are several components to this: who does a particular Twitter account follow, and who follows it; do the friends or followers cluster in any ways that we can easily and automatically identify (for example, by term analysis applied to the biographies of folk in an individual cluster); who’s notable amongst the friends or followers of an individual without also being a friend or follower of that individual, and so on…
Just to place a stepping stone in my thinking so far, here’s a handful of examples, showing who’s notable amongst the followers of a couple of official HE Twitter accounts but who doesn’t follow the corresponding followed_by account.
Firstly, here’s a snapshot of who followers of @OU_Community follow in significant numbers:
Hmmm – seems the audience are into their satire… Should the OU be making some humorous videos to tap into that interest?
Here’s who a random sample (I think!) of 250 of @UCLnews’ followers seem to follow at the 4% or more level (that is, at least 0.04 * 250 = 10 of the sampled @UCLnews followers follow them…):
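To make the 4% threshold concrete, here is a minimal sketch of the counting step. It assumes we already have, for each sampled follower, the set of accounts they follow; the `friends_of` mapping and the function name are hypothetical stand-ins for whatever the Twitter API lookups return, not the code used to produce the charts.

```python
from collections import Counter

def notable_friends(friends_of, threshold=0.04):
    """Return accounts followed by at least `threshold` of the sampled followers.

    friends_of: dict mapping each sampled follower to the set of accounts
    they follow (hypothetical input, e.g. gathered via the Twitter API).
    """
    sample_size = len(friends_of)
    counts = Counter()
    for friends in friends_of.values():
        # set() guards against an account being listed twice for one follower
        counts.update(set(friends))
    # With 250 sampled followers and threshold 0.04, this keeps accounts
    # followed by 10 or more of the sample.
    return {acct: n for acct, n in counts.items() if n >= threshold * sample_size}
```

The returned counts can then be used to size or filter nodes when the result is graphed.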
Seems to be quite a clustering of other university accounts being followed in there, but also “notable” figures and some evidence of a passing interest in serious affairs/commentators? That other UCL accounts are also being followed might suggest evidence that the @UCLnews account is being followed by current students?
How about the followers of @boltonuni? (Again, using a sample of 250 followers, though from a much smaller total follower population when compared to @UCLnews):
The dominance of other university accounts is noticeable here. A couple of possible reasons suggest themselves: the sampled accounts may skew towards other “professional” accounts from within the sector (or that otherwise follow it), or the students and potential students may have a less coherent (in the nicest possible sense of the word!) world view… Or maybe there are lots of potential students out there following several university Twitter accounts, trying to get a feel for what the universities are offering.
If we actually look at friend connections between the @boltonuni 250 follower sample, 40% or so are not connected to other followers (either because they are private accounts or because they don’t follow any of the other followers – as we might expect from potential students, for example?)
The connected followers split into two camps:
A gut reaction reading of these communities is that they represent sector and locale camps.
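The “connected versus unconnected” split in a follower sample can be worked out along the following lines. This is only a sketch: it assumes the friend links between sampled followers have already been collected as pairs of account names, and it treats those links as undirected. The function names are made up for illustration.

```python
from collections import deque

def connected_components(nodes, edges):
    """Group nodes into connected components, ignoring edge direction."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        if a in neighbours and b in neighbours:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        component, queue = set(), deque([start])
        while queue:  # breadth-first walk from the start node
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        components.append(component)
    # Largest component first, so components[0] is the "giant" component
    return sorted(components, key=len, reverse=True)

def isolated_share(nodes, edges):
    """Fraction of sampled followers with no link to any other sampled follower."""
    comps = connected_components(nodes, edges)
    return sum(1 for c in comps if len(c) == 1) / len(nodes)
```

Dividing the size of the first component by the sample size gives the share of followers in the giant connected component.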
Finally, let’s take a look at 250 random followers of @buckssu (Buckinghamshire University student union); this time we get about 75% of followers in the giant connected component:
Again, we get a locale and ‘sector’ cluster. If we look at folk followed by 4% or more of the follower sample, we get this:
My reading of this is that the student union accounts are pretty tightly connected (I’m guessing we’d find some quite sizeable SU account cliques), there’s a cluster of “other student society” type accounts top left, and then a bunch of celebs…
So what does this tell us? Who knows…?! I’m working on that…;-)
A quick peek at the quick-off-the-mark users of the altc2011 hashtag on Twitter…
Social connections between folk using the hashtag:
(Image generated using gephi; node size: betweenness centrality, colour – follower count)
By looking at the Twitter profile of hashtag users, finding a user’s blog (or other affiliation) URL, and running RSS feed autodiscovery over the URLs, we can generate an OPML blogroll (after a fashion) from the list of hashtagging twitter users: altc2011 hashtaggers – discovered feeds OPML blogroll
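A rough sketch of the autodiscovery and OPML steps is given below, using only the Python standard library. It assumes the HTML of each profile-linked page has already been fetched (the fetching is left out), and it only looks for feeds advertised through `<link rel="alternate">` tags; the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate" ...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        feed_type = (a.get("type") or "").lower()
        if "alternate" in rel and feed_type in FEED_TYPES and a.get("href"):
            # Resolve relative hrefs against the page URL
            self.feeds.append(urljoin(self.base_url, a["href"]))

def discover_feeds(html, base_url):
    """Return the feed URLs advertised in an already-fetched HTML page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds

def build_opml(title, feeds):
    """Build an OPML string from (label, feed_url) pairs."""
    opml = ET.Element("opml", version="1.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for label, url in feeds:
        ET.SubElement(body, "outline", type="rss", text=label, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")
```

Running `discover_feeds` over each hashtag user’s profile URL and passing the collected pairs to `build_opml` gives a blogroll that most feed readers should be able to import.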
List intelligence: I looked at the lists that hashtag users are on and ranked lists by number of subscribers as well as number of hashtag users appearing on the lists.
Lists containing N numbers of people using the altc2011 hashtag:
Lists ordered by subscriber count (first number is number of people on list who’ve been an early user of altc2011 hashtag; second number is the list’s subscriber count):
/kamyousaf/e-learning-uk 27 107
/kamyousaf/uk-ict-education 14 80
/mhawksey/purposed 24 42
/mhawksey/lak11 20 34
/helenwhd/e-learning 43 31
/suebecks/tech-enhanced-learning 27 27
/catherinecronin/education-elearning 17 26
/amcunningham/learning 17 26
/juliadesigns/education-uk-18 21 25
/JonPowles/education 26 19
/PatParslow/elearning-crew 15 18
/mhawksey/jiscel10 19 14
/ousefulAPI/altc2010 52 12
/ZoeEBreen/elearning-evangelists-uk 20 9
/ulcc/mootuk11-taggers 18 9
/HeyWayne/learning-tech-people 15 9
If we look at membership of lists containing altc2011 members, and then see who appears on those lists, we get an idea (maybe) of notable people in the community (number is number of lists each person appeared on):
As a long time fan of custom search engine offerings, I keep wondering why Google doesn’t seem to have much active interest in this area? Google Custom Search updates are few and far between, and typically go unreported by the tech blogs. Perhaps more surprisingly, Custom Search Engines don’t appear to have much, if any, recognition in the Google Apps for Education suite, although I think they are available with a Google Apps for Education ID?
One of the things I’ve been mulling over for years is the role that automatically created course related search engines might have to play as part of a course’s VLE offering. The search engine would offer search results either over a set of web domains linked to from the actual course materials, or simply boost results from those domains in the context of a “normal” set of search results. I’ve recently started thinking that we could also make use of “promoted” results to highlight specific required or recommended readings when a particular topic is searched for (for example, Integrating Course Related Search and Bookmarking?).
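The boosting and promotion idea can be illustrated with a toy re-ranker. To be clear, this is not how Google Custom Search is configured and it calls no Google API; it just assumes we have a list of links harvested from course materials, a list of ordinary result URLs, and an optional mapping from topic keywords to recommended readings (all names and inputs here are hypothetical).

```python
from urllib.parse import urlparse

def course_domains(course_links):
    """Set of host names linked to from the course materials."""
    hosts = (urlparse(u).netloc.lower() for u in course_links)
    return {h for h in hosts if h}

def rerank(results, domains, promoted=None, query=""):
    """Order results: promoted readings, then course-domain hits, then the rest.

    results:  result URLs in their original ("normal") order.
    promoted: optional dict mapping a topic keyword to a recommended URL.
    """
    promoted = promoted or {}
    # Promoted readings whose keyword appears in the query go first
    top = [url for term, url in promoted.items() if term.lower() in query.lower()]
    # Results hosted on a course-linked domain are boosted above the rest
    boosted = [r for r in results
               if urlparse(r).netloc.lower() in domains and r not in top]
    rest = [r for r in results if r not in top and r not in boosted]
    return top + boosted + rest
```

A real deployment would more likely express the same preferences through the search engine’s own configuration than by re-sorting results after the fact.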
During an informal “technical” meeting around three JISC funded resource discovery projects at Cambridge yesterday (Comet, Jerome, SALDA; disclaimer: I didn’t work on any of them, but I was in the area over the weekend…), there were a few brief mentions of how various university libraries were opening up their catalogues to the search engine crawlers. So for example, if you do a site: limited search on the following paths:
you can get (partial?) search results, with a greater or lesser degree of success, from the Sussex, Lincoln, Huddersfield and Cambridge catalogues respectively.
In a Google custom search engine context, we can tunnel in a little deeper in an attempt to return results limited to actual records:
I’m not sure how useful or interesting this is at the moment, except to the library systems developers maybe, who can compare how informatively their library catalogue content is indexed and displayed in Google search results compared to other libraries… (so for example, I noticed that Google appears to be indexing the “related items” that Huddersfield publishes on a record page, meaning that if a search term appears in a related work, you might get a record that at first glance appears to have little to do with your search term, in effect providing a “reverse related work” search (that is, search on related works and return items that have the search term as the related work)).
But it’s a start… and with the addition of customised rankings, might provide a jumping off point for experimenting with novel ways of searching across UK HE catalogues using Google indexed content. (For example, a version of the CSE on the cam.ac.uk domain might boost the Cambridge results; within an institution, works related to a particular course through mention on a reading list might get a boost if a student on that course runs a search… and so on…)
PS A couple of other things that may be worth pondering… could Google Apps for Education account holders be signed up to Subscribed Links offering customised search results in the main Google domain relating to a particular course? (That is, define subscribed link profiles for each course, and automatically add those subscriptions to an Apps for Edu user’s account based on the courses they’re taking?) Or I wonder if it would be possible to associate subscribed links to public access browsers in some way?
And how about finding some way of working with Google to open up “professional” search profiles, where for example students are provided with “read only” versions of the personalised search results of an expert in a particular area who has tuned, through personalisation, a search profile that is highly specialised in a particular subject area, e.g. as mentioned in Google Personal Custom Search Engines? (see also Could Librarians Be Influential Friends? And Who Owns Your Search Persona?).
If anyone out there is working on ways of using Google customised and personalised search as a way of delivering “improved” search results in an educational context, I’d love to hear more about what you’re getting up to…
Just so I don’t forget the development timeline such as it is, here are a few quick notes-to-self as much as anything about my “List Intelligence” tinkering to date:
- List Intelligence uses (currently) Twitter lists to associate individuals with a particular topic area (the focus of the list; note that this may be ill-specified, e.g. “people I have met”, or topic focussed “OU employees”, etc)
- List Intelligence is presented with a set of “candidate members” and then:
  - looks up the lists those candidate members are on to provide a set of “candidate lists”;
  - identifies the membership of those candidate lists (“candidate list members”) (this set may be subject to ranking or filtering, for example based on the number of list subscribers, or the number of original candidate members who are members of the current list);
  - for the superset of members across lists (i.e. the set of candidate list members), ranks each individual according to the number of lists they are on (this may be optionally weighted by the number of subscribers to each list they are on); these individuals are potentially “key” players in the subject area defined by the lists that the original candidate members are members of;
  - identifies which of the candidate lists contain most candidate members, and ranks them accordingly (possibly also according to subscriber numbers); the top ranked lists are lists trivially associated with the set of original candidate members;
  - provides output files that allow the graphing of individuals who are co-members of the same sets, and uses the corresponding network as the basis for network analysis;
  - optionally generates graphs based on friendship connections between candidate list members, and uses the resulting graph as the basis for network analysis. (Any clusters/communities detected based on friendship may then be compared with the co-membership graphs to see the extent to which list memberships reflect or correlate to community structures.)
- The original set of candidate members may be defined in a variety of ways. For example:
  - one or more named individuals;
  - the friends of a named individual;
  - the recent users of a particular hashtag;
  - the recent users of a particular searched for term;
  - the members of a “seed” list.
- List Intelligence attempts to identify “list clusters” in the candidate lists set by detecting significant overlaps in membership between different candidate lists.
- Candidate lists may be used to identify potential “focus of interest” areas associated with the original set of candidate members.
I’ll try to post some pseudo-code, flow charts and formal algorithms to describe the above… but it may take a week or two…
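In the meantime, the core ranking steps in the notes above might be sketched as follows. This is an illustration under stated assumptions rather than the actual List Intelligence code: `lists_for` and `members_of` are hypothetical callables standing in for Twitter API lookups, and the Jaccard measure in `list_overlap` is just one possible way of scoring “significant overlap” between lists.

```python
from collections import Counter

def list_intelligence(candidates, lists_for, members_of,
                      subscribers=None, weighted=False):
    """Rank candidate lists and the people who appear on them.

    candidates:  iterable of candidate members (screen names).
    lists_for:   callable, member -> lists that member appears on (hypothetical).
    members_of:  callable, list -> members of that list (hypothetical).
    subscribers: optional dict, list -> subscriber count.
    """
    candidates = set(candidates)
    subscribers = subscribers or {}

    # Candidate lists: every list that at least one candidate member is on
    candidate_lists = set()
    for member in candidates:
        candidate_lists.update(lists_for(member))

    # Rank lists by number of original candidates on them, then by subscribers
    scores = {
        lst: (len(candidates & set(members_of(lst))), subscribers.get(lst, 0))
        for lst in candidate_lists
    }
    ranked_lists = sorted(scores, key=lambda lst: scores[lst], reverse=True)

    # Rank candidate list members by how many candidate lists they are on,
    # optionally weighting each list by its subscriber count
    member_scores = Counter()
    for lst in candidate_lists:
        weight = subscribers.get(lst, 0) if weighted else 1
        for person in set(members_of(lst)):
            member_scores[person] += weight
    return ranked_lists, member_scores.most_common()

def list_overlap(list_a, list_b, members_of):
    """Jaccard overlap between the memberships of two lists."""
    a, b = set(members_of(list_a)), set(members_of(list_b))
    return len(a & b) / len(a | b) if a | b else 0.0
```

Pairs of lists with a high overlap score could then be linked in a graph of lists, giving one route to the “list clusters” mentioned above.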