Last night I spent an hour or two tinkering with the dev version of my prototype hashtag community explorer (Personal Twitter Networks in Hashtag Communities), in part prompted by a tweet from @sleslie, thinking about what sorts of features might help you decide whether or not to follow someone new from that community.

NOTE – at times this post reads like a mechanical, and very contrived, prescription for deciding whether to follow someone on Twitter according to how ‘useful’ they may be to you. I know friending/following is a lot more fluid/ad hoc than this, but that’s not the point, okay…? (though I’m not sure what the actual point is, yet…?!)

Part of the rationale for this is so that I can start reading about formal social network analysis with some sort of prior knowledge about what sorts of measures I think might be useful, and why, along with how easy they are to calculate in practice. Along with that, I was also looking for easy-to-do calculations that might be useful in the context of a friend recommendation algorithm. (It also occurs to me that this sort of thinking might be tangentially useful to the development of ‘trust’ or ‘reputation’ metrics that Martin is so keen on… e.g. Some more thoughts on metrics ;-)

So here’s where I got to, comparing myself and @jamesclay in the context of a sample of altc2009 hashtaggers:

The first metric is easy enough to calculate – @jamesclay’s friends/followers ratio. When rating how valuable a node might be in a network, I think the ratio of input (“friends”) edges to output (“followers”) edges is a useful one. If the number is close to zero, the node is acting in a largely broadcast mode. My friends/followers ratio is about 0.2-0.25 – approx 4 to 5 followers per friend, which works for me. Looking at the magnitude of the number of followers also gives you a clue as to how well connected a node is as a potential amplification channel.
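As a rough sketch of that first metric (the function name and the counts are made up for illustration, and how to treat an account with no followers is my own assumption):

```python
def friends_followers_ratio(n_friends, n_followers):
    """Friends (accounts followed) divided by followers.

    Values near zero suggest a largely broadcast-mode account.
    """
    if n_followers == 0:
        # Assumption: no followers and no friends scores 0, otherwise infinity.
        return float("inf") if n_friends else 0.0
    return n_friends / n_followers


# Hypothetical counts: 250 friends and 1000 followers.
ratio = friends_followers_ratio(250, 1000)
print(ratio)
```

With those made-up counts the ratio is 0.25, i.e. four followers per friend.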

The next pair of numbers I calculated related to the number of mutual friends and the number of mutual followers between myself and @jamesclay, normalised against my total number of friends and my total number of followers respectively.

The first measure – my “normalised mutual friends” tells me what proportion of my friends are also jamesclay’s friends. That is, what proportion of my friends are mutually ‘trusted’ by the person I’m considering following (where friending someone on twitter is taken as a vote of trust; we might also take the number of friends to be the number of people who can influence us on Twitter?). As this number tends to 1, it tells me the extent to which all the people I follow are also followed by @jamesclay. If this number equals one, @jamesclay has friended all the people I have. Although note that in that case, this may only be a small proportion of @jamesclay’s total friends list. (So maybe I need a measure to accommodate that? Eg the number of mutual friends normalised against @jamesclay’s total number of friends?) If the number tends to zero, then very few of my influenced

My “normalised mutual followers” score tells me what proportion of my followers are also jamesclay’s followers. That is, what proportion of my followers ‘trust’ both myself and jamesclay. If this number tends to one, all my followers are also following jamesclay, which would mean that a tweet from jamesclay would reach all my followers and maybe more. If the number tends to zero, we potentially influence completely different sets of people.
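Both of these scores are just set overlaps, so a minimal sketch might look like the following. The account names are placeholders, and `normalised_overlap` is a name I have made up; the last line covers the variant normalised against the other person’s friends list instead of mine.

```python
def normalised_overlap(mine, theirs):
    """Proportion of the accounts in `mine` that also appear in `theirs`."""
    if not mine:
        return 0.0  # assumption: an empty list scores zero
    return len(mine & theirs) / len(mine)


# Placeholder data, not real accounts.
my_friends = {"a", "b", "c", "d"}
their_friends = {"c", "d", "e"}
my_followers = {"p", "q", "r", "s", "t"}
their_followers = {"q", "x", "y"}

# Proportion of my friends who are also their friends.
norm_mutual_friends = normalised_overlap(my_friends, their_friends)
# Proportion of my followers who are also their followers.
norm_mutual_followers = normalised_overlap(my_followers, their_followers)
# Same mutual friends, normalised against THEIR friends list instead.
norm_against_theirs = normalised_overlap(their_friends, my_friends)
```

Swapping the argument order is all it takes to switch which side the score is normalised against.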

(I guess there’s a number we can grab here which is our shared audience size, that is, the number of our combined unique followers: my_followers+your_followers-mutual_followers. Dividing this by my_followers then gives an amplification factor if ‘you’ retweet me?)
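A quick sketch of that shared audience number and the amplification factor, assuming follower lists are held as sets (the function names and data are invented for the example):

```python
def combined_audience(my_followers, your_followers):
    """my_followers + your_followers - mutual_followers, as a count."""
    mutual = my_followers & your_followers
    return len(my_followers) + len(your_followers) - len(mutual)


def amplification_factor(my_followers, your_followers):
    """Combined unique audience divided by my own follower count."""
    if not my_followers:
        raise ValueError("amplification is undefined with no followers")
    return combined_audience(my_followers, your_followers) / len(my_followers)


# Placeholder data: 4 followers for me, 6 for you, 2 in common.
mine = {"p", "q", "r", "s"}
yours = {"r", "s", "t", "u", "v", "w"}

audience = combined_audience(mine, yours)    # 4 + 6 - 2
factor = amplification_factor(mine, yours)   # audience / 4
```

The count is the same as the size of the set union, `len(mine | yours)`, which is a handy cross-check.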

The next two measures are based on the number of my friends who follow jamesclay. That is, the people I trust (as demonstrated by my friending them) who in turn trust (have friended/are following) jamesclay.

The first number is the number of my friends who follow jamesclay, divided by the total number of his followers. That is, what proportion of jamesclay’s followers are my friends? Or to put it another way, what proportion of jamesclay’s total following do I trust?

The last number is the number of my friends who follow jamesclay divided by the total number of my friends. That is, what proportion of my friends trust jamesclay.
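These last two numbers share a numerator, so they can be sketched together. As before, the names and data are placeholders, and returning zero for empty lists is my assumption:

```python
def friend_endorsement(my_friends, their_followers):
    """Return (share of their followers who are my friends,
               share of my friends who follow them)."""
    endorsing = my_friends & their_followers  # my friends who follow them
    of_their_followers = (
        len(endorsing) / len(their_followers) if their_followers else 0.0
    )
    of_my_friends = len(endorsing) / len(my_friends) if my_friends else 0.0
    return of_their_followers, of_my_friends


# Placeholder data: 4 friends of mine, 8 followers of theirs, 2 in common.
my_friends = {"a", "b", "c", "d"}
their_followers = {"b", "c", "u", "v", "w", "x", "y", "z"}

share_of_their_followers, share_of_my_friends = friend_endorsement(
    my_friends, their_followers
)
```

Here a quarter of their followers are people I trust, and half of my friends trust them.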

Okay, so I have no idea where any of this is going, but I just needed to write it down so that I don’t have to remember it, but know that I can call on it if I do need it…;-) I fully expect that things relating to all the above have been properly worked out in the context of ‘proper’ social network analysis, but I’m still trying to generate my own context to make reading that stuff relevant.

As we know, connecting every node to every other node creates a dumb, inefficient network. OTOH, local clusters with inter-connectors (small worlds) is what we’re aiming for. Can you visualize your emerging algorithms to check if you’re on the right track?

I have a load of viz scripts, but they only run in offline mode (Graphviz). When I get a chance, I’ll start pulling graph metrics out using networkx.

At the moment, I’m exploring things that can be done in local areas of bounded networks with a minimum of twitter api calls…

I have *no idea* what I’m doing or where this is going… just taking obvious next steps and seeing where it leads me…

When you say: “local clusters with inter-connectors (small worlds) is what we’re aiming for”, what do you mean by “aiming for”? What local clusters are you aiming for, how do you want them to be displayed, and why?

We’re aiming for a pattern something like this:

which maximizes both interaction and information flow. The interconnectors are important, but if everyone is an interconnector, the network is very inefficient.

yes, yes – i know about small world networks; I did the Santa Fe Institute Complex Systems summer school, etc etc ;-)

But what do you mean you are aiming for that? You want (to curate) your network so that it looks something like that?

If you’re building an algorithm, shouldn’t the output look like this?

@ajcann hmm maybe… but then, in the other networks that I am maybe weakly connected to through one person, what confidence do I have in them if we don’t share any followers and none of my friends follow them?

That was part of the reason for looking at the people who lots of hashtaggers had friended who hadn’t used a particular hashtag in https://ouseful.wordpress.com/2009/09/04/more-thinkses-around-twitter-hashtag-networks-jiscri/

That is, I wanted to see if there was an easy way to detect people who a hashtag community appeared to mutually respect but who were not explicitly part of that community?
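One way that check might be sketched, assuming you already have each hashtagger’s friends list to hand (the function name, the `min_votes` threshold and the data are all invented for illustration):

```python
from collections import Counter


def respected_outsiders(friends_by_member, min_votes=2):
    """Accounts friended by at least `min_votes` community members
    who are not themselves members (i.e. did not use the hashtag)."""
    members = set(friends_by_member)
    votes = Counter()
    for friends in friends_by_member.values():
        # Each member gets one vote per outsider they have friended.
        votes.update(set(friends) - members)
    return [(name, n) for name, n in votes.most_common() if n >= min_votes]


# Placeholder hashtaggers and their friends lists.
friends_by_member = {
    "ann": {"bob", "guru", "news"},
    "bob": {"ann", "guru"},
    "cat": {"ann", "guru", "news"},
}

outsiders = respected_outsiders(friends_by_member)
```

In this toy data "guru" is friended by all three hashtaggers and "news" by two, and neither used the hashtag.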

If I wanted to get a connection into a community, one thing to do would be to identify a key to that community, (such as a hashtag) and then look to see who the hubs and authorities in it were, and then maybe follow them (for slightly different reasons)?
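For what it’s worth, hubs and authorities can be estimated with the HITS iteration. The sketch below is a bare-bones pure-Python version run over made-up (follower, followed) pairs; it is only an illustration, not what the explorer does, and a graph library such as networkx would be the more sensible choice in practice.

```python
def hits(edges, iterations=50):
    """Simple HITS power iteration over (follower, followed) pairs.

    Returns (hub_scores, authority_scores), each normalised to sum to 1.
    """
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the accounts following you.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub score: sum of authority scores of the accounts you follow.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        # Normalise so the scores stay bounded between rounds.
        auth_total = sum(auth.values()) or 1.0
        hub_total = sum(new_hub.values()) or 1.0
        auth = {n: v / auth_total for n, v in auth.items()}
        hub = {n: v / hub_total for n, v in new_hub.items()}
    return hub, auth


# Placeholder follow graph within a hashtag community.
edges = [
    ("ann", "star"), ("ann", "bob"), ("ann", "dan"),
    ("bob", "star"), ("cat", "star"),
]
hub, auth = hits(edges)
top_hub = max(hub, key=hub.get)
top_authority = max(auth, key=auth.get)
```

In this toy graph "star" comes out as the top authority (everyone follows them) and "ann" as the top hub (she follows the most well-followed accounts), which is the sense in which you might follow the two kinds of account for slightly different reasons.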

I could see something based on this being a great addition to Twitter itself. The biggest barrier I see to people engaging with Twitter is the question of ‘how do I know who to follow?’ – so if you could pick a few people and then start seeing really ‘honestly’ useful recommendations for other people to expand your network, that would be really useful.

…I started with a similar manual version of looking at who the people I follow follow, and even a few more levels deep, looking at recent tweets from those people. Follow Fridays was a useful idea, but it seems to be dying, probably because it became a way of saying ‘hey I like you’ to the people mentioned, and a lot of people still doing it just list everyone to avoid offending someone they miss, thus devaluing the recommendations (as well as the fundamental flaw that the measure of usefulness is so individual anyhow)

but applying a mechanical metric of ‘utility’ avoids the social morass of FF, shortcuts the effort needed to manually check follower lists etc….it’s a real win.

Any chance you could sell the work to Twitter, or is that too mercenary for someone in academia ;)