A Couple of Proof of Concept Demos With the Cloudworks API

Via a tweet from @mhawksey in response to a tweet from @sheilmcn, or something like that, I came across a post by Sheila on the topic of Cloud gazing, maps and networks – some thoughts on #oldsmooc so far. The post mentioned a prototyped mindmap style browser for Cloudworks, created in part to test out the Cloudworks API.

Having tinkered with mindmap style presentations using the d3.js library in the browser before (Viewing OpenLearn Mindmaps Using d3.js; the app itself may well have rotted by now), I thought I’d have a go at exploring something similar for Cloudworks. With an API key promptly delivered by Nick Freear, it only took a few minutes to repurpose an old script to cast a test call to the Cloudworks API into a form that could easily be visualised using the d3.js library. The approach I took? To grab JSON data from the API, construct a tree using the Python networkx library, and drop a JSON serialisation of the network into a templated d3.js page. (networkx has a couple of JSON export functions that will create tree based and graph/network based JSON data structures that d3.js can feed from.)

Here’s the Python fragment:

#http://cloudworks.ac.uk/api/clouds/{cloud_id}.{format}?api_key={api_key}

import urllib2,json, networkx as nx
from networkx.readwrite import json_graph

id=cloudscapeID #need logic

urlstub="http://cloudworks.ac.uk/api/"
urlcloudscapestub=urlstub+"cloudscapes/"+str(id)
urlsuffix=".json?api_key="+str(key)

ctyp="/clouds"
url=urlcloudscapestub+ctyp+urlsuffix

entities=json.load(urllib2.urlopen(url))

#print entities

#I seem to remember issues with non-ascii before, though maybe that was for XML? Hmmm...
def ascii(s): return "".join(i for i in s.encode('utf-8') if ord(i)<128)

def graphRoot(DG,title,root=1):
    DG.add_node(root,name=ascii(title))
    return DG,root

def gNodeAdd(DG,root,node,name):
    node=node+1
    DG.add_node(node,name=ascii(name))
    DG.add_edge(root,node)
    return DG,node

DG=nx.DiGraph()
#Cast the id to a string so that ascii() can call .encode() on it
DG,root=graphRoot(DG,str(id))
currnode=root

#This simple example just grabs a list of clouds associated with a cloudscape
for c in entities['items']:
    DG,currnode=gNodeAdd(DG,root,currnode,c['title'])
    
#We're going to use the tree based JSON data format to feed the d3.js mindmap view
jdata = json_graph.tree_data(DG,root=1)
#print json.dumps(jdata)

#The page template is defined elsewhere.
#It loads the JSON from a declaration in the Javascript of the form: jsonData=%(jdata)s
print page_template % vars()

The rendered view is something along the lines of:

cloudscapeTree

You can find the original code here.

Now I know that: a) this isn’t very interesting to look at; and b) it doesn’t even work as a navigation surface, but my intention was purely to demonstrate a recipe for getting data out of the Cloudworks API and into a d3.js mindmap view in the browser, and it does that. A couple of obvious next steps: i) add in additional API calls to grow the tree (easy); ii) linkify some of the nodes (I’m not sure I know how to do that at the moment?)
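To make the shape of the tree based JSON a bit more concrete, here is a minimal sketch (written for Python 3 and a recent networkx, rather than the Python 2 of the listing above) that builds a tiny tree by hand and serialises it. The sample items, the `cloud_id` key and the `url` attribute (including the URL pattern) are assumptions made purely for illustration. The point is only that any attribute attached to a node is carried through into the JSON, which is one possible starting point for linkifying nodes on the d3.js side.

```python
import json

import networkx as nx
from networkx.readwrite import json_graph

# Stand-in for entities['items'] from the API. The 'cloud_id' key and the
# URL pattern used below are assumptions, not confirmed API details.
items = [
    {"cloud_id": 101, "title": "First cloud"},
    {"cloud_id": 102, "title": "Second cloud"},
]

DG = nx.DiGraph()
root = 1
DG.add_node(root, name="Example cloudscape")

# Hang each cloud off the root, attaching a hypothetical 'url' attribute.
for offset, item in enumerate(items, start=1):
    node = root + offset
    DG.add_node(
        node,
        name=item["title"],
        url="http://cloudworks.ac.uk/cloud/view/%s" % item["cloud_id"],
    )
    DG.add_edge(root, node)

# tree_data nests each node's attributes, plus 'id' and 'children' keys.
jdata = json_graph.tree_data(DG, root=root)
print(json.dumps(jdata, indent=2))
```

The root comes out as a dictionary with `name`, `id` and `children` keys, and each child carries its own `name`, `id` and `url`.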

Sheila’s post ended with a brief reflection: “I’m also now wondering if a network diagram of cloudscape (showing the interconnectedness between clouds, cloudscapes and people) would be helpful ? Both in terms of not only visualising and conceptualising networks but also in starting to make more explicit links between people, activities and networks.”

So here’s another recipe, again using networkx but this time dropping the data into a graph based JSON format and using the d3.js force based layout to render it. What the script does is grab the followers of a particular cloudscape, grab each of their followers, and then graph how the followers of a particular cloudscape follow each other.

Because I had some problems getting the data into the template, I also used a slightly different wiring approach:

import urllib2,json,scraperwiki,networkx as nx
from networkx.readwrite import json_graph

id=cloudscapeID #need logic
typ='cloudscape'

urlstub="http://cloudworks.ac.uk/api/"
urlcloudscapestub=urlstub+"cloudscapes/"+str(id)
urlsuffix=".json?api_key="+str(key)

ctyp="/followers"
url=urlcloudscapestub+ctyp+urlsuffix
entities=json.load(urllib2.urlopen(url))

def ascii(s): return "".join(i for i in s.encode('utf-8') if ord(i)<128)

def getUserFollowers(id):
    urlstub="http://cloudworks.ac.uk/api/"
    urluserstub=urlstub+"users/"+str(id)
    urlsuffix=".json?api_key="+str(key)

    ctyp="/followers"
    url=urluserstub+ctyp+urlsuffix
    results=json.load(urllib2.urlopen(url))
    #print results
    f=[]
    for r in results['items']: f.append(r['user_id'])
    return f

DG=nx.DiGraph()

followerIDs=[]

#Seed graph with nodes corresponding to followers of a cloudscape
for c in entities['items']:
    curruid=c['user_id']
    DG.add_node(curruid,name=ascii(c['name']).strip())
    followerIDs.append(curruid)

#construct graph of how followers of a cloudscape follow each other
for c in entities['items']:
    curruid=c['user_id']
    followers=getUserFollowers(curruid)
    for followerid in followers:
        if followerid in followerIDs:
            DG.add_edge(curruid,followerid)

scraperwiki.utils.httpresponseheader("Content-Type", "text/json")

#Print out the network/graph in the node-link JSON format
jdata = json_graph.node_link_data(DG)
#json.dumps (as used in the first script) does not depend on the networkx version
print json.dumps(jdata)

In this case, I generate a JSON representation of the network that is then loaded into a separate HTML page that deploys the d3.js force directed layout visualisation, here showing how the followers of a particular cloudscape follow each other.
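For comparison with the tree format, here is a minimal sketch of what the graph based (node-link) JSON looks like for a two node follower graph. It assumes Python 3 and a recent networkx; the user IDs and names are made up. Note that the key holding the edge list has historically been `links`, though newer networkx releases are moving towards `edges`, so the page that consumes the JSON may need to match whichever key your version emits.

```python
import json

import networkx as nx
from networkx.readwrite import json_graph

# Two made-up users, with an edge pointing from a user to one of their followers.
DG = nx.DiGraph()
DG.add_node(11, name="Ann")
DG.add_node(22, name="Bob")
DG.add_edge(11, 22)

# node_link_data returns a plain dict holding a node list and an edge list.
jdata = json_graph.node_link_data(DG)
print(json.dumps(jdata, indent=2))
```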

cloudworks_innerfrendsNet

This hits the Cloudworks API once for the cloudscape, then once for each follower of the cloudscape, in order to construct the graph and then pass the JSON version to the HTML page.
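The URL building that is repeated in both scripts could be folded into a single helper. The following is a minimal Python 3 sketch that follows the URL pattern used in the listings above; the function names `api_url` and `calls_needed` are hypothetical, and `MYKEY` is a placeholder rather than a real key.

```python
from urllib.parse import urlencode

API_ROOT = "http://cloudworks.ac.uk/api/"


def api_url(kind, item_id, related, key, fmt="json"):
    """Build a URL such as .../api/cloudscapes/2451/followers.json?api_key=KEY."""
    path = "%s/%s/%s.%s" % (kind, item_id, related, fmt)
    return API_ROOT + path + "?" + urlencode({"api_key": key})


def calls_needed(follower_ids):
    """One call for the cloudscape's followers, plus one call per follower."""
    return 1 + len(follower_ids)


print(api_url("cloudscapes", 2451, "followers", "MYKEY"))
print(api_url("users", 1174, "followers", "MYKEY"))
print(calls_needed([11, 22, 33]))
```

Since the number of calls grows with the number of followers, caching the per-user results would be worth considering for larger cloudscapes.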

Again, I’m posting it as a minimum viable recipe that could be developed as a way of building out Sheila’s idea (though the graph definition would probably need to be a little more elaborate, eg in terms of node labeling). Some work on the graph rendering probably wouldn’t go amiss either, eg in respect of node sizing, colouring and labeling.

Still, what do you expect in just a couple of hours?!;-)

Further Dabblings with the Cloudworks API

Picking up on A Couple of Proof of Concept Demos with the Cloudworks API, and some of the comments that came in around it (thanks Sheila et al:-), I spent a couple more hours tinkering around it and came up with the following…

A prettier view, stolen from Mike Bostock (I think?)

prettier view d3js force directed layout

I also added a slider to tweak the layout (opening it up by increasing the repulsion between nodes) [h/t @mhawksey for the trick I needed to make this work] but still need to figure this out a bit more…

I also added in some new parameterised ways of accessing various different views over Cloudworks data using the root https://views.scraperwiki.com/run/cloudworks_network_d3js_force_directed_view_pretti/

Firstly, we can make calls of the form: ?cloudscapeID=2451&viewtype=cloudscapecloudcloudscape

cloudworks cloudscapes by cloud

This grabs the clouds associated with a particular cloudscape (given the cloudscape ID), and then constructs the network containing those clouds and all the cloudscapes they are associated with.
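As a rough idea of how such a network might be assembled, here is a minimal sketch that works from stubbed data rather than live API calls. The `cloud_to_cloudscapes` mapping is a hypothetical stand-in for whatever the API returns, and the prefixed node IDs and the `kind` attribute are illustrative choices rather than anything taken from the actual view.

```python
import networkx as nx

# Hypothetical stand-in for API results: each cloud in the seed cloudscape,
# mapped to the cloudscapes it is associated with.
cloud_to_cloudscapes = {
    "Cloud A": ["Seed cloudscape", "Other cloudscape"],
    "Cloud B": ["Seed cloudscape"],
}

G = nx.Graph()
for cloud, cloudscapes in cloud_to_cloudscapes.items():
    # Prefix the node IDs so a cloud and a cloudscape can never collide.
    cloud_node = "cloud:" + cloud
    G.add_node(cloud_node, name=cloud, kind="cloud")
    for cloudscape in cloudscapes:
        cloudscape_node = "cloudscape:" + cloudscape
        G.add_node(cloudscape_node, name=cloudscape, kind="cloudscape")
        G.add_edge(cloud_node, cloudscape_node)

print(G.number_of_nodes(), G.number_of_edges())
```

Tagging each node with its `kind` would also give the rendering page something to colour by.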

The next view uses a parameter set of the form cloudscapeID=2451&viewtype=cloudscapecloudtags and displays the clouds associated with a particular cloudscape (given the cloudscape ID), along with the tags associated with each cloud:

cloudworks cloudscape cloud tags

Even though there aren’t many nodes or edges, this is quite a cluttered view, so maybe I need to rethink how best to visualise this information?

I’ve also done a couple of views that make use of follower data. For example, here’s how to call on a view that visualises how the folk who follow a particular cloudscape follow each other (this is actually the default if no viewtype is given) –
cloudscapeID=2451&viewtype=cloudscapeinnerfollowers

cloudworks cloudscape innerfollowers
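The parameter handling in the view itself isn't shown here, but the defaulting behaviour can be sketched with the standard library. This is a minimal Python 3 illustration; `parse_view_params` is a hypothetical helper, and only the parameter names and the default viewtype come from the description above.

```python
from urllib.parse import parse_qs

DEFAULT_VIEWTYPE = "cloudscapeinnerfollowers"


def parse_view_params(query_string):
    """Return (cloudscape_id, viewtype) from a raw query string."""
    params = parse_qs(query_string.lstrip("?"))
    # parse_qs maps each name to a list of values; take the first one.
    cloudscape_id = params.get("cloudscapeID", [None])[0]
    viewtype = params.get("viewtype", [DEFAULT_VIEWTYPE])[0]
    return cloudscape_id, viewtype


print(parse_view_params("?cloudscapeID=2451&viewtype=cloudscapecloudtags"))
print(parse_view_params("cloudscapeID=2451"))
```

The second call falls back to the inner followers view because no viewtype is supplied.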

And here’s how to call a view that grabs a particular user’s clouds, looks up the cloudscapes they belong to, then graphs those cloudscapes and the people who follow them: ?userID=1174&viewtype=usercloudcloudscapefollower

cloudworks followers of cloudscapes containing a user's clouds

Here’s another way of describing that graph – followers of cloudscapes containing a user’s clouds.

The optional argument filterNdegree=N (where N is an integer) will filter the displayed network to remove nodes with degree <=N. Here’s the above example, but filtered to remove the nodes that have degree 2 or less: ?userID=1174&viewtype=usercloudcloudscapefollower&filterNdegree=2

cloudworks graph filtered

That is, we prune the graph of people who follow no more than two of the cloudscapes to which the specified user has added a cloud. In other words, we depict folk who follow at least three of the cloudscapes to which the specified user has added a cloud.

(Note that on inspecting that graph it looks as if there is at least one node that has degree 2, rather than degree 3 and above. I’m guessing that it originally had degree 3 or more but that at least one of the nodes it was connected to was pruned out? If that isn’t the case, something’s going wrong…)
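That guess is at least consistent with how a single pass degree filter behaves. The sketch below (Python 3, networkx 2 or later, an undirected toy graph, and hypothetical function names) contrasts pruning once, using the degrees in the original graph, with pruning repeatedly until no low degree nodes remain. Whether the actual view filters in a single pass is an assumption.

```python
import networkx as nx


def filter_once(G, n):
    """Single pass: drop nodes whose degree in the ORIGINAL graph is <= n."""
    H = G.copy()
    H.remove_nodes_from([node for node, deg in G.degree() if deg <= n])
    return H


def filter_repeatedly(G, n):
    """Keep pruning until every remaining node has degree > n."""
    H = G.copy()
    while True:
        low = [node for node, deg in H.degree() if deg <= n]
        if not low:
            return H
        H.remove_nodes_from(low)


# a, b, c, d are all connected to each other; e links to a, b and x;
# x links only to e, so e starts out with degree 3.
G = nx.Graph()
G.add_edges_from([
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("e", "a"), ("e", "b"), ("e", "x"),
])

once = filter_once(G, 2)
repeated = filter_repeatedly(G, 2)
print(sorted(once.nodes()), once.degree("e"))
print(sorted(repeated.nodes()))
```

After the single pass, `e` survives (it had degree 3) but is left with degree 2 because `x` was pruned; the repeated version removes it as well. For a simple graph, the repeated version amounts to taking what is usually called the (N+1)-core.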

Also note that it would be neater to pull in the whole graph and filter the d3.js rendered version interactively, but I don’t know how to do this?

However… I also added a parameter to the script that builds the JSON data files from the Cloudworks API calls, so that it can generate a GEXF network file instead. This can be saved as an XML file (.gexf suffix, cf. Visualising Networks in Gephi via a Scraperwiki Exported GEXF File) and then visualised using a tool such as Gephi. The trick? Add the URL parameter &format=gexf (optional; the default is &format=json) [example].

gephiview of cloudworks graph

Gephi, of course, is a wonderful tool for the interactive exploration of graph-based data sets…. including a wide range of filters…
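The format switch could look something like the following minimal sketch, which assumes Python 3 and a recent networkx. The `serialise` function and the toy graph are hypothetical; `nx.generate_gexf` and `json_graph.node_link_data` are the networkx calls doing the actual work.

```python
import json

import networkx as nx
from networkx.readwrite import json_graph


def serialise(G, fmt="json"):
    """Return the graph as a string: GEXF XML if asked for, else node-link JSON."""
    if fmt == "gexf":
        # generate_gexf yields the XML document one line at a time.
        return "\n".join(nx.generate_gexf(G))
    return json.dumps(json_graph.node_link_data(G))


G = nx.DiGraph()
G.add_node(11, name="Ann")
G.add_node(22, name="Bob")
G.add_edge(11, 22)

gexf_text = serialise(G, "gexf")
json_text = serialise(G)
print(gexf_text[:80])
```

Saving `gexf_text` to a file with a .gexf suffix should give something Gephi can open directly.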

So, where are we at? The d3.js force directed layout is all very shiny but the graphs quickly get cluttered. I’m not sure if there are any interactive parameter controls I can add, but at the moment the visualisations border on the useless. At the very least, I need to squirt a header into the page from the supplied parameters so we know what the visualisation refers to. (The data I’ve played with to date – which has been very limited – doesn’t seem to be that interesting either from what I’ve seen? But maybe the rich structure isn’t there yet? Or maybe there is nothing to be had from these simple views?)

It may be worth exploring some other visualisation types to see if they are any more legible, at least, though it would be even more helpful if they were simply more informative ;-)

PS just in case, here’s a link to the Cloudworks API documentation.

PPS if there are any terms of service associated with the API, I didn’t read them. So if I broke them, oops. But that said – such is life; never ever trust that anybody you give data to will look after it;-)