Trackforward – Following the Consequences with N’th Order Trackbacks

One of the nice things about blogging within the WordPress ecosystem is the way that trackbacks/pingbacks capture information about posts that link back to your posts, in much the same way that using the link: search limit on a web or blog search engine allows you to see what other webpages are linking back to a particular web page.

In the latter case, for example, searching for link:http://hedebate.jiscinvolve.org/on-line-higher-education-learning/ on Google blogsearch will turn up blog posts that link back to the original HE Debate blog post on On-Line Higher Education Learning.

(Actually, that’s not quite true. In an apparent tweak of the Google blogsearch algorithm last year, the Google blogsearch engine now seems to be indexing and returning results from complete web pages rather than indexing the content of RSS feeds i.e. blog posts – which means that as well as the useful links referred to in the body of a post, links are also indexed from blogrolls, twitter feeds and bookmark lists displayed in blog sidebars, blog comments etc etc. Which in turn is to say that Google blogsearch qua a web search of blog web pages is not much use as a blog search engine at all…)

By judicious linking back to your own blog posts, it’s possible to build up quite complex pathways between related posts that are navigable in two directions: from one post that links to another, previously published post, via an inline link; and “forwards” in time to a later post that has itself linked back to a post of interest and been picked up via a trackback/pingback.

(For examples of these emergent link structures, see Emergent Structure in the Digital Worlds Uncourse Blog Experiment, Uncovering a Little More Digital Worlds Structure and Trackback Graphs and Blog Categories.)

So the question arises – if I write a blog post that several other people link back to, and several further posts in turn link back to those posts that referred back to my post, but not my original post, how do I keep track of the conversation?

Keeping track of posts that cite my post is easy enough – if I have an effective pingback set-up, that will tell me who’s linking back to my posts; or I can simply run link: searches against the URLs of my posts every so often to see who the search engines think are linking back to me.

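Keeping track of the posts that link to those posts in turn, and so on down the chain, takes a little more work. The answer lies in a recursive algorithm of the form: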

// getLinksto() stands in for whatever lookup returns the URLs that link to $url
function showInLinks($url){
  $links=getLinksto($url);
  foreach ($links as $link){
    print $link;
    showInLinks($link);
  }
}

This will then display URLs for the pages that link to an originally specified URL, the URLs of pages that link to those URLs, and so on…
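One thing the outline above glosses over is that two posts can link to each other, in which case a naive recursion would never finish. Here is a minimal sketch of one way to guard against that, using a depth limit and a record of URLs already seen. The $inbound array is a made-up stand-in for a link: search, and collectInLinks() is a hypothetical helper name, not part of any library:

```php
<?php
// Hypothetical in-memory stand-in for a link: search.
// Keys are page URLs; values are the pages that link to them.
$inbound = [
    'http://example.com/original' => ['http://example.com/a', 'http://example.com/b'],
    'http://example.com/a'        => ['http://example.com/c'],
    'http://example.com/c'        => ['http://example.com/a'], // a and c cite each other
];

// Collect [depth, url] pairs for every page that links to $url,
// directly or indirectly, reporting each page at most once.
function collectInLinks(string $url, array $inbound, int $maxDepth, int $depth = 1, array &$seen = []): array
{
    $found = [];
    if ($depth > $maxDepth) return $found;
    $seen[$url] = true;
    foreach ($inbound[$url] ?? [] as $link) {
        if (isset($seen[$link])) continue; // already reported: skip to avoid a loop
        $seen[$link] = true;
        $found[] = [$depth, $link];
        $found = array_merge($found, collectInLinks($link, $inbound, $maxDepth, $depth + 1, $seen));
    }
    return $found;
}

// Print the results as an indented, numbered outline.
foreach (collectInLinks('http://example.com/original', $inbound, 3) as [$depth, $link]) {
    echo str_repeat('  ', $depth), $depth, '. ', $link, "\n";
}
```

Because the walk is depth first, a page that can be reached by more than one route is reported at whatever depth it is first encountered, which is not necessarily the shortest route back to the original post.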

So here for example is a quick test:

The items numbered “1.” are links that Google blogsearch thinks link back to the original URL. The items numbered “2.” are links that link to the links that link back to the original URL.

Here’s some minimal PHP code if you want to try it out:

<?php
$urlstub = "http://ajax.googleapis.com/ajax/services/search/blogs?scoring=d&v=1.0&rsz=large&q=link%3A";
$url="http://halfanhour.blogspot.com/2008/11/future-of-online-learning-ten-years-on_16.html";
if (!empty($_GET['url'])) $url=$_GET['url'];
$testurl=$urlstub.urlencode($url);
echo "Starting with: ".htmlspecialchars($url)."<br/>";
echo "via: ".htmlspecialchars($testurl)."<br/><br/>";
$depth=0;

function handlelinks($url, $depth){
	$urlstub = "http://ajax.googleapis.com/ajax/services/search/blogs?v=1.0&rsz=large&q=link%3A";
	$depth++;
	// only follow links two levels out; stop before making a request we won't use
	if ($depth>=3) return;
	$testurl=$urlstub.urlencode($url);
	//echo "testing ".$testurl."  ";
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $testurl);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	$body = curl_exec($ch);
	curl_close($ch);
	// now, process the JSON string
	$json = json_decode($body);
	//var_dump($json); echo "<br/>";
	// skip failed requests and empty result sets
	if (!isset($json->responseData->results)) return;
	foreach ($json->responseData->results as $result) {
		// indent each result according to its depth
		for ($i=0;$i<$depth;$i++) echo "&nbsp;&nbsp;";
		echo $depth.". ".$result->title." ";
		echo "<a href='".$result->postUrl."'>".$result->postUrl."</a><br/>";
		handlelinks($result->postUrl, $depth);
	}
}
handlelinks($url, $depth);
?>

By using this sort of algorithm to generate an RSS feed of links, it becomes possible to subscribe to a feed that will keep you updated on all the downstream posts (“blogversation” posts) that are contributing to a discussion that at some point referred to a URL you are interested in.
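As a rough illustration of that last step, here is a minimal sketch that turns a list of downstream posts into a bare-bones RSS 2.0 document. The function names linksToRss() and xmlText(), and the shape of the $items array, are assumptions made for this example; in practice the items would come from the recursive link search rather than being typed in by hand:

```php
<?php
// Escape text for safe inclusion in XML element content.
function xmlText(string $s): string
{
    return htmlspecialchars($s, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Build a minimal RSS 2.0 feed. $items is a hypothetical list of
// ['title' => ..., 'url' => ...] arrays, one per downstream post.
function linksToRss(string $sourceUrl, array $items): string
{
    $xml  = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
    $xml .= "<rss version=\"2.0\">\n<channel>\n";
    $xml .= "<title>" . xmlText("Posts downstream of " . $sourceUrl) . "</title>\n";
    $xml .= "<link>" . xmlText($sourceUrl) . "</link>\n";
    $xml .= "<description>" . xmlText("Posts that link, directly or indirectly, to " . $sourceUrl) . "</description>\n";
    foreach ($items as $item) {
        $xml .= "<item>\n";
        $xml .= "<title>" . xmlText($item['title']) . "</title>\n";
        $xml .= "<link>" . xmlText($item['url']) . "</link>\n";
        $xml .= "<guid>" . xmlText($item['url']) . "</guid>\n";
        $xml .= "</item>\n";
    }
    $xml .= "</channel>\n</rss>\n";
    return $xml;
}

// Example usage with made-up data.
echo linksToRss('http://example.com/original', [
    ['title' => 'A reply', 'url' => 'http://example.com/a'],
    ['title' => 'A reply to the reply', 'url' => 'http://example.com/c'],
]);
```

Using each post URL as its guid means a feed reader should treat a post as new only the first time it turns up, however often the feed is regenerated.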