Six Degrees of Separation May Work in the Real World, But Not Inside a Siloed Organisation?

Trying to track down who knows a particular thing in an organisation can get a bit frustrating at times…

You ask someone who you think may know, and they don’t. Treating it like a six degrees of separation thing, you then ask each person for a name of someone they think might know. But that doesn’t work either.

At some point, messages get cc’d to people who have already been asked (and who are presumably getting fed up with seeing the same request keep looping back to them). Which makes me think that a silo might be defined, in graph-theoretic terms, as a cycle?

Maybe silos do have ways out of them – a single connection between two subgraphs that are each heavily interconnected within themselves – or maybe they don’t…

For example, if edges are directed in the sense of who folk would think to ask about a topic, maybe the person who connects one subgraph (where no one knows the answer) to the subgraph where someone does know the answer has no incoming edges.

If I ask A_C or A_B, and B_C is the person who knows what I need to know, we’re stuck… Whereas if I’d asked A_A, B_A, or B_B, I’d have got there… If B_A is the person I need to find, then, erm… If it’s A_A, all hope is lost!
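To make that concrete, here’s a minimal sketch using networkx, with a made-up edge set that is consistent with the description above (an edge X → Y meaning “X would think to ask Y”); only A_A bridges the two silos, and no-one thinks to ask A_A anything:

```python
# A minimal sketch of the "who can find the expert" question, using an invented
# edge set that matches the description above (the real graph was only a doodle).
# An edge X -> Y means "X would think to ask Y about the topic".
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("A_A", "A_B"), ("A_A", "A_C"), ("A_B", "A_C"), ("A_C", "A_B"),  # subgraph A
    ("B_A", "B_B"), ("B_B", "B_A"), ("B_A", "B_C"), ("B_B", "B_C"),  # subgraph B
    ("A_A", "B_B"),  # the only bridge between the silos; A_A has no incoming edges
])

target = "B_C"  # the person who actually knows the answer
for start in sorted(G.nodes):
    reachable = nx.has_path(G, start, target)
    print(f"Asking from {start}: {'can' if reachable else 'cannot'} reach {target}")
```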

PS This makes me remember a weak optimisation trick: if you get stuck in a local minimum, start again by seeding with a new random starting point. Hmm… maybe sending random emails instead?

Do Special Interest Groups Reveal Their Hand in Parliamentary Debate?

Mulling over committee treemaps – code which I really need to revisit and package up somewhere – I started wondering…

…can we use the idea of treemap displays as a way in to help think about how interest groups may – or may not – reveal themselves in Parliament?

For example, suppose we had a view over Parliamentary committees, or APPGs. Add another level of structure to the treemap display showing the members of each committee, with cells of equal area for each member. Now, in a debate, if any of the members of the committee speak, highlight the cell for that member.

With a view over all committees, if the members of a particular committee, or a particular APPG, lit up, or didn’t light up, we might be able to start asking whether the representation from those members was statistically unlikely.

(We could do the same for divisions, displaying how each member voted and then seeing whether that followed party lines?)

Having mulled over this visually inspired insight, we wouldn’t actually need to use treemaps, of course. We could just run a query over the data and do some counting, creating “lenses” that show how particular interest groups or affiliations (committees, APPGs, etc.) are represented in a debate?
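As a sketch of what that counting might look like – with entirely made-up membership and speaker lists, since I haven’t wired this up to any real Parliamentary data – we could count the overlap between a committee’s membership and the speakers in a debate, and use a simple hypergeometric model to ask how surprising that overlap is:

```python
# A rough sketch, not based on any real Parliamentary data feed: given the set of
# members who spoke in a debate and the membership of a committee (or APPG), how
# surprising is the overlap if speakers were drawn at random from the House?
from scipy.stats import hypergeom

house_size = 650                                     # total number of MPs
committee = {"MP_A", "MP_B", "MP_C", "MP_D"}         # hypothetical committee membership
speakers = {"MP_A", "MP_C", "MP_X", "MP_Y", "MP_Z"}  # hypothetical debate speakers

overlap = len(committee & speakers)
# P(seeing at least this many committee members among the speakers by chance)
p = hypergeom.sf(overlap - 1, house_size, len(committee), len(speakers))
print(f"{overlap} committee members spoke; p(at least this many by chance) ~ {p:.3f}")
```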

Authoring Interactive Diagrams and Explorable Explanations

One of the things that the OU has always tended to do well is create clear – and compelling – diagrams and animations to help explain often complex topics. These include interactive diagrams that allow a learner to engage with the diagram and explore it interactively.

At a time when the OU is looking to reduce costs across the board, finding more cost effective ways of supporting the production, maintenance, presentation and updating of our courses, along with the components contained within them, is ever more pressing.

As a have-a-go technology optimist, I’m generally curious as to how technology may help us come up with, as well as produce, such activities.

I’m a firm believer in using play as a tool for self-directed discovery and learning, and practice as a way of identifying or developing, erm, new practice, and I’m also aware that new technology and tools themselves can sometimes require a personal time investment before you start to get productive with them. However, for many, if you don’t get to play often, knowing how to install or start using a new piece of software, let alone how to start playing with it once you’re in, can be a blocker. And that’s if you’ve got – or make – the time to explore new tools in the first place.

Changing a workflow is also not just down to one person changing their own practice – it can heavily depend on immediate downstream factors, such as what the person you hand over your work to is expecting from you in order for them to do their job.

(Upstream considerations can also make life more or less easy. For example, if you want to analyse a data set that the person before you has handed over as a table in a PDF document, you have to do work to get the data out of the document before you can analyse it.)

And that’s part of the problem: tech can often help in several ways, but it is sometimes most effective when you change the whole process; if you stick with the old process and just update one step of the workflow, that can often make things worse, not better.

Sometimes, a workflow can just be bonkers. When we produced material for the FutureLearn Learn to Code MOOC, we used an authoring tool that could generate markdown content. The FutureLearn authoring environment is (I was told) a markdown environment. I was keen to explore an authoring route that would let us publish from the authoring environment to FutureLearn (in the absence of a FutureLearn API, I’d have been happy to finesse one by scraping form controls and bodging my own automation route.) As it was, we exported content from the markdown producing environment into Word, iterated through it there with the editor (introducing errors into code elements), and then someone cut and pasted the content into the FutureLearn editor, presumably restyling it as they did so. Then we had to fix the errors that were either introduced by the editing process, or made it through the editing process, by checking back against code in the original authoring environment. The pure markdown workflow was stymied because even though we could produce markdown, and FutureLearn could (presumably) accept it, the intermediate workflow was a Word based one. (The lesson from this? Innovation can be halted if you have to use legacy processes in a workflow rather than reengineering all of it.)

The OU-XML authoring route has similar quirks: authors typically author in Word, then someone has to copy, paste and retag the content in an XML authoring tool so it’s marked up correctly.

But that’s all by the by, and more than enough for the subject of another post…

Because the topic of this post is a quick round-up of some tools that support the creation – and deployment – of interactive diagrams and explorable explanations. I first came across this phrase in a 2011 post by Bret Victor – Explorable Explanations, and I’ve posted about them a couple of times (for example, Time to Revisit Tangle?).

One of the most identifiable aspects of many explorable explanations is the interactive diagram, where you can explore some dynamic feature of an explanation in an interactive way. For example, exploring the effect of changing parameter values in an equation:

One of the things I’m interested in is frameworks and environments that support “direct authoring” of interactive components that could be presented to students. Ideally, the authoring environment should produce some sort of source code from which the final application can be previewed as well as published. Ideally, there should also be separation between style and “content”, allowing the same asset to be rendered in multiple ways (this might include print as well as online static or interactive content).

Unfortunately, in many cases, direct authoring is replaced by a requirement to use some sort of “source code”. (That’s partly because building UIs that naive users can use can be really difficult, especially if those users refuse to use the UI because it’s a bit clunky – even when the code the UI generates, which is the thing you actually want to produce, is actually quite simple, and it would be much easier if authors wrote that source code directly.)

For example, I recently came across Idyll [view the code and/or read the docs], a framework for creating interactive documents. See the following couple of examples to get a feel for what it can do:

The example online editor gives an example of the markup language (markdown, with extensions) and the rendered, interactive document:

(It’d be quite interesting to see how closely this maps onto the markdown export from a Jupyter notebook that incorporates ipywidgets.)

Moving the sliders in the rendered document changes the variable values and dynamically replots the curve in the chart.
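For comparison, here’s a minimal sketch of how the analogous thing might look in a Jupyter notebook using ipywidgets and matplotlib (my own toy example, not anything exported from Idyll): sliders bound to the parameters of a curve, with the chart replotted whenever a slider is moved:

```python
# A minimal Jupyter notebook sketch: sliders for the parameters of a curve,
# with the plot redrawn whenever a slider value changes.
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact, FloatSlider

x = np.linspace(0, 10, 200)

def plot_curve(a=1.0, b=0.0):
    """Plot y = a * sin(x + b) for the current slider values."""
    plt.figure(figsize=(6, 3))
    plt.plot(x, a * np.sin(x + b))
    plt.ylim(-3, 3)
    plt.show()

interact(plot_curve,
         a=FloatSlider(min=0.1, max=3.0, step=0.1, value=1.0),
         b=FloatSlider(min=0.0, max=6.3, step=0.1, value=0.0))
```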

I can see Idyll becoming a component of the forthcoming OpenCreate tool, so it’ll be interesting to see if anyone else can – partly because it would presumably require downstream buy-in into using the interactive components Idyll bundles with.

Whilst Idyll is a live project, the next one – Apparatus –  looks to have stalled. It has good provenance, though, with one of the examples coming from Bret Victor himself.

Here’s an example of the sort of thing it can produce:

The view can also reveal the underlying configuration:

The scene is built up from a set of simple objects, or from previously created objects (for example, the “Wheel with mark” component). This feature is important because it encourages another useful behaviour amongst new users: creating simple building blocks that do a particular thing, and then assembling those building blocks to do more complex things later on.

The apparatus “manual” fits in one diagram:

The third tool – Loopy – also looks like it may have stalled recently (again, the code is available and the UI is browser based). This tool allows the creation, through direct manipulation, of a particular sort of “systems diagram”, in which one node can positively or negatively influence another:

To create a node, simply draw a circle; to connect nodes, draw a line from one node to another.

You can set the weight, positive or negative:

 

As well as adding and editing text, and moving or deleting items:

You can also animate the diagram, feeding in positive or negative elements from one item and seeing how those changes feed through to influence the rest of the system:

The defining setup of the diagram can be saved in a URI and then shared.
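(I haven’t looked at how LOOPY actually encodes the diagram, but the general trick – serialise the diagram spec and pack it into a URL fragment – might look something like this sketch:)

```python
# A generic sketch of "diagram state in a shareable URI" (not LOOPY's actual format):
# serialise the node/edge spec to JSON and pack it into a URL-safe string.
import base64, json

diagram = {
    "nodes": [{"id": "rabbits"}, {"id": "foxes"}],
    "edges": [{"from": "rabbits", "to": "foxes", "weight": 1},
              {"from": "foxes", "to": "rabbits", "weight": -1}],
}

encoded = base64.urlsafe_b64encode(json.dumps(diagram).encode()).decode()
share_url = f"https://example.com/loopy-like-tool/#{encoded}"
print(share_url)

# ...and the receiving page decodes the fragment to rebuild the diagram
decoded = json.loads(base64.urlsafe_b64decode(encoded))
assert decoded == diagram
```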

All three of these applications encourage the user to explore a particular explanation.

Apparatus and LOOPY both provide direct authoring environments that allow the user to create their own scenes through adding objects to a canvas, although Apparatus does require the user to add arithmetic or geometrical constraints to some items when they are first created. (Once a component has been created, it can just be reused in another diagram.)

Apparatus and LOOPY also carry their own editor with them, so a user could change the diagram themselves. In Idyll, you would need access to the underlying enhanced markdown.

If you know of any other browser based, open source frameworks for creating and deploying standalone, iframe/web page embeddable interactive diagrams and explorable explanations, please let me know via the comments.

PS for a range of other explorable explanations, see this awesome list of explorables.

Ad-Tech – A Great Way in To OSINT

Open Source Intelligence – OSINT – is intelligence that can be collected from public sources. That is to say, OSINT is the sort of intelligence that you should be able to collect using a browser and a public or academic library that also provides access to public subscription content. (For an intro to OSINT, see for example Sailing the Sea of OSINT in the Information Age; for an example context, Threat Intelligence: Collecting, Analysing, Evaluating). OSINT can be used as much by corporates as by the security services. It’s also up for grabs by journalists, civil society activists and stalkers…

Looking at the syllabus for an OSINT beginners’ course, such as IMSL’s Basic Open Source (OSINT) Research & Analysis Tradecraft, turns up the sorts of things you might also expect to see as part of one of Phil Bradley or Karen Blakeman’s ILI search workshops:

  • Appreciation of the OS environment
    • Opportunities, Challenges and Threats
  • Legal and Ethical Guidance
  • Search Tradecraft
    • Optimising Search
    • Advanced Search Techniques
  • Profile Management and Risk Reduction
    • Technical Anonymity/Low Attribution
    • Security Tradecraft
  • Social Media exploitation
    • Orientation around the most commonly used platforms Twitter, Facebook, LinkedIn etc.
    • Identifying influence
    • Event monitoring
    • Situational Awareness
    • Emerging social media platforms
  • Source Evaluation
    • Verifying User Generated Content on Social Media

And as security consultant Bruce Schneier beautifully observed in 2014, [s]urveillance is the business model of the Internet.

What may be surprising, or what may help explain in part their dominance, is that a large part of the surveillance capability the webcos have developed is something they’re happy to share with the rest of us. Things like social media exploitation, for example, allow you to easily identify social relationships, and pick up personal information along the way (“Happy Birthday, sis..”). You can also identify whereabouts (“Photo of me by the Eiffel Tower earlier today”), captioned or not – Facebook and Google will both happily tag your photos for you to make them, and the information, or intelligence, they contain, more discoverable.

Part of the reason that the web companies have managed to grow so large is that they operate very successful two-sided markets. As the FT Lexicon defines it, these are markets that provide “a meeting place for two sets of agents who interact through an intermediary or platform”. In the case of the webcos, the two sides are the “social users”, who gain social benefit from interacting with each other through the platform, and the advertisers, who pay the platform to advertise to the social users (Some Notes on Churnalism and a Question About Two Sided Markets).

A naive sort of social media intelligence would focus, I think, on what can be learned simply through the publicly available activity on the social user side of the platform, albeit activity that may be enriched through automatic tagging by the platform itself.

But there is the other side of the platform to consider too. And the tools on that side of the platform, the tools developed for the business users, are out and out designed to provide the business users – the advertisers – with intelligence about the social users.

Which is all to say that if surveillance is your thing, then ADINT – Adtech Intelligence – could be a good OSINT way in, as a recent paper from the Paul G. Allen School of Computer Science & Engineering, University of Washington describes: ADINT: Using Targeted Advertising for Personal Surveillance (read the full paper; Wired also picked up the story: It Takes Just $1,000 to Track Someone’s Location With Mobile Ads). Here’s the paper abstract:

Targeted advertising is at the heart of the largest technology companies today, and is becoming increasingly precise. Simultaneously, users generate more and more personal data that is shared with advertisers as more and more of daily life becomes intertwined with networked technology. There are many studies about how users are tracked and what kinds of data are gathered. The sheer scale and precision of individual data that is collected can be concerning. However, in the broader public debate about these practices this concern is often tempered by the understanding that all this potentially sensitive data is only accessed by large corporations; these corporations are profit-motivated and could be held to account for misusing the personal data they have collected. In this work we examine the capability of a different actor — an individual with a modest budget — to access the data collected by the advertising ecosystem. Specifically, we find that an individual can use the targeted advertising system to conduct physical and digital surveillance on targets that use smartphone apps with ads.

The attack is predicated in part on knowing the MAID – the Mobile Advertising ID – of a user you want to track, and several strategies are described for obtaining that.

I haven’t looked at adservers for a long time (or Google Analytics for that matter), so I thought I’d have a quick look at what the UIs support. So for example, Google AdWords seems to offer quite a simple range of tools, that presumably let me target based on various things, like demographics:

or location:

or time:

It also looks like I can target ads based on apps a user uses:

or websites they visit:

though it’s not clear to me if I need to be the owner of those apps or webpages?

If I know someone’s email address, it also looks like I can use that to vector an ad towards them? Which means Google cookies presumably associate with an email address?

This email vectoring is actually part of Google’s “Customer Match” offering, which “lets you show ads to your customers based on data about those customers that you share with Google”.

So how about Facebook? As you might expect, there’s a range of audience targeting categories that draw heavily on the information users provide to the system:

(You’ve probably heard the slogan “if you aren’t paying for the product, you are the product” and thought nothing of it. Are you starting to feel bought and sold, yet?)

Remember that fit of anger, or joy, when you changed your relationship status, maybe also flagging a life event (= valuable to advertisers)?

Or maybe when you bought that thing (is there a Facebook Pay app yet, to make this easier for Facebook to track?):

And of course, there’s location:

If you fancy exploring some more, the ADINT paper has a handy table summarising what’s offered by various other adtech providers:

On the other hand, if you want to buy readymade audiences from a data aggregator, try the Oracle Data Marketplace. It looks as if they’ll happily resell you audiences derived from Experian data, for example:

So I’m wondering, what other sorts of intelligence operation could be mounted against a targeted individual using adtech more generally? And what sorts of target identification can be achieved through a creative application of adtech, and maybe some simple phishing to entice a particular user onto a web page you control, which you can then use to grab some preliminary tracking information from them?

Presumably, once you can get your hooks into a user, maybe by enticing them to a web page that you have set up to show your ad so that the adserver can spear the user, you can also use ad retargeting or remarketing (that follows users around the web, in the sense of continuing to show them ads from a particular campaign) to keep a tail on them?

[This post was inspired by an item on Mike Caulfield’s must read Traces weekly email newsletter. Subscribe to his blog – Hapgood – for a regular dose of digital infoskills updating. You might also enjoy his online book Web Literacy for Student Fact-Checkers.]

Library ezproxy Access to Authenticated Subscription Content

I didn’t make it to ILI this year – for the first time they rejected all my submissions, so I guess I’m not even a proper shambrarian now :-( – but I was reminded of it, and of the many interesting conversations I’ve had there in previous years, during a demo of the LEAN Library browser extension by Johan Tilstra (we’d first talked about a related application a couple of years ago at ILI).

The LEAN Library app is a browser extension (available for Chrome, Firefox and Safari) that offers three services:

  • Library Access: seamless access to subscription content using a Library’s ezproxy service;
  • Library Assist: user support for particular subscription sites;
  • Library Alternatives: alternative sources for content that a user doesn’t have access to in one subscription service but does have access to in another.

Johan also stressed several points in the demo:

  • once installed, and the user authenticated, the extension automatically rewrites URLs on domains the library has subscribed to; as soon as you land on a subscribed-to site, you are redirected to a proxied version of the site that lets you download subscription content directly (see the sketch after this list);
  • the extension pops up a branded library panel on subscribed-to sites seamlessly, unlike a bookmarklet, which requires user action to trigger the proxy behaviour; because the pop-up appears without any other user interaction required, when a user visits a site that they didn’t know was subscribed to, they are informed of the fact. This is really useful for raising awareness amongst library patrons of the service the library is providing.
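The URL rewriting itself presumably follows the standard ezproxy login?url= pattern; here’s a minimal sketch of the idea, with hypothetical hostnames and domains, and a made-up marker parameter of the sort that might help with the log-tracking question raised below:

```python
# A minimal sketch of the standard ezproxy "login?url=" redirect pattern, with
# hypothetical hostnames and a whitelist of subscribed-to publisher domains.
from urllib.parse import urlencode, urlparse

EZPROXY_LOGIN = "https://libezproxy.example.ac.uk/login"   # hypothetical proxy host
SUBSCRIBED_DOMAINS = {"www.example-publisher.com", "journals.example.org"}

def proxied_url(url, source="extension"):
    """Rewrite a publisher URL to go via the library proxy, if the domain is subscribed to.

    The made-up `source` marker is the sort of parameter that might (or might not)
    show up in the ezproxy logs to flag that the request came via the extension."""
    if urlparse(url).netloc not in SUBSCRIBED_DOMAINS:
        return url  # not a subscribed-to domain: leave the URL alone
    return f"{EZPROXY_LOGIN}?{urlencode({'url': url, 'src': source})}"

print(proxied_url("https://www.example-publisher.com/article/123"))
```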

I used to try to make a similar sort of point back when I used to bait libraries regularly, under the mantle of trying to get folk to think about “invisible library” services (as well as how to make folk realise they were using such services):

The LEAN Library extension is sensitised to sites the library subscribes to via a whitelist downloaded from a remote configuration site. In fact, LEAN Library host the complete configuration UI, which allows library managers to define and style the content pop-up and define the publisher domains for which a subscription exists. (FWIW, I’m not clear what happens when the journal a user wants to access sits on a publisher site but is not part of a subscription package?)

This approach has a couple of advantages:

  • the extension doesn’t have to try every domain the user visits to see if it’s ezproxy-able – it has a local list of relevant domains;
  • if the Library updates its subscription list, the extension is updated too.

That said, if a user does try to download content, it’s not necessarily obvious how the library would know that the proxied page was “enabled” by the extension. (If the URL rewriter added a code parameter to the URL, would that be trackable in the ezproxy logs?)

It’s interesting to see how this idea has evolved over the years. The LEAN Library approach certainly steps up the ease of use on the administrator side, and arguably for users too. For example, the configuration site makes it easy for admins to customise the properties of the extension, which used to require handcrafting in the original versions of this sort of application.

As to the past – I wonder if it’s worth digging through my old posts on a related idea, the OU Library Traveller, to see whether there are any UX nuggets, or possible features, that might be worth exploring again? That started out as a bookmarklet that built on several ideas by Jon Udell, before moving to a Greasemonkey script (Greasemonkey was a browser extension that let you run your own custom scripts on web pages):

PS In passing, I note that the OU libezproxy bookmarklet is still available. I still use my own bookmarklet several times a week. I also used to have a DOI version that let you highlight a DOI and have it resolved through the proxy (DOI and OpenURL Resolution)? There was a DOI-linkifier too, I think? Here’s a (legacy) related OU Library webservice: EZproxy DOI Resolver

Fragment – DIT4C – Docker Base Containers for Edu Remote Computing Labs

What’s an effective way of helping a student run a desktop application when their own computer won’t run the application locally, for whatever reason? Virtualised software, running remotely, provides one solution. So here’s an example of a project that looks at doing just that: DIT4C (“Data Intensive Tools for the Cloud”), a platform for hosting data analysis tools “in the cloud” using containers [repo].

Prepackaged, standalone containers are defined for a range of applications, including RStudio, Jupyter notebooks, Jupyter+R and OpenRefine.

Standalone Containers With Branded Landing Page

The application containers are built on top of a base container that includes an nginx webserver/proxy, a GoTTY shell and a file uploader. The individual containers then have a “homepage” that links to the particular application:

So what do we have at this point?

  • a branded landing page;
  • a browser-accessed shell:
  • a browser-accessed file uploader:

These services are all running within a single container. I don’t know if there’s a way of linking multiple containers like this using docker-compose? It would require finding some way of announcing the services provided by each container to a central nginx server, which could then link to each from a single homepage. But it would also mean separate terminals and file loaders for each one (though maybe the shared files could be handled as a single mounted volume shared across all the linked containers?).

Once again, I’m coming round to the idea that using a single container to run multiple services, rather than several linked containers each running a single service, is simpler, even if it does go against the (ideal?) model of using containers as part of a small pieces, loosely joined architecture? I think I need to post a simple recipe (or recipes) somewhere showing different ways of running multiple services within a single container. The docker docs – Run multiple services in a container – provide a crib for this at the moment.
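One of the approaches the docker docs describe is using a wrapper script as the container’s command, which launches each service and takes the container down if any of them dies. A minimal sketch of that idea in Python might look like the following (the service commands are placeholders for whatever a particular image actually bundles):

```python
#!/usr/bin/env python3
# A minimal sketch of a wrapper script for running several services in one container:
# launch each service, then exit (taking the container down) if any of them dies.
# The commands below are placeholders, not what DIT4C actually runs.
import subprocess, sys, time

SERVICES = [
    ["nginx", "-g", "daemon off;"],          # front-end webserver/proxy
    ["gotty", "--port", "8001", "bash"],     # browser-accessible shell
    ["python3", "file_uploader.py"],         # hypothetical file upload service
]

procs = [subprocess.Popen(cmd) for cmd in SERVICES]

while True:
    for proc in procs:
        if proc.poll() is not None:          # a service has exited
            for other in procs:
                other.terminate()
            sys.exit(proc.returncode or 1)
    time.sleep(5)
```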

X11 Applications

Skimming the docs, I notice a reference to a base X11 desktop container. Interesting… I have a PhD student looking for an easy way to host a Qt-widget-based application in the cloud for evaluation purposes. To this end, I’ve just started looking around for X11/noVNC web client containers that would allow us to package the app in a simple container and then access it from something like Digital Ocean (given there’s no internal OU docker container hosting service that I’m allowed to access (or am aware of… maybe on the Faculty cluster?)).

So things like this show the way – a container that offers a link to a containerised “desktop” application, in this case QGIS (dit4c/dockerfile-dit4c-container-qgis). (Does the background colour mean anything, I wonder? How could we make use of background colour in OU containers?):


Following the X11 Session link, we get to a desktop:

There’s an icon in the toolbar for the application we want – QGIS:

What I’m thinking now is this could be handy for running the V-REP robot simulator, and maybe Gephi…

It also makes me think that things could be simplified a little further by offering a link to QGIS, rather than X11 Application, and opening the application in full screen mode (on the virtualised desktop) on start-up. (See Distributing Virtual Machines That Include a Virtual Desktop To Students – V-REP + Jupyter Notebooks for some thoughts on how to use VMs to distribute a single pre-launched on startup desktop application to try to simplify the student experience.)

It also makes me even more concerned about the apparent lack of interest in, and even awareness of, the possibilities of virtualised software offerings within the OU. For example, at a recent SIG group on (interactive) maps/mapping, brief mention was made of using QGIS, and of problems arising therefrom (though I forget the context of the problems). Here we have a solution – out there for all to see and anyone to find – that demonstrates the use of QGIS in a prebuilt container. But who, internally, would think to mention that? I don’t think any of the Tech Enhanced Learning folk I’ve spoken to would even consider it, if they are even aware of it as an option?

(Of course, in testing, it might be rubbish… how much bandwidth is required for a responsive experience when creating detailed maps? See also one of my earlier related experiments: Accessing GUI Apps Via a Browser from a Container Using Guacamole, which demonstrated remote access to the Audacity audio editor running in a cloud hosted container.)

The Platform Offering

Skimming through the repos, I (mistakenly, as it happens) thought I saw a reference to resbaz (ResBaz Cloud – Containerised Research Apps as a Service). I was mistaken in thinking I had seen a reference in the code I skimmed through, but not, it seems, in thinking that there is a relationship:

And so it seems that, perhaps more interestingly than the standalone containers, DIT4C is a platform offering (architecture docs), providing authenticated access for users, file persistence (presumably?) and the ability to launch prebuilt docker images as required.

That said, looking at the Github repository commits for the project, there appears to have been little activity since March 2017 and the gitter channel appears to have gone silent at the end of 2016. In addition, the docs for getting an instance of the platform up and running are a little bit too sparse for me to follow easily… [UPDATE: it seems as if the funding did run out/get pulled:-(]

So maybe, as a project, DIT4C is now “of historical interest” only, rather than a live project we might have been able to jump on the back of to get an OU hosted remote computing lab up and running? :-( That said, the ResBaz (Research Bazaar) initiative, a “worldwide festival promoting the digital literacy emerging at the center of modern research”, still seems to be around…

Digital Dementia – Are Google Search and the Web Getting Alzheimer’s?

According to the Alzheimer’s Society, in Alzheimer’s disease – one of the most common forms of dementia – memory lapses tend to be one of the first symptoms sufferers become aware of, along with “difficulty recalling recent events and learning new information”.

One of the things I have been aware of for some time, but have only recently started trying to pay more attention to, is how Google search increasingly responds to many of my tech-related web queries with results dated 2013 and 2014. In addition, the majority of traffic to my blog is directed to a few posts that are themselves several years old, and that were shared – through blog links and links from other public websites – at the time they were posted.

(I also note that Google web search is increasingly paranoid. If I run search limited queries, for example using the site: or inurl: or filetype: search limits, it often interrupts the search with a dialog asking if I am a robot.)

So I’m wondering, has Google web search, and the web more generally, got a problem?

Google’s early days of search, which helped promote its use, were characterised by a couple of things that I remember discovering via the MetaCrawler web search engine (which aggregated results from several other web search engines): one was that the results were relevant, the other was that the Google results came back quickly.

Part of Google’s secret sauce at the time was PageRank, an algorithm that mined the link structure of the web – how websites linked to pages on other sites – to try to work out which pages the web thought were important. The intuition was that folk link to things they are (generally) happy to recommend, or reference as in some way authoritative, and so by mining all these links you could rank pages, and websites, according to how well referenced they were, and how well regarded the linking sites were in turn.
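The textbook version of the idea is simple enough to sketch in a few lines of Python – repeatedly share each page’s score out along its outgoing links, with a damping factor to model a reader occasionally jumping to a random page (this is the original published algorithm, not whatever Google actually runs today):

```python
# A toy sketch of PageRank by power iteration over a tiny made-up link graph.
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {p: 1 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                      # dangling page: share score evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / len(pages)
            else:
                for target in outlinks:
                    new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

# Toy web: pages linked to by well-linked pages end up with higher scores.
print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}))
```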

Since its early days, Google has added many more ranking factors (that is, decision criteria it uses to decide which results to put at the top of a search results listing for a particular query) to its algorithm.

To the extent that Google can generate a significant proportion of a website’s traffic from its search results pages, this led to many websites engaging in “search engine optimisation”, in which they try to identify Google’s secret ranking factors and maximise their webpages’ scores against them. It also means that the structural properties of web content itself may be being shaped by Google, or at least by web publishers’ ideas of what the Google search engine favours.

If it is true that many of the pages from 2013 or 2014 are the most appropriate web results for the technical web searches I run, this suggests that the web may have a problem: as a memory device, new memories are not being laid down (new, relevant content, is not being posted).

On the other hand, it may be that content is still being laid down (I still post regularly to this blog, for example), but it is being overlooked – or forgotten – by the gateways that mediate our access to it, which for many is Google web search.

To the extent that Google web search still uses PageRank, this may reflect a problem with the web. If other well regarded sites don’t link to a particular web page, then the link structure of the web that gives sites their authority (based, as it is in PageRank, on the quality of links incoming from other websites) is impoverished, and the old PageRank factors, deeply embedded in the structure of the web that holds over from 2013 or 2014, may dominate. Add in to the mix that one other ranking factor is likely to be the number of times a link is followed from the Google search results listing (which in turn is influenced by how high up the results the link appears), and you can start to see how a well told story, familiar in the telling, keeps on being retold: the old links dominate.

If the new “memories” are still being posted into the web, then why aren’t they appearing in the search results? Some of them may do, at least in the short term. Google’s web crawlers never sleep, so content is being regularly indexed, often shortly after it was posted. (I still remember a time when it could take days for a web page to be indexed by the search engines; nowadays it can be near instant.) If a ranking factor is recency (as well as relevance), a new piece of content can get a boost if a search to which it is relevant is executed soon after the content is posted.

Recently posted content may also get a boost from social media shares (maybe?), in which a link to a piece of content is quickly – and easily – shared via a social network. The “half-life” of links shared on such media is not very long, links typically being shared soon after they are first seen, and then forgotten about.

Such sharing causes a couple of problems when it comes to laying down structural “web memories”. For example, links shared on any given social media site may not be indexable – either usefully or at all, either in whole or in part – by the public web search engines, for several reasons:

  • shares are often “ephemeral”, in that they may disappear (to all intents and purposes) from the social network after a short period of time. (Just try searching for a link you saw shared on Twitter three or four weeks ago, if you can remember one from that far back…).
  • the sheer volume of links shared on global social networks can be overwhelming;
  • the authority of people sharing links may be suspect, and the fact that links are shared by large numbers of unauthoritative actors may swamp signal in noise. (There is also the issue of the number of false actors on social media – easily created bot accounts, for example, slaved to sharing or promoting particular sorts of content.)

Whilst it’s never been easier to “share” a link, or highlight it (through “favouriting” or “liking”), the lack of effort in doing so is reflected in the lack of interest recorded in the deeper structure of the web. If you don’t add your recommendations to the structural web, or contribute content to it, it starts to atrophy. However, if you take the time to make a permanent mark in the structure of the web – by posting a blog post to a lasting, public domain, with a persistent URL that others can link to, and in turn embedding your content in a contextually meaningful way by linking to other posts that you value as useful context for your own post – you can help build new memories and help the web keep digital dementia at bay.

See also: The Web Began Dying in 2014, Here’s How and The Web We Lost /via @charlesarthur