Tipped off by @brlamb to this Huffington Post story on Pearson ‘Education’ — Who Are These People? (which in turn led to this SEC filing; I found the risk assessment on pages 8-10 interesting), I started wondering about the web domains owned by Pearson.

Looking up the domain registration details for turned up a handful of nameservers –,,, – which we can use as the basis for a reverse lookup to see what other sites are registered with the same nameservers (and which presumably, therefore, relate to Pearson activities).
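As a rough sketch of what that reverse lookup boils down to once you have some data: given a table of domains and the nameservers each one uses, invert it so that each nameserver points at the domains hanging off it. Everything here is an assumption for illustration; the `records` dictionary, the `domains_by_nameserver` function name and the `example.*` names are made-up placeholders, not real Pearson data or the output of any particular lookup service.

```python
from collections import defaultdict


def domains_by_nameserver(records):
    """Invert a domain -> nameservers mapping into nameserver -> sorted domains."""
    index = defaultdict(set)
    for domain, nameservers in records.items():
        for ns in nameservers:
            # DNS names are case-insensitive, and some tools append a trailing dot,
            # so normalise before using the nameserver as a key.
            index[ns.lower().rstrip(".")].add(domain.lower())
    return {ns: sorted(domains) for ns, domains in index.items()}


# Hypothetical sample data: placeholder domains and nameservers only.
records = {
    "example.com": ["ns1.example.net", "ns2.example.net"],
    "example.org": ["NS1.example.net."],
    "example.edu": ["ns9.example.net"],
}

index = domains_by_nameserver(records)
for ns, domains in sorted(index.items()):
    print(ns, len(domains), domains)
```

The hard part, of course, is getting hold of the domain-to-nameserver table in the first place; the inversion itself is trivial.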

So, for example, Gwebtools turns up a couple of thousand or so domains dangling off, but I couldn’t get Haystax extractor to scrape more than a single page (not used it before? Maybe I was doing something wrong? Or maybe Chrome was playing up: too many open tabs again?!). I’m also too tired right now to write a scraper, having been struggling to answer ReCaptchas all night (I guess that by now they’re completely inaccessible if you have dyslexia? It often takes me 5 or 6 refreshes before I feel confident going for one!). Which is to say, if you scrape the data describing all the domains associated with each of the Pearson nameservers, please post a link to it in the comments ;-)

I don’t remember if I tried grabbing Pearson data from OpenCorporates to do a corporate sprawl graph..? I guess I should try and find what trademarks they have registered too?

Which reminds me: is there a free, open source of director listings for UK companies yet? And how’s the Lobbyists register campaign (or WhosLobbying scraping) coming on? Is there a reverse lookup by company, so that, for example, we could see who reps from Pearson had been chatting to?

I wonder also if Pearson support any All-Party Parliamentary Groups…?

PS this was handy, at first… How to Find the other Websites of a Person?

PPS See also A Gust of Wind Blows Across HE on Pearson’s VUE assessment centres being used for supervised examinations for open online courses.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

4 thoughts on “Fragment: Pondering Mapping the Pearson Network”

  1. The URLs (from are in a Google spreadsheet at

    Note that NS appears to include the same URLs as NS2 (I did not do a match), while OLDTXNDS2 is almost, but not quite, the same as USRXDNS1.

    Also, for what it’s worth, the numbers of domains reported by do not match what lists. According to, NS and NS2 have more domains than lists, and lists more domains for OLDTXNDS2 and USRXDNS1 than reports.

    1. @ed Thanks for doing that/digging around. Does SpyOnWeb support paging of results do you know (so you can get to see all the domains it claims to know about?)

    2. SpyOnWeb does not appear to support paging of results at this time – it apparently just displays a max of 100 URLs per nameserver. And it asks users to not “use any robot, spider, other automated device or any tool-bar, web-bar, other web-client, device, software, routine or manual process, to monitor or scrap information from this Site or the Service, or bypass any robot exclusion request (either on headers or anywhere else on the Site).”
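The match between nameserver lists that the first comment says was not actually run could be sketched with plain set operations, for instance on columns pasted out of a spreadsheet. This is only a minimal sketch under stated assumptions: the `compare_domain_lists` function name and the two sample lists are hypothetical placeholders, not the real contents of any of the nameserver lists discussed above.

```python
def compare_domain_lists(a, b):
    """Report the overlap and the differences between two lists of domains."""
    # Normalise case and whitespace, and drop blank cells, before comparing.
    a_set = {d.strip().lower() for d in a if d.strip()}
    b_set = {d.strip().lower() for d in b if d.strip()}
    union = a_set | b_set
    return {
        "shared": sorted(a_set & b_set),
        "only_a": sorted(a_set - b_set),
        "only_b": sorted(b_set - a_set),
        # Jaccard similarity: 1.0 means identical lists, 0.0 means no overlap.
        "jaccard": len(a_set & b_set) / len(union) if union else 1.0,
    }


# Hypothetical sample columns: placeholder domains only.
list_a = ["example.com", "Example.org", "example.net"]
list_b = ["example.org", "example.net", "example.edu", ""]

report = compare_domain_lists(list_a, list_b)
print(report)
```

A similarity of exactly 1.0 would confirm that two nameservers list the same domains, while the `only_a` and `only_b` entries would show where an "almost, but not quite" pair actually differs.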

