Googling Nasties and Oopses on University and Public Sector Websites
Following a (re?)tweet from, err, someone last Friday about searching Google for, err, something like fast track degree site:ac.uk, I stumbled across this (on a search for "cheap" site:ac.uk):
(Spot the “privacy leak” on that screenshot…)
There then followed a series of IT-baiting tweets trying to find inappropriate content across site:ac.uk and site:gov.uk websites ;-)
Just sayin’…;-)
PS you may also like to try searching for things like:
- "confidential" "internal use only" filetype:pdf
- overspend filetype:xls site:gov.uk
- intitle:viagra intitle:buy site:ac.uk
and so on…
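If you find yourself building lots of these queries, the pattern is regular enough to script. What follows is a minimal sketch rather than anything official: the helper names (build_query, search_url) are made up for illustration, and it assumes the usual https://www.google.com/search?q=... URL form. It only builds the query strings and URLs; it doesn't fetch any results.

```python
from urllib.parse import urlencode

# Assumed base URL for a plain Google web search.
SEARCH_URL = "https://www.google.com/search"


def build_query(words=(), phrases=(), intitle=(), filetype=None, site=None):
    """Assemble a search string from the operators used in the PS list.

    words    -- bare search terms, e.g. overspend
    phrases  -- exact-match terms, wrapped in double quotes
    intitle  -- terms that must appear in the page title
    filetype -- file extension limit, e.g. pdf or xls
    site     -- domain limit, e.g. ac.uk or gov.uk
    """
    parts = list(words)
    parts += [f'"{p}"' for p in phrases]
    parts += [f"intitle:{w}" for w in intitle]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def search_url(query):
    """URL-encode the query string so it can be pasted into a browser."""
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


# The three example searches from the list above.
queries = [
    build_query(phrases=["confidential", "internal use only"], filetype="pdf"),
    build_query(words=["overspend"], filetype="xls", site="gov.uk"),
    build_query(intitle=["viagra", "buy"], site="ac.uk"),
]

for q in queries:
    print(q)
    print(search_url(q))
```

Swapping the site value (ac.uk, gov.uk, or a single institution's domain) is the easy way to point the same query at a different patch.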
PPS see also: Whose Investor Relations Sites Do Thomson Reuters Host? A Form of URL Hacking…


Um, is it really ethical to post that information?
James Thorniley
January 11, 2012 at 6:27 pm
@james there was a bit of a Twitter debate today around the dumping of Windows training for teaching programming in UK skills, all wrapped up in confused rhetoric about digital literacy, or digital skills, or something… I’m just posting what’s bleedin’ obvious to anyone who knows how to use a search engine…
Tony Hirst
January 11, 2012 at 10:26 pm
When you do a search like this to look for things that need fixing, use Google’s time restriction features to ensure you only see fresh results: Google is bad at removing stale results for compromises that have since been cleaned up.
Tony Finch
January 11, 2012 at 11:25 pm
@Tony Yes, that’s a good point… but the fact that the pages are still indexed means that maybe a visit is needed to Google Webmaster Central to put in a removal request (which can be made against cached copies too… http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663691 )
That said, for journalists and other investigators looking for smoking gun remnants of documents that were posted and have maybe since been removed from live sites, knowing how to make effective use of Google cache results is an important skill…;-) [I'll leave a discussion about the ethics of rummaging in search engine caches for another day;-)]
Tony Hirst
January 12, 2012 at 12:03 am
[...] The first context is “data journalism”, and the extent to which data journalists need to be able to do programming (in the sense of identifying the steps in a process that can be automated and how they should be sequenced or organised) versus writing code. (I can’t write code for toffee, but I can read it well enough to copy, paste and change bits that other people have written. That is, I can appropriate and reuse other people’s code, but can’t write it from scratch very well… Partly because I can’t ever remember the syntax and low level function names. I can also use tools such as Yahoo Pipes and Google Refine to do coding like things…) Then there’s the question of what to call things like URL hacking or (search engine) query building? [...]
Different Speeches? Digital Skills Aren’t just About Coding… « OUseful.Info, the blog…
January 12, 2012 at 1:12 pm
[...] core skills on “safe” datasets or randomly generated data files.) In the post Googling Nasties and Oopses on University and Public Sector Websites, a commenter asked: “is it really ethical to post that information?” in the context of [...]
Exploring GP Practice Level Prescribing Data « OUseful.Info, the blog…
April 27, 2012 at 9:10 pm