It struck me that it would be really handy to have a tool that could alert you to proposed amendments in your favourite Act in whatever Bills happen to be live at the moment.
A quick look around one of the .gov.uk websites turns up an advanced search form that lets you search over current Bills – UK Parliament Advanced Search:
Running a search on the phrase “the Data Protection Act 1998” and sorting the results by most recent first gave me a URL I could tinker with…
Clicking on a link should take you to the point in a Bill that mentions the Act you’re interested in:
Because it’s a pipe, I get the RSS/JSON feed for free… which I can now subscribe to and use as an alerting service (for as long as the pipe’s screenscraping part works!). Ideally, of course, the parliamentary search would make results available as RSS…
As ever, this pipe took almost as much time to blog as it took to create…!
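For anyone who'd rather poll the feed from a script than a feed reader, here's a minimal sketch of the alerting idea: parse the RSS, and report any items whose links haven't been seen before. The sample feed, the example.org links and the `new_items` helper are all made up for illustration; the real pipe's feed may well be structured differently.

```python
import xml.etree.ElementTree as ET


def new_items(rss_text, seen_links):
    """Return (title, link) pairs for feed items whose link is not in seen_links."""
    root = ET.fromstring(rss_text)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if link and link not in seen_links:
            fresh.append((title, link))
    return fresh


# A tiny, invented feed standing in for the pipe's RSS output.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example alerts</title>
<item><title>Bill A, clause 3</title><link>http://example.org/bill-a#c3</link></item>
<item><title>Bill B, schedule 1</title><link>http://example.org/bill-b#s1</link></item>
</channel></rss>"""

# Links we have already been alerted about (hypothetical).
seen = {"http://example.org/bill-a#c3"}
alerts = new_items(SAMPLE_FEED, seen)
for title, link in alerts:
    print(f"New mention: {title} <{link}>")
```

In practice you'd fetch the feed text over HTTP on a schedule and persist the set of seen links between runs; both are left out here to keep the sketch self-contained.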
So we all know that the Google web search engine famously (and not just apocryphally) returns different results from its different national representations (google.com, google.co.uk, google.cn, etc.)…
…and hopefully we all know that if you are signed in to Google when you run a search, the default settings are such that Google will record your search and search results click-thru behaviour using Google Web History, and then in turn potentially use this intelligence to tweak your personal search results…
…and depending on how much you’ve been paying attention, you may know that Google Search Wiki lets you “customize search by re-ranking, deleting, adding, and commenting on search results. With just a single click you can move the results you like to the top or add a new site. You can also write notes attached to a particular site and remove results that you don’t feel belong. These modifications will be shown to you every time you do the same search in the future.”
Well now it seems that Google is experimenting with Google Preferred Sites, which lets selected guinea pigs “set your Google Web Search preferences so that your search results match your unique tastes and needs. Fill in the sites you rely on the most, and results from your preferred sites will show up more often when they’re relevant to your search query” (see the official support page here: Preferences: Preferred sites).
So the next time you give someone directions to a website using an instruction of the form “just google whatever, and it’ll be the first or second result”, bear in mind that it might not be…
(For what it’s worth, I run a cookie-free, never-logged-in-to-Google browser so that I can compare my logged in’n’personalised Google results page with a “raw organic” Google results page.)
Although Google manages to serve up pretty good results most of the time, sometimes it makes sense to give the search engine a hand by limiting the search to only provide results from a particular set of pages, or domains. So in this post I’ll describe a couple of “emergent” or “discovered” custom search engines that are available in tools you might already use.
(Custom search engines provide one way of achieving this, of course – set the limits within which you want results returned, et voila… But creating custom search engines, as such, is not necessarily something that would occur to most people.)
Let’s start with delicious, the social bookmarking service, in which users save links as bookmarks labelled with one or more tags.
Did you know that there are now a range of tools within delicious that let you search over the titles and descriptions of different sets of bookmarks?
If you pick a particular user, the default Search these bookmarks search will just search over the title and description fields of the bookmarks saved by that user. If you further limit the view of the bookmarks to those tagged in a particular way by a particular user, then the Search these bookmarks search will be limited to just those bookmarks. In other words, Search these bookmarks is context sensitive to the user, tag or user’n’tag combination that is currently selected.
(Remember that the full text of the bookmarked pages is not being searched – only the bookmark title and description fields – which is one good reason why it makes sense to fill in a bit of description about every bookmark you make: it makes (re)discovery of links at a future time easier…)
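To make the “context sensitive” idea concrete, here's a toy model of it: a search that only looks at title and description fields, and that is optionally scoped to a user, a tag, or both. This is just an illustrative sketch over an in-memory list, not how delicious actually implements it; the `Bookmark` class, the `search_bookmarks` function and the sample data are all invented.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    user: str
    title: str
    description: str = ""
    tags: set = field(default_factory=set)


def search_bookmarks(bookmarks, query, user=None, tag=None):
    """Match query against title/description only, within the selected user/tag scope."""
    q = query.lower()
    hits = []
    for b in bookmarks:
        if user is not None and b.user != user:
            continue  # outside the selected user's bookmarks
        if tag is not None and tag not in b.tags:
            continue  # outside the selected tag
        if q in b.title.lower() or q in b.description.lower():
            hits.append(b)
    return hits


# Invented sample data.
BOOKMARKS = [
    Bookmark("alice", "Yahoo Pipes tutorial", "screenscraping with pipes", {"pipes", "howto"}),
    Bookmark("alice", "Search operators", "handy search limits", {"search"}),
    Bookmark("bob", "Pipes and feeds", "", {"pipes"}),
]

everyone = search_bookmarks(BOOKMARKS, "pipes")
just_alice = search_bookmarks(BOOKMARKS, "pipes", user="alice")
alice_search_tag = search_bookmarks(BOOKMARKS, "pipes", user="alice", tag="search")
print(len(everyone), len(just_alice), len(alice_search_tag))
```

Note how a bookmark with an empty description can only be found via its title, which is the practical argument for filling descriptions in.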
So where else do people create their own resource collections, or resource feeds? Google Reader, maybe?
And as it happens, another emergent, “auto-created” custom search engine can be found just there:
The Google Reader search provides a blogsearch facility that lets you limit your search to the content of the RSS feeds you subscribe to in a variety of ways: the content of all your feeds, the content of the items you’ve read, the content of feeds bundled in various folders, and so on.
So for example, you could bundle a set of RSS feeds together in a single folder, and then, as if by magic, you have a custom search engine that searches over just the contents of those feeds.
With Google’s “official” blogsearch tool no longer functioning as such (rather than just indexing feed content – that is, just actual blog posts – it appears to be indexing blog web pages, so you get contaminated results that may only be a “hit” because your query was matched by sidebar content or other blog website fluff), the Google Reader search tool goes back to basics…
…the only problem is that, so far as I can tell, there is no way to subscribe to the results of any of these searches, and there is no published (or community documented) API for the Google Reader search facility… (so if someone can watch the AJAX calls and produce one, I’d be really grateful :-)
(By the by, can you define filters on folders in Google Reader, a bit like iTunes Smart Playlists?)
Link search: if you’re in the habit of searching social bookmarking sites such as delicious for useful links, whether by pivoting around particular tags or tag combinations, or by using the delicious search box, you might also be interested in searching for tweeted links. Here are a couple of ways of doing it…
The “official way”, using a Twitter advanced search form – just select the “contains Links” option.
This invokes a special search limit, filter:links, which you can also enter directly into the Twitter search box:
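As a rough sketch of how such a query might be put together in a script: the only thing taken from the above is the filter:links operator itself; the `links_query` helper and the example search term are made up, and the snippet only builds a URL-encoded `q` parameter rather than assuming any particular search endpoint.

```python
from urllib.parse import urlencode


def links_query(terms):
    """Append the filter:links search limit to a set of search terms."""
    return f"{terms} filter:links"


query = links_query("ouseful")       # hypothetical search term
encoded = urlencode({"q": query})    # ready to append to a search URL
print(query)
print(encoded)
```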
If for any reason that search limit isn’t working, here’s a workaround that makes use of Twitter search’s partial string matching capability:
Fan out: see which of your tweets have been retweeted by others (maybe;-)
This trick relies on a convention that has emerged in which Twitterers use a pattern along the lines of RT @username “the original tweet”.
(See also the ReTweetist service, which will plot which of your messages have been retweeted, as well as the most popular current retweets.)
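If you've already got a pile of tweets to hand, the same convention can be checked for with a simple pattern match. This is a minimal sketch that assumes retweets follow the RT @username form described above (plenty won't); the `is_retweet_of` helper and the example_user name are invented for illustration.

```python
import re


def is_retweet_of(text, username):
    """True if text contains 'RT @username' (case-insensitive, whole-name match)."""
    pattern = re.compile(r"\bRT\s+@" + re.escape(username) + r"\b", re.IGNORECASE)
    return bool(pattern.search(text))


# Invented example tweets.
tweets = [
    "RT @example_user a handy search trick",
    "Nice one! rt @Example_User a handy search trick",
    "@example_user thanks for the tip",
    "RT @example_user2 something else entirely",
]
retweets = [t for t in tweets if is_retweet_of(t, "example_user")]
print(retweets)
```

The trailing word boundary is there so that a retweet of a longer username that merely starts with the same characters isn't counted.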
Also remember that you can subscribe to RSS feeds of saved searches based on these query types:
Locale Based Searches
Want to know who’s recently been twittering (possibly) from near a particular location? Set the location options in the advanced search form, and run an otherwise empty query (i.e. no search terms in the search box).
So for example:
Now it used to be that you could search people’s biography or location strapline in Twitter, and find people to follow that way (that’s how I found several fellow Isle of Wight twitterers), but that doesn’t seem to be possible using the “Find People” service at the moment. (And I can’t check to make sure, because the “Find People” service is temporarily stressed (i.e. down) again…)
So here’s a Google hack way round finding Twitterers from a particular location – construct a query of the form:
This works as follows – look for the search term, on twitter.com (site:twitter.com), but try not to return results from tweets (-inurl:status) or where part of the location appears in the user’s Twitter ID (-intitle:wight). If an individual’s page is indexed when there’s a tweet showing that contains the search term, then you may get the page returned as a result. But more likely you’ll only get results from pages where the search term is always present, such as when it’s part of a person’s bio… In a sense, this is a bit like searching over a fixed set of on-page bio/location metadata that happens to be indexable by a web search engine.
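The pieces of that query can be assembled with a small helper, sketched below. The three operators (site:twitter.com, -inurl:status, -intitle:…) are the ones described above; the `location_bio_query` name is made up, and wrapping the location in double quotes to force a phrase match is my assumption rather than part of the recipe.

```python
def location_bio_query(location, exclude_title_word):
    """Build a web search query for Twitter profile pages mentioning a location."""
    return (
        f'"{location}" '                   # the search term (quoted as a phrase: an assumption)
        "site:twitter.com "                # only pages on twitter.com
        "-inurl:status "                   # try to skip individual tweet pages
        f"-intitle:{exclude_title_word}"   # skip pages where the word is in the Twitter ID/title
    )


query = location_bio_query("isle of wight", "wight")
print(query)
```

Swapping in a different place name and exclusion word gives the equivalent query for another location.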