Tracking UK Parliamentary Act Amendments

A flurry of posts around the interwebs today (e.g. + A CONCEALED ASSAULT ON PRIVACY +) picked up on some proposed amendments to the Data Protection Act that have found their way into the Coroners and Justice Bill that I posted about last week (Data Sharing is Good, Right? Or is HM Gov Evil?).

It struck me that it would be really handy to have a tool that could alert you to proposed amendments in your favourite Act in whatever Bills happen to be live at the moment.

A quick look at one of the .gov.uk websites turns up an advanced search form that lets you search over current bills – UK Parliament Advanced Search:

Running a search on the phrase “the Data Protection Act 1998” and sorting the results by most recent first gave me a URL I could tinker with…

So here’s a pipe that’ll grab the most recent bills mentioning a particular act:

Clicking on a link should take you to the point in a Bill that mentions the Act you’re interested in:

Being a pipe, I get the RSS/JSON feed for free… which I can now subscribe to and use as an alerting service (for as long as the pipe’s screenscraping part works!) Ideally, of course, the parliamentary search would make results available as RSS…
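For anyone who'd rather poll the feed from a script than from a feed reader, here's a minimal sketch of the alerting idea. It assumes the pipe's output is plain RSS 2.0; the sample feed, the example.org links and the `new_items` helper are all made up for illustration and are not the pipe's actual output.

```python
import xml.etree.ElementTree as ET


def new_items(rss_xml, seen_links):
    """Return (title, link) pairs for feed items whose link is not in seen_links."""
    root = ET.fromstring(rss_xml)
    fresh = []
    # RSS 2.0 keeps its entries in <channel><item> elements.
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link and link not in seen_links:
            fresh.append((title, link))
    return fresh


# Hypothetical feed content standing in for the pipe's RSS output.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Bills mentioning an Act</title>
<item><title>Example Bill A</title><link>http://example.org/bill-a</link></item>
<item><title>Example Bill B</title><link>http://example.org/bill-b</link></item>
</channel></rss>"""

# Links already alerted on in an earlier run.
seen = {"http://example.org/bill-a"}
alerts = new_items(SAMPLE_FEED, seen)
for title, link in alerts:
    print(f"New mention: {title} -> {link}")
```

In practice you'd fetch the feed text from the pipe's RSS URL and persist the set of seen links between runs.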

As ever, this pipe took almost as much time to blog as it took to create…!

So maybe Charles Arthur should rethink “If I had one piece of advice to a journalist starting out now, it would be: learn to code” and instead focus on Learning to Think Like A Programmer? ;-)

PS see also: They Work For You: Free Our Bills and They Work For You: Free Our Bills (Techy Stuff).

Another Nail in the Coffin of “Google Ground Truth”?

So we all know that the Google web search engine famously (and not just apocryphally) returns different results from its different national representations (google.com, google.co.uk, google.cn, etc.)…

…and hopefully we all know that if you are signed in to Google when you run a search, the default settings are such that Google will record your search and search results click-thru behaviour using Google Web History, and then in turn potentially use this intelligence to tweak your personal search results…

…and depending on how much you’ve been paying attention, you may know that Google Search Wiki lets you “customize search by re-ranking, deleting, adding, and commenting on search results. With just a single click you can move the results you like to the top or add a new site. You can also write notes attached to a particular site and remove results that you don’t feel belong. These modifications will be shown to you every time you do the same search in the future.”

Well now it seems that Google is experimenting with Google Preferred Sites, which lets selected guinea pigs “set your Google Web Search preferences so that your search results match your unique tastes and needs. Fill in the sites you rely on the most, and results from your preferred sites will show up more often when they’re relevant to your search query” (see the official support page here: Preferences: Preferred sites).

So the next time you give someone directions to a website using an instruction of the form “just google whatever, and it’ll be the first or second result”, bear in mind that it might not be…

(For what it’s worth, I run a cookie-free, never-logged-in-to-Google browser so that I can compare my logged in’n’personalised Google results page with a raw “organic” Google results page.)

Discovered Custom Search Engines

Although Google manages to serve up pretty good results most of the time, sometimes it makes sense to give the search engine a hand by limiting the search to only provide results from a particular set of pages, or domains. So in this post I’ll describe a couple of “emergent” or “discovered” custom search engines that are available in tools you might already use.

(Custom search engines provide one way of achieving this, of course – set the limits within which you want results returned, et voila… But creating custom search engines, as such, is not necessarily something that would occur to most people.)

Let’s start with delicious, the social bookmarking service, in which users save links as bookmarks, each labelled with one or more tags.

Did you know that there is now a range of tools within delicious that let you search over the titles and descriptions of different sets of bookmarks?

If you pick a particular user, the default Search these bookmarks search will just search over the title and description fields of the bookmarks saved by that user. If you further limit the view of the bookmarks to those tagged in a particular way by a particular user, then the Search these bookmarks search will be limited to just those bookmarks. In other words, Search these bookmarks is context sensitive to the user, tag or user’n’tag combination that is currently selected.
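To make the "context sensitive" idea concrete, here's a toy sketch of a search that only looks at title and description, scoped by whichever user and/or tag is currently selected. The `Bookmark` record and `search_bookmarks` function are hypothetical, for illustration only; this is not how delicious actually implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    user: str
    title: str
    description: str
    tags: set = field(default_factory=set)


def search_bookmarks(bookmarks, query, user=None, tag=None):
    """Match query against title/description only, within the user/tag context."""
    q = query.lower()
    results = []
    for b in bookmarks:
        # Narrow to the currently selected context first.
        if user is not None and b.user != user:
            continue
        if tag is not None and tag not in b.tags:
            continue
        # The bookmarked page itself is never searched, only these two fields.
        if q in b.title.lower() or q in b.description.lower():
            results.append(b)
    return results


# Made-up bookmarks for two made-up users.
BOOKMARKS = [
    Bookmark("alice", "Yahoo Pipes tutorial", "Intro to building pipes", {"pipes", "howto"}),
    Bookmark("alice", "RSS spec", "", {"rss"}),
    Bookmark("bob", "Pipes examples", "Screenscraping with pipes", {"pipes"}),
]

everyone = search_bookmarks(BOOKMARKS, "pipes")
alice_howto = search_bookmarks(BOOKMARKS, "pipes", user="alice", tag="howto")
print(len(everyone), len(alice_howto))
```

Note how the bookmark with the empty description can only ever be found via its title, which is the practical argument for filling descriptions in.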

(Remember that the full text of the bookmarked pages is not being searched – only the bookmark title and description fields – which is one good reason why it makes sense to fill in a bit of description about every bookmark you make: it makes (re)discovery of links at a future time easier…)

So where else do people create their own resource collections, or resource feeds? Google Reader, maybe?

And as it happens, another emergent, “auto-created” custom search engine can be found just there:

The Google Reader search provides a blogsearch facility that lets you limit your search to the content of the RSS feeds you subscribe to in a variety of ways: the content of all your feeds, the content of the items you’ve read, the content of feeds bundled in various folders, and so on.

So for example, you could bundle a set of RSS feeds together in a single folder, and then, as if by magic, you have a custom search engine that searches over just the contents of those feeds.

With Google’s “official” blogsearch tool no longer functioning as such (rather than just indexing feed content – that is, just actual blog posts – it appears to be indexing blog web pages, so you get contaminated results that may only be a “hit” because your query was matched by sidebar content or other blog website fluff), the Google Reader search tool goes back to basics…

…the only problem is that, so far as I can tell, there is no way to subscribe to the results of any of these searches, and there is no published (or community documented) API for the Google Reader search facility… (so if someone can watch the AJAX calls and produce one, I’d be really grateful :-)

(By the by, can you define filters on folders in Google Reader, a bit like iTunes Smart Playlists?)

See also: Search Hubs and Custom Search at ILI2007.

PS if you are looking for an effective blog search engine, Icerocket has been grabbing the buzz lately…

A Couple of Twitter Search Tricks…

Just a quickie post, this one, to describe a couple of Twitter search tricks’n’tips (which is to say, this is an infoskills post, right?;-)

You can find the Twitter search tool at http://search.twitter.com. I actually call it in my browser using the keyword “tw” associated with a Firefox Keyword Search.

Link search: if you’re in the habit of searching social bookmarking sites such as delicious for useful links, whether by pivoting around particular tags or tag combinations, or by using the delicious search box, you might also be interested in searching for tweeted links. Here are a couple of ways of doing it…

The “official way”, using a Twitter advanced search form – just select the “contains Links” option.

This invokes a special search limit, filter:links, which you can also enter directly into the Twitter search box:
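If you want to generate these searches programmatically, a minimal sketch might look like the following. The only thing taken from above is the filter:links limit and the search.twitter.com host; the `/search?q=` path and the `link_search_url` helper name are my assumptions about the URL shape, so check them against what the search form actually produces.

```python
from urllib.parse import urlencode

# Assumed results path on the search host mentioned above.
SEARCH_BASE = "http://search.twitter.com/search"


def link_search_url(terms):
    """Build a search URL that limits results to tweets containing links."""
    query = f"{terms} filter:links".strip()
    return f"{SEARCH_BASE}?{urlencode({'q': query})}"


url = link_search_url("custom search")
print(url)
```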

If for any reason that search limit isn’t working, here’s a workaround that makes use of Twitter search’s partial string matching capability:

Fan out: see which of your tweets have been retweeted by others (maybe;-)
This trick relies on a convention that has emerged in which Twitterers use a pattern along the lines of RT @username “the original tweet”.
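As a rough sketch of how that convention can be used, the snippet below builds the search phrase for a given username and also checks whether a piece of tweet text follows the pattern. The function names and the example username are invented for illustration, and since the convention is informal, expect both misses and false hits.

```python
import re


def retweet_query(username):
    """Search phrase for finding retweets of a user, following the RT convention."""
    return f"RT @{username.lstrip('@')}"


def is_retweet_of(tweet_text, username):
    """True if the text contains 'RT @username' (case-insensitive, whole name)."""
    name = re.escape(username.lstrip("@"))
    pattern = re.compile(r"\bRT\s+@" + name + r"\b", re.IGNORECASE)
    return bool(pattern.search(tweet_text))


query = retweet_query("@example_user")
print(query)
print(is_retweet_of('RT @example_user "the original tweet"', "example_user"))
```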

(See also the ReTweetist service, which will plot which of your messages have been retweeted, as well as the most popular current retweets.)

Also remember that you can subscribe to an RSS feed of saved searches based on these query types:

Locale Based Searches
Want to know who’s recently been twittering (possibly) from near a particular location? Set the location options in the advanced search form, and run an otherwise empty query (i.e. no search terms in the search box).

So for example:

Now it used to be that you could search people’s biography or location strapline in Twitter, and find people to follow that way (that’s how I found several fellow Isle of Wight twitterers), but that doesn’t seem possible using the “Find People” service at the moment? (And I can’t check to make sure, because the “Find People” service is temporarily stressed (i.e. down) again…)

So here’s a Google hack way round finding Twitterers from a particular location – construct a query of the form:

http://www.google.com/search?q=location+wight+site%3Atwitter.com+-inurl%3Astatus+-intitle%3Awight

This works as follows – look for the search term, on twitter.com (site:twitter.com), but try not to return results from tweets (-inurl:status) or where part of the location appears in the user’s Twitter ID (-intitle:wight). If an individual’s page is indexed when there’s a tweet showing that contains the search term, then you may get the page returned as a result. But more likely you’ll only get results from pages where the search term is always present, such as when it’s part of a person’s bio… In a sense, this is a bit like indexing a fixed set of web search engine indexable, on-page, bio/location meta-data.
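The query construction described above is easy to wrap in a function so you can swap in other locations. This is just a sketch; the `twitter_location_search_url` name is made up, and it assumes a single-word location term, as in the Wight example.

```python
from urllib.parse import urlencode


def twitter_location_search_url(location_term):
    """Google query for Twitter profile pages mentioning a location term."""
    query = (
        f"location {location_term} "
        "site:twitter.com "                # only pages on twitter.com
        "-inurl:status "                   # try to exclude individual tweets
        f"-intitle:{location_term}"        # skip user IDs containing the term
    )
    return "http://www.google.com/search?" + urlencode({"q": query})


wight_url = twitter_location_search_url("wight")
print(wight_url)
```

For the term "wight" this produces the same URL as the hand-built one above.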

[UPDATE: looking at the results preview, if we search for “Location Isle of Wight” we can probably filter the results even further:
“location isle of wight” site:twitter.com -inurl:status -intitle:wight

And as @daveyp suggests, we can also search for institutional allegiance within a profile, eg site:twitter.com -inurl:status -intitle:huddersfield location huddersfield university]

(You can do something similar to stalk people on MySpace.)

For more Twitter search tricks, check out the Twitter advanced search form, or have a creative play in Google;-)