Following on from a the previous posts in which I looked at pulling data tables out of wikipedia into a Google spreadsheet and accessing financial data within Google spreadsheets, in this post I’ll show you some of the things you can do with the Google spreadsheet formula “=GoogleLookup”.
If you’ve ever entered a search such as “capital of brazil“, you might have noticed that the Google web search engine will try to treat your query as a factual question, and answer it as such:
The Google spreadsheet =GoogleLookup(“entity”; “attribute”) formula provides a way of asking this sort of factual query within a Google spreadsheet:
Why’s this useful? One thing it seems to be good for is as a way of doing lots of factual look-ups in one go. How so?
Create a new Google spreadsheet and type the following into cell B2:
=GoogleLookup($A2,B$1)
In cell A2 enter the word France and in cell B1 the word Population.
If you hover over cell A2, now populated with a number corresponding to the population of France, a tooltip will pop up showing the provenance of the data (i.e. where it came from), with a link to the data source.
Select cell A2 (the one containing the formula) and drag it using the small square in the bottom right hand corner of the cell across the row to cell C5. Highlight cells A2 to D2 and drag the small square marker in the bottom right hand corner of cell D2 across to cell D5. The formula will be copied into cells A2 to D5. The “$” symbols in the formula anchor various aspects of the formula so that it will look for the entity in column A of the current row, and the attribute in row 1 of the current column.
If you now add appropriate values into row 1 and column A, the GoogleLookup formula will be lookup the corresponding entity/attribute values (i.e. the corresponding facts).
Although I’ve had a quick look around for a comprehensive directory of allowable Google Lookup attribute values (and the sort of entity they should be paired with), I haven’t managed to find one. So I’ve put together a spreadsheet at http://spreadsheets.google.com/ccc?key=p1rHUqg4g421BB8IrYonydg where I’ll attempt to collate all the allowable lookup attributes I can find, along with appropriate exemplar entity values.
Using the ‘relative lookup formula’ approach shown above, the tables will show attribute values as column headings, and example entities as row headings.
At the moment I’ve started putting together Geography, Business and Chemistry lookup collections together:
In order to do a “bulk fact lookup”, enter a list of row label entity values into a spreadsheet, a set of attribute (fact) values as column headings, enter an appropriate ‘relativised GoogleLookup formula” into the first table data cell, and then drag it across the set of table data cells: voila – a bulk query made in a single shot across multiple entity and attribute values:-)
If you come across any more GoogleLookup attributes that seem to work, post a couple of example entity/attribute pairs in a comment to this post and I’ll add them to the =GoogleLookup formula attribute directory spreadsheet.
PS Looking at the sample spreadsheet just now, I notice that quite a few cells in the table may be unpopulated for particular entity values. So in a read/write web world, it should be possible to enter a value into the cell and have it logged, err, somewhere? Such as in a Google consensual fact engine?! ;-)
Interesting post, unfortunately it seems a little unpredictable, there are fair few unpopulated values and buggy results.
For example =GoogleLookup(“Vietnam”,”population”) results in “253/km² (46th)” which is the population density of Vietnam.
note: http://google.com/search?q=population+of+vietnam returns the correct value
I’ve logged an issue about attribute matching with the google docs team.
By the way, here’s a list of common entities and attributes:
http://documents.google.com/support/bin/answer.py?hl=en&answer=54199#sc1
There you go again- you taught me 2 things about Google Docs- been struggling to figure out out to do fill down and fill right, etc.
I tried to do some mountain elevations- I get results in google for “altitude of Mt Whitney” http://www.google.com/search?hl=en&q=elevation+of+mt+whitney
“altitude of Mt Everest” http://www.google.com/search?hl=en&q=elevation+of+mt+everest
“altitude of Mt McKinley” http://www.google.com/search?hl=en&q=elevation+of+mt+mckinley
but in spreadsheet form only McKinley gets a result (others are #NA)
Also it is not clear that the google search results which return a fact are correlated to the lookup functions. I played with “atomix weight of _______) in a spreadhseet which worked fine for “helium”, “oxygen”, “strontium” but don’t give a direct fact result in a google search.
Fun stuff!
Really impressiv. thanks for sharing.
Re: the Google consensual fact engine… You seen http://www.trueknowledge.com/ ?
@craig yes it is a bit unprdictable; but how neat would it be if the cells were editable (read/writable) if Google was returning N/A, and they allowed users to fill in the gaps?!
@alan thanks for the “altitude” attribute; I’ll add it to the spreadsheet.
@andy re: TrueKnowledge – I *keep* forgetting about that site (ask it a question, and using semantic web magic, it’ll try to give you an answer; I’ve been tracking it for ages, and play with it every so often, but usually only when reminded… There’s also http://freebase.com of course, and maybe a linked data browser out there that will expose related factual data too?