We take the art of geographic lookup for granted these days; type a place name into a form on a web site or feed it into a web service API and hey presto! Most of the time you’ll be told whether or not the place name is valid and, in case there’s more than one place with the same name, either be asked to choose which one you mean or be presented with the most likely place.
Most of the time … but not all of the time.
The hey presto bit of the process seems at first glance to be relatively trivial but isn’t. Just ask anyone who’s had to implement a system that handles place names. In fact, the hey presto part is two discrete processes in their own right. First of all we need to identify a place, or whether indeed there’s a place at all; this is usually called geoidentification.
identify; verb; establish or indicate who or what (someone or something) is
This is the thing that determines that there is a place in “I’m in London today” but not in “I do love Yorkshire Pudding“.
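To make geoidentification a little more concrete, here is a minimal Python sketch. It assumes a tiny, hypothetical set of known place names (`KNOWN_NAMES`) and a hypothetical list of phrases that merely contain a place name (`NON_PLACE_PHRASES`); real systems lean on far larger gazetteers and on linguistic context rather than a hand-written exclusion list.

```python
import re

# Hypothetical toy list of place names; a real gazetteer holds millions.
KNOWN_NAMES = {"london", "ouagadougou", "yorkshire", "ontario"}

# Hypothetical phrases that contain a place name but do not refer to a place.
NON_PLACE_PHRASES = {"yorkshire pudding"}


def identify_places(text):
    """Return the known place names mentioned in text, in order of appearance."""
    lowered = text.lower()
    # Blank out phrases that only look like they mention a place.
    for phrase in NON_PLACE_PHRASES:
        lowered = lowered.replace(phrase, " ")
    words = re.findall(r"[a-z]+", lowered)
    return [word for word in words if word in KNOWN_NAMES]


print(identify_places("I'm in London today"))          # ['london']
print(identify_places("I do love Yorkshire Pudding"))  # []
```

Even this toy version shows why the step is not trivial: the exclusion list has to know about every pudding, terrier and marathon that borrows a place name.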
Once a place has been identified, we need to work out if there’s more than one place of the same name (which is more than likely as we’re stunningly unimaginative where place names are concerned, duplicating and reusing the same name all over the world) and if so, which one. This is usually called geodisambiguation.
disambiguate; verb; remove uncertainty of meaning from (an ambiguous sentence, phrase or other linguistic unit)
Some places are pretty easy to disambiguate; as far as I know there’s only one Ouagadougou and that’s the capital of Burkina Faso. Some places should be easy to disambiguate, at least at first sight; take London, that should be easy. It’s the capital of the United Kingdom. Well that’s true but it could also be the London in Ontario, or the one in Arkansas, in California, in Kentucky or any of the other 22 Londons that I’m aware of.
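The first half of disambiguation, working out whether a name is ambiguous at all, can be sketched in a few lines of Python. The lookup table `PLACES_BY_NAME` is a hypothetical stand-in for a real gazetteer and holds only the places named above.

```python
# Hypothetical toy gazetteer: each lower-cased name maps to the places sharing it.
# Only the places mentioned in the text are listed; a real gazetteer has many more.
PLACES_BY_NAME = {
    "london": [
        ("London", "United Kingdom"),
        ("London", "Ontario, Canada"),
        ("London", "Arkansas, USA"),
        ("London", "California, USA"),
        ("London", "Kentucky, USA"),
    ],
    "ouagadougou": [("Ouagadougou", "Burkina Faso")],
}


def candidates(name):
    """Return every known place that shares the given name."""
    return PLACES_BY_NAME.get(name.strip().lower(), [])


def needs_disambiguation(name):
    """True when more than one known place carries this name."""
    return len(candidates(name)) > 1


print(needs_disambiguation("Ouagadougou"))  # False
print(needs_disambiguation("London"))       # True
```

Deciding which candidate the user actually means is the harder half, and the lookup alone says nothing about that.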
The gentle art of disambiguation is critical to the act of geocoding, geoparsing, geotagging and any of the other words the location industry chooses to tack geo onto as a prefix. Get disambiguation wrong and you fail on two counts.
Firstly, you’re showing your audience that you don’t know or don’t care about what they’re trying to tell you. Secondly, you allow your users the opportunity to specify the same place in a multitude of conflicting ways.
This is part of the problem of GeoBabel; your place is not my place.
So far, so theoretical, but let’s look at a concrete example of this. A few weeks back I added my Twitter account to the Twitter directory site wefollow.com. The first thing you’re asked to do is to supply your location, or to “Type Your City” as wefollow.com phrases it. So I type London and the site starts to attempt to disambiguate on the fly; so do I mean “London, United Kingdom” or “London, Ontario“? But wait, what about the other options?
Which “London” is the one tagged by 436 people but with no indication of which country? What’s the difference between “London, United Kingdom“, “London,UK” and “London England“? Spacing and punctuation, or the lack of them, are obviously important to wefollow.com here. So let’s try and give the system some help and start to type United Kingdom …
Oh dear. The “London, United Kingdom” still shows up but because I’ve put a space in there I don’t get offered “London,UK” anymore but I do get offered the London in the lesser known country of “Uunited Kingdom” and also “London, Ub2“, which one assumes is the UB2 postal code which specifies the London suburb of Southall.
Your place is not my place.
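One way a site might avoid collecting several spellings of the same place is to normalise free text before storing it. The Python sketch below is an illustration only and is not how wefollow.com works; the alias table `COUNTRY_ALIASES` is hypothetical, and folding “England” into “United Kingdom” is a deliberate simplification.

```python
import re

# Hypothetical alias table; a real system would map onto stable place identifiers.
COUNTRY_ALIASES = {
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
    "england": "United Kingdom",  # simplification: England is only part of the UK
}


def normalise(raw):
    """Return (city, canonical country or None) for a free-text location."""
    text = re.sub(r"\s+", " ", raw.strip())
    if "," in text:
        # "City, Qualifier" or "City,Qualifier"
        city, qualifier = (part.strip() for part in text.split(",", 1))
    else:
        # No comma: look for a known alias at the end of the string.
        city, qualifier = text, ""
        lowered = text.lower()
        for alias in COUNTRY_ALIASES:
            if lowered.endswith(" " + alias):
                city, qualifier = text[: -len(alias)].strip(), alias
                break
    return (city.title(), COUNTRY_ALIASES.get(qualifier.lower()))


for variant in ("London, United Kingdom", "London,UK", "London England"):
    print(normalise(variant))  # ('London', 'United Kingdom') each time

print(normalise("London, Uunited Kingdom"))  # ('London', None)
```

The misspelt “Uunited Kingdom” comes back with no country, which is the cue to ask the user rather than to mint yet another variant.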
To be fair, I’m not singling wefollow.com out for attack here; this is just one of many examples of sites that try to use geographic lookup but end up making life difficult for their users (but which London do I pick?) and for themselves (now, how many users in London in the UK do we have?). I’d happily offer to help them; if only I could find any contact information anywhere on the site …
Another Piece Of Bloggage By Gary
Self professed "geek with a life", geo-blogger, geo-talker and geo-tweeter, Gary works in London and Berlin as Director of the Places Registry for Nokia; he's a co-founder of WhereCamp EU, the chair of w3gconf and sits on the W3C POI Working Group and the UK Location User Group. A contributor to the Mapstraction mapping API, Gary speaks and presents at a wide range of conferences and events including Where 2.0, State of the Map, AGI GeoCommunity, Geo-Loco, Social-Loco, GeoMob, the BCS GeoSpatial SG and LocBiz. Writing as regularly as possible on location, place, maps and other facets of geography, Gary blogs at www.vicchi.org and tweets as @vicchi.

When I last looked (while working on this problem at Bing), 20% of UK places shared a name with another place, and about 50% in the USA!
Ranking data to find the best match using traditional methods, such as population or tags, works 90% of the time; bringing in contextual metadata, such as the user’s reverse IP or current map view, can improve on that. Maybe there are opportunities in a PlaceRank algorithm, using a number of data sources, including social data, web crawling, spatial relationships and real time data. A place can be important one day and not the next!
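As a rough illustration of the ranking idea in this comment, the Python sketch below scores candidates by population and, when a location hint such as a map centre is available, penalises distance from it. The scoring formula, the `distance_weight` parameter and the candidate data are all assumptions made for the example; the coordinates and populations are rounded, illustrative figures, and this is not the PlaceRank algorithm the comment imagines.

```python
import math

# Hypothetical candidates; coordinates and populations are rough, illustrative
# figures, not authoritative data.
CANDIDATES = [
    {"name": "London, United Kingdom", "lat": 51.5, "lon": -0.1, "population": 8_000_000},
    {"name": "London, Ontario", "lat": 43.0, "lon": -81.2, "population": 400_000},
    {"name": "London, Kentucky", "lat": 37.1, "lon": -84.1, "population": 8_000},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def score(place, hint=None, distance_weight=1.0):
    """Population score, reduced by distance from an optional (lat, lon) hint."""
    value = math.log10(place["population"])
    if hint is not None:
        distance = haversine_km(place["lat"], place["lon"], hint[0], hint[1])
        value -= distance_weight * distance / 1000
    return value


def rank(candidates, hint=None):
    """Return candidates ordered from most to least likely."""
    return sorted(candidates, key=lambda place: score(place, hint), reverse=True)


print(rank(CANDIDATES)[0]["name"])                       # no context
print(rank(CANDIDATES, hint=(43.7, -79.4))[0]["name"])   # hint near Toronto
```

With no context the largest London comes first; with a hint near Toronto the Ontario one overtakes it, which is the effect that context metadata is meant to have.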
I think the other issue here is which parent geography you use as the ‘discriminator’: London, UK vs. London, Ontario makes sense (‘State’/‘Country’), but what happens when you get two in the same parent? If I choose ‘Newport’, I need ‘County’ to discriminate between ‘Gwent’ and ‘Shropshire’, for instance.
Near me is a village called ‘Newbiggin’. However, there are 3 in my county (North Yorkshire). Streetmap, for instance, lists all of these, but only with county name, so I need to view each on the map before I choose the correct one.
Size/significance, proximity and other context may help. In the end, a small thumbnail map might be the only way to avoid the old ‘try each one’ route.
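The choice of discriminator described in this comment can be sketched in Python as a search for the broadest level of the hierarchy at which every candidate differs. The two-level hierarchy (`country`, then `region` for county, state or province) and the function name are assumptions for illustration; when no level separates the candidates, the sketch returns `None`, which is the point at which a thumbnail map would have to take over.

```python
# Hypothetical two-level hierarchy, broadest first. "region" stands in for
# county, state or province.
LEVELS = ("country", "region")


def discriminator(candidates):
    """Return the broadest level at which all candidates differ, else None."""
    for level in LEVELS:
        values = [place[level] for place in candidates]
        if len(set(values)) == len(values):
            return level
    return None  # no textual label is enough; fall back to showing a map


londons = [
    {"country": "United Kingdom", "region": "Greater London"},
    {"country": "Canada", "region": "Ontario"},
]
newports = [
    {"country": "United Kingdom", "region": "Gwent"},
    {"country": "United Kingdom", "region": "Shropshire"},
]
# Three villages of the same name in the same county.
newbiggins = [{"country": "United Kingdom", "region": "North Yorkshire"}] * 3

print(discriminator(londons))     # country
print(discriminator(newports))    # region
print(discriminator(newbiggins))  # None
```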