Posts Tagged ‘garygale’

Know Your Place; Adding Geographic Intelligence to your Content

Day two of the AGI GeoCommunity conference, and the conference as a whole, has ended. We discussed neogeography, paleogeography and pretty much all points in between, finally agreeing that labels such as these get in the way of the geography itself. I was fortunate enough to have my paper submission accepted and presented a talk on how to Know Your Place at the end of the morning’s geoweb track. The paper is reproduced below and the deck that accompanies it is on SlideShare.

Know Your Place; Adding Geographic Intelligence to your Content


Yahoo! GeoPlanet exposes a geographic ontology of over six million named places, enabling technologies that join users with the most geographically relevant information possible, and forms the heart of the Yahoo! Geo Technologies group’s technology platform.

GeoPlanet uses a unique, language neutral identifier for (nearly) all named places around the world. Each place exists within a graph of other places; the relationships between places are categorised by administrative hierarchy, geographical scope and place type, amongst others.

GeoPlanet’s geodata repository is exposed by publicly available web service platforms that allow places to be identified within content (Yahoo! Placemaker) and investigated by place name or identifier (Yahoo! GeoPlanet). Users are able to navigate rich metadata associated with a place including the place hierarchies and obtain parent, child, belong-to and neighbouring relationships.

For example, a list of first level administrative entities in a given country may be obtained by requesting the list of the children of that country. In a similar manner, the postal codes surrounding a given postal code may be obtained via a request for its neighbours.
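As a minimal sketch of what such requests look like, the snippet below builds the request URLs rather than calling the service. The base URL, the relationship names and the `format` and `appid` parameters follow my recollection of the GeoPlanet v1 REST interface and should be treated as assumptions; `relationship_url` is a hypothetical helper, not part of any Yahoo! library.

```python
from urllib.parse import quote

# Assumed base URL of the GeoPlanet v1 REST service.
BASE_URL = "http://where.yahooapis.com/v1"

# Assumed relationship collection names (note the US spelling of "neighbors").
RELATIONSHIPS = {"parent", "ancestors", "children", "neighbors", "siblings", "belongtos"}


def relationship_url(woeid: int, relationship: str, app_id: str) -> str:
    """Build the URL that lists the places related to `woeid`."""
    if relationship not in RELATIONSHIPS:
        raise ValueError(f"unknown relationship: {relationship}")
    return f"{BASE_URL}/place/{woeid}/{relationship}?format=json&appid={quote(app_id)}"


# 36424 is the identifier quoted later in the paper for the postal town
# of Stratford-upon-Avon; YOUR_APP_ID is a placeholder.
print(relationship_url(36424, "children", "YOUR_APP_ID"))
print(relationship_url(36424, "neighbors", "YOUR_APP_ID"))
```

The same pattern covers both examples in the paragraph above: children for the vertical question, neighbours for the horizontal one.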

The framework for this is uniform and consistent across the globe and facilitates geo-enrichment and geo-identification in a wide range of content, both structured and unstructured.

Place-based Thinking

Traditionally geography has been treated as a purely spatial exercise; this is certainly the case on the internet. Places are specified in terms of their longitude and latitude, and so cities or towns are referenced by the co-ordinate pair that identifies the theoretical or arbitrary centre of the place.

From this it can be seen that everything on the internet which is location related is referenced by a co-ordinate pair that has little relevance to a human but much relevance to a geographer or software which can algorithmically undertake a radius search from a point. Instead of a spatially based approach to location, Yahoo! Geo Technologies take a place based approach.
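For contrast, the purely spatial approach boils down to something like the sketch below: a great-circle (haversine) distance and a filter on radius. The coordinates are approximate and purely illustrative, and the function names are my own.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(centre, radius_km, points):
    """Names of all points within radius_km of centre, nearest first."""
    lat, lon = centre
    hits = [
        (haversine_km(lat, lon, p_lat, p_lon), name)
        for name, (p_lat, p_lon) in points.items()
    ]
    return [name for dist, name in sorted(hits) if dist <= radius_km]


# Approximate, illustrative coordinates only.
CENTRAL_LONDON = (51.5074, -0.1278)
POINTS = {
    "Greenwich": (51.48, 0.0),
    "Stratford-upon-Avon": (52.19, -1.71),
}

print(within_radius(CENTRAL_LONDON, 50, POINTS))
```

The search knows how far apart two points are, but nothing about what the places are or how they relate to one another, which is the gap a place based approach sets out to fill.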

The map above shows a spatially correct map of the central area of the London Underground network similar to those produced up until the early 1930s; in the central area of London the map is compressed due to the close proximity of the lines and their stations.

In 1932 the familiar Tube map, shown below, was produced by Harry Beck in the form of a non geographic linear diagram. Whilst not geographically or spatially correct it is far more accessible and information rich due to Beck’s assumption that people are less concerned with the exact location of a station and more interested in how to change between lines and get to their destination.

We have taken a not dissimilar approach with our repository of named places, where a place can be a monument, a park, a colloquial region such as the Home Counties, a continent or even the Earth. We have taken each of these different place names at all of their differing granularities and given them unique identifiers, called Where On Earth IDs.


The Where On Earth ID is a unique and permanent global identifier, shared publicly via the GeoPlanet and Placemaker API platforms.

They are language neutral, thus the WOEID for London is the same as for Londres, for Londra and for ロンドン, whilst recognising, for the London in the United Kingdom, that London, Central London, Greater London and the City of London are geographically related though separate places.

Their usage ensures that all Yahoo! APIs have the ability to employ geography consistently and globally.

A Global Geographic Ontology

Within our geodata repository we know not only where a place is geographically located, via its centroid, but also how these places relate to each other. This is more than an index of places, it is a geographic ontology of named places, each of which is referenced by a WOEID.

Using the postal town of Stratford-upon-Avon as an example, we can determine the children of a place, its parent, its adjacent places and non administrative or colloquial areas that a place belongs to or is contained within, at the following granularities. 

  • Supernames
  • Continents
  • Countries
  • Counties
  • Regions
  • Neighbourhoods
  • ZIP and Postal Codes
  • Custom Geographies
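A minimal sketch of such a graph of places, held in memory, might look as follows. Only the WOEIDs 36424 and 12696101 come from this paper; every other identifier, the neighbouring town and the colloquial area are hypothetical placeholders, and the `Place` class is illustrative rather than GeoPlanet’s own data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Place:
    woeid: int
    name: str
    place_type: str
    parent: Optional[int] = None                          # vertical link
    neighbours: List[int] = field(default_factory=list)   # horizontal links
    belongs_to: List[int] = field(default_factory=list)   # colloquial areas


def build_graph(places: List[Place]) -> Dict[int, Place]:
    return {p.woeid: p for p in places}


def children(graph: Dict[int, Place], woeid: int) -> List[int]:
    """WOEIDs of every place whose parent is `woeid`."""
    return sorted(p.woeid for p in graph.values() if p.parent == woeid)


def ancestors(graph: Dict[int, Place], woeid: int) -> List[int]:
    """Walk the parent links upwards, nearest ancestor first."""
    chain = []
    current = graph[woeid].parent
    while current is not None:
        chain.append(current)
        current = graph[current].parent
    return chain


GRAPH = build_graph([
    Place(900001, "Warwickshire", "County"),                        # hypothetical ID
    Place(900002, "Shakespeare Country", "Colloquial"),             # hypothetical area and ID
    Place(12696101, "Stratford-on-Avon", "District", parent=900001),
    Place(36424, "Stratford-upon-Avon", "Town", parent=12696101,
          neighbours=[900003], belongs_to=[900002]),
    Place(900003, "Alcester", "Town", parent=12696101,              # hypothetical ID
          neighbours=[36424]),
])

print(ancestors(GRAPH, 36424))
print(children(GRAPH, 12696101))
```

Parent and child links give the vertical hierarchy, whilst the neighbour and belongs-to lists carry the horizontal and colloquial relationships.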

Joining People with Content and Content with People

We can use Placemaker to parse structured and unstructured content and to identify the places referenced, each of which is represented by a WOEID. Where more than one potential place exists for each name, a ranked list of disambiguated names is presented.
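To give a feel for what a ranked list of candidates means, here is a toy gazetteer lookup. It is emphatically not Placemaker’s algorithm: the gazetteer, the weights and the WOEID 900010 are all made up for illustration, and only 36424 comes from this paper.

```python
import re
from typing import Dict, List, Tuple

Candidate = Tuple[int, str, float]  # (WOEID, description, illustrative weight)

# Toy gazetteer; the weights are invented, not derived from real data.
GAZETTEER: Dict[str, List[Candidate]] = {
    "stratford-upon-avon": [(36424, "postal town, Warwickshire", 1.0)],
    "stratford": [
        (36424, "postal town, Warwickshire", 0.6),
        (900010, "district of east London (hypothetical WOEID)", 0.4),
    ],
}


def find_places(text: str) -> Dict[str, List[Candidate]]:
    """Return each matched name with its candidates, best ranked first."""
    found: Dict[str, List[Candidate]] = {}
    lowered = text.lower()
    # Try longer names first so "stratford-upon-avon" wins over "stratford".
    for name in sorted(GAZETTEER, key=len, reverse=True):
        pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
        if re.search(pattern, lowered):
            found[name] = sorted(GAZETTEER[name], key=lambda c: c[2], reverse=True)
            # Blank out the match so shorter names cannot match inside it.
            lowered = re.sub(pattern, " ", lowered)
    return found


print(find_places("The conference was held in Stratford-upon-Avon."))
print(find_places("Trains to Stratford run every few minutes."))
```

An unambiguous name yields a single candidate; an ambiguous one yields several, ordered so that the caller can take the first or present the choice to a user.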

Each of the WOEIDs returned by Placemaker has the notional centroid and the bounding box, described by the South West and North East coordinates, as attributes. This allows the concept of a place to be displayed, such as that for the postal town of Stratford-upon-Avon, as shown below.
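As a sketch of how these attributes might be used, the class below holds a centroid and a South West / North East bounding box, tests whether a point falls inside the box and produces a closed outline suitable for drawing on a map. The coordinates are rough, illustrative values rather than GeoPlanet’s data, `PlaceExtent` is a hypothetical name, and boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass
from typing import List, Tuple

LatLon = Tuple[float, float]


@dataclass(frozen=True)
class PlaceExtent:
    woeid: int
    centroid: LatLon
    south_west: LatLon
    north_east: LatLon

    def contains(self, lat: float, lon: float) -> bool:
        """True if the point lies inside the bounding box (edges included)."""
        (south, west), (north, east) = self.south_west, self.north_east
        return south <= lat <= north and west <= lon <= east

    def outline(self) -> List[LatLon]:
        """Corners of the box as a closed ring: SW, SE, NE, NW, SW."""
        (south, west), (north, east) = self.south_west, self.north_east
        return [(south, west), (south, east), (north, east), (north, west), (south, west)]


# Illustrative values only.
stratford = PlaceExtent(
    woeid=36424,
    centroid=(52.19, -1.71),
    south_west=(52.16, -1.76),
    north_east=(52.22, -1.66),
)

print(stratford.contains(*stratford.centroid))
print(stratford.outline())
```

The notional centroid need not sit at the geometric centre of the box, which is why the two are stored separately.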

For each WOEID, we can use GeoPlanet to determine the vertical relationships of the place, such as which cities are in a country or which postal codes are within a city. We can determine the states, provinces or districts within a country and which countries are on a continent. This powerful vertical hierarchy can be easily navigated from any WOEID.

GeoPlanet also contains a horizontal-like hierarchy, which frequently overlaps. If searching against a specific place such as a postal code, we can determine the surrounding postal codes as well; if searching for a town, we can determine the surrounding postal towns, as shown below.

GeoPlanet contains a rich ontology of named places, which allows us to look up places and where these places are. But more powerful is the relationship between places, which allows users of GeoPlanet to add geographic intelligence to their use cases and applications, browsing the horizontal and vertical hierarchies with ease to discover geographic detail that a point and radius based search could never reveal.

Capturing the World’s Geography as it is Used by the World’s People 

The Oxford English Dictionary, often criticised for capturing transient or contentious terms, states its goal as “to capture the English language as it is used at this time” and not to impose how things are called. In the same vein, our goal is to capture the world’s geography as it is used by the world’s people.

We aim to follow the United Nations and ISO 3166-1 guidelines on the official name for a place but we strive to know the informal, the ethnic and the colloquial. We are less concerned with imposing a formal geography than we are with describing how a place is described today and what its relationship is with its parent, its children and its neighbours.

Thus we recognise that MOMA NYC (WOEIDs 23617044 and 2459115) is used to refer to the Museum of Modern Art in New York, that San Francisco (2487956) is the more commonly used form of The City and County of San Francisco and that the London Eye and the Millennium Wheel are synonymous (WOEID 22475381).
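In code, the idea that several names resolve to one identifier can be sketched as a simple alias table. The WOEIDs are those quoted above; the table itself and the helper names are illustrative, not how GeoPlanet stores its names.

```python
from typing import Optional

# Alias table using the WOEIDs quoted in the paper.
ALIASES = {
    "san francisco": 2487956,
    "the city and county of san francisco": 2487956,
    "london eye": 22475381,
    "millennium wheel": 22475381,
}


def normalise(name: str) -> str:
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def resolve(name: str) -> Optional[int]:
    """WOEID for a known name, or None if the name is not in the table."""
    return ALIASES.get(normalise(name))


def same_place(a: str, b: str) -> bool:
    """True when both names are known and share a WOEID."""
    woeid = resolve(a)
    return woeid is not None and woeid == resolve(b)


print(same_place("London Eye", "Millennium  Wheel"))
print(resolve("The City and County of San Francisco"))
```

Because applications compare identifiers rather than strings, the formal and the colloquial name become interchangeable.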

A Tale of Two Stratfords

Stratford is an important tourist destination, due to the town being William Shakespeare’s birthplace, with both the “on-Avon” and “upon-Avon” suffixes being used to refer to the town. GeoPlanet recognises both Stratford-on-Avon and Stratford-upon-Avon (WOEID 36424) when referring to the postal town and further recognises Stratford-on-Avon (WOEID 12696101) as the administrative District which is the parent for Stratford-upon-Avon.

“the Council often gets asked why there is a difference in using the terms ‘Stratford-on-Avon’ and ‘Stratford-upon-Avon’. Anything to do with the town of Stratford is always referred to as Stratford-upon-Avon. However, as a district council, we cover a much larger area than the town itself, but did not want to lose the instantly recognised tag of Stratford, so anything to do with the district is referred to as Stratford-on-Avon.” 

Appendix A – Data Background

The GeoPlanet geodata repository is derived from a variety of sources: spatial data vendors, openly available sources and Yahoo! sourced data. In raw form, it occupies 25 GB of storage; after automated topology generation and semi automated processing to clean the data and to remove duplicates, the final data footprint is around 9.5 GB. A specialised Editorial team assesses overall data quality and integrity, areas of ambiguity and challenging geographies, such as disputed territories and colloquial areas.

Appendix B – Further Reading

  1. Yahoo! Developer Network – Yahoo! Placemaker
  2. Yahoo! Developer Network – Yahoo! GeoPlanet
  3. The London Tube Map Archive
  4. Transport for London – Design Classic
  5. Yahoo! Developer Network – Where On Earth Identifiers
  6. Oxford English Dictionary – Preface to the Second Edition (1989)
  7. Yahoo! Developer Network – On Naming and Representation
  8. Stratford-on-Avon District Council – Community and Living

Posted via email from Gary’s Posterous

Plenaries, Privacy and Place

Day one of this year’s AGI GeoCommunity conference saw the geoweb track draw a sizeable, if varying, share of the delegate audience; some sessions were crammed tight and reduced to standing room only whilst others had a slightly less cozy but still enthusiastic crowd.

Showing that Steven Feldman, the conference chair, started as he meant to continue, both the introductory plenaries were from people well known in the neogeography end of the geographic spectrum; Peter Batty and Andrew Turner.

Peter started talking about the Geospatial Revolution and about how geo is now mainstream after starting off life as a disruptive technology. He touched on crowdsourcing, neogeography and how geospatial data is really just another data type.

Due to Steven Feldman’s over running welcome plenary, Andrew gave us a view on How Neogeography Killed GIS in record time; talking to an appreciative crowd on place, data, and how neogeographers see GIS professionals (answer: they don’t).

The geoweb track kicked off with Tim Warr, down on the programme as working for Microsoft, announcing “I’m not working for Microsoft as of yesterday” and then promptly launching into a talk on Cloud Computing and GIS; All Hype or Something Useful? He covered the good cloud (accessibility, cost and speed), the bad cloud (security, control and continuity) and the realistic cloud where you don’t put all your clouds in one basket.

I was particularly pleased to see that WOEIDs made their debut at GeoCommunity thanks to Terry Jones and Tom Taylor.

Terry spoke about Using FluidDB for Storage and Location Aware Software Apps. If you haven’t come across FluidDB before, think about it as a wiki database for the web, or as Terry says “Why don’t our architectures let us work with information more flexibly?”; I strongly advise you look into this further and see what potential this platform has. WOEIDs were mentioned to a somewhat bemused audience but with a nice mention of my talk on this topic later today.

Tom took this one step further and gave a well received and insightful talk on the way Flickr are creating crowd sourced neighbourhood definitions from geotagged photos, all tagged with WOEIDs naturally. Tom’s Boundaries microsite shows just how powerful this can be, visualising and displaying neighbourhoods where no official definition exists, such as in London. Tom is a natural evangelist for this sort of data discovery process and caused some wry smiles when he added “I’m not an employee of Flickr or Yahoo! They haven’t paid me to say this”.

I took part in the Privacy: Where Do We Care? panel on location and the implications for privacy which I’ve blogged about earlier.

The day rounded off with a series of soapbox style georants; 15 slides, 20 seconds per slide and with the presenters having no control over the timing. Lots of themes were covered, some serious like Chris Osborne’s ITO World product pitch, some … interesting … like the Pitney Bowes boys’ geojokes, some semi disrespectful like my “Neo this and Paleo that … it’s all just Geo” (which will end up on my SlideShare account as soon as I find a net connection with some bandwidth) and some just rip roaringly hilarious like Ian Painter’s paean to paleogeography which featured Martin Daly, Ed Parsons, Darth Vader and Isaac Newton. All of which were received by an increasingly well lubricated crowd from the soapbox arena, also known as the bar.

Photo credit: myself and Jeremy Morley.


Location and Privacy – Where Do We Care?

As part of this year’s AGI GeoCommunity ’09 conference, I took part in the Privacy: Where Do We Care? panel on location and the implications for privacy with Terry Jones, Audrey Mandela and Ian Broadbent, chaired and overseen by conference chair Steven Feldman.

Our location is probably the single most valuable facet of our online identity, although where I currently am, whilst interesting, is far less valuable and personal than where I’ve been. Where I’ve been, if stored, monitored and analysed, provides a level of insight into my real world activities that transcends the other forms of insight and targeting that are directed at my online activities, such as behavioural and demographic analysis.

Where I’ve been, my location stream if you will, is a convergence of online and real world identity and should not be revealed, ignored or given away without thought and without consent.

In the real world we unconsciously provide differing levels of granularity in our social engagements when we answer the seemingly trivial question “where have you been?”. To our family and close friends we may give a detailed reply … “I was out with colleagues from work at Browns on St. Martin’s Lane, London”, to other friends and colleagues we may give a more circumspect reply … “I was out in the Covent Garden area” and to acquaintances, a more generalised reply … “I was in Central London” or even “mind your own business”.

As with the real world, so we should choose to reveal our location to applications and to companies online with differing levels of granularity, including the ability to be our own source of truth and to conceal ourselves entirely, in other words, to lie about where I am. 
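As a rough sketch of what such graded disclosure could look like in software, the snippet below maps an audience to the coarsest level of detail it is entitled to, and reveals nothing to anyone not explicitly listed. The audience names, the levels and the `reveal` function are all hypothetical; the example replies are those from the paragraphs above.

```python
from typing import List, Optional

# Most precise first, mirroring the "where have you been?" replies.
LOCATION_LEVELS = [
    "Browns on St. Martin's Lane, London",
    "Covent Garden area",
    "Central London",
]

# Hypothetical mapping of audience to how much detail they may see.
AUDIENCE_LEVEL = {"family": 0, "friends": 1, "acquaintances": 2}


def reveal(levels: List[str], audience: str) -> Optional[str]:
    """Location description for this audience, or None for 'mind your own business'."""
    index = AUDIENCE_LEVEL.get(audience)
    if index is None:
        # Opt-in by default: anyone not explicitly listed learns nothing.
        return None
    # Fall back to the coarsest known level if fewer levels are available.
    return levels[min(index, len(levels) - 1)]


for who in ("family", "friends", "acquaintances", "advertiser"):
    print(who, "->", reveal(LOCATION_LEVELS, who))
```

The important design choice is the default: an unknown audience gets nothing, so disclosure is always something I opt into rather than out of.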

Where I am in the real world should be revealed to the online world only on an opt-in basis, carefully considered and with an eye on the value proposition that is being given to me on the basis of revealing my location to a third party. My location is mine and mine alone and I should never have to opt out of revealing where I am and where I’ve been.


The Geo Ice Has Broken

Last night was the icebreaker for the AGI GeoCommunity conference in Stratford-upon-Avon (but not Stratford-on-Avon, oh no, that’s the district not the town you know) and the run up to the conference has started extremely well, with the added bonus for me that John McKerrell of used a quote from one of my decks as the #geocom landing page.

Twitter is abuzz with commentary on what’s happening and who’s going to be doing what, all accompanied by the eponymous #geocom hashtag and everyone’s hoping that the conference lives up to their expectations. As Thierry Gregorious aptly put it on Twitter “#geocom If this feed is producing messages at current rate, will people be glued to their mobiles instead of the presentations?” … we shall see.

The ice breaker dinner well and truly broke the ice and I landed up on a table full of geostrangers and Andrew Turner; as table 24 we put in a rather respectable joint second place in the 100 question quiz, but then crashed and burned to 3rd place after not being nearly accurate enough in the tie-breaker question on precisely when the Berlin Wall came down.

After a surprisingly good dinner, with surprisingly good wine, we sat through a surprising, and intriguing, comedienne who appeared to be the result of a union between Jasper Carrott and Victoria Wood. It was certainly an experience.

Finally everyone headed to the bar where some overworked and entirely good natured bar staff served us geolibations, geolagavulins and geo-gin-and-tonics until the early hours.

And the conference hasn’t even begun yet …
