Posts Tagged: geo


28
Jan 10

What Happens When Geography and Innovation Collide

It’s taken a while but the consultation into opening up the Ordnance Survey’s United Kingdom mapping and geographic data is out and is no doubt being debated, looked at, discussed, pulled apart and opined on. Whilst every Ordnance Survey employee I’ve ever spoken to is utterly in favour of this move there’s still continued resistance to openness, though the gap between the two extremes of FreeOurData and the UK Government’s Cabinet Office is closing and closing fast. Of course, it doesn’t help when the Ordnance Survey asserts rights over the crime maps produced by London’s Metropolitan Police either.

But baby steps, as my friends in the United States often say. One such step is GeoVation, a Wikiword style merging of geography and innovation.

Last year I was approached by the organisers of the GeoVation challenge to be a judge in an endeavour that “allows innovative thinkers and geographic data to come together for social, environmental and economic benefit through the use of geography”. At first glance, it looked like an Ordnance Survey public relations exercise to provide a seed fund to encourage entrepreneurs to use Ordnance Survey data.
But the organisers had good credentials, I knew most of them and respected them and so I actually read the small print. Yes, GeoVation was funded and supported by the Ordnance Survey. Yes, the seed fund pot, some £20K, came from the Ordnance Survey. But using Ordnance Survey data was not obligatory, mandatory or even strongly encouraged. I heard the phrases “what about GeoNames” and “what about OpenStreetMap” enough to accept the offer and become a GeoVation judge. Not everyone thought this was a good idea or saw beyond the Ordnance Survey involvement. It wasn’t just me either; I was joined by Steve Coast, the founder of the crowd-sourced mapping project OpenStreetMap; James Alexander, CEO of Green Thing, the online service that encourages people to lead greener lives; James Cutler, CEO of eMapSite; and the incredibly tall Peter ter Haar from the OS, and we were helped by chairperson Steven Feldman.
There were a lot of submissions and ideas to look through. 380 people signed up, 170 ideas were submitted and almost 70 ventures were formally proposed to be entered into the award. We had some reading to do.
Let’s briefly mention the venture submissions for a moment. They varied. Oh how they varied. It’s unfortunate to say that a 15 minute video submission, a one page submission which doesn’t actually tell you what the venture is and a 20 page submission which still doesn’t tell you what the venture is are unlikely to engage the attention of the judges. But in the end we came up with a shortlist of 9 ventures and descended on the Ondaatje Theatre in London’s Royal Geographical Society for the final showcase. Each venture had 4 minutes to pitch their idea to the judges, followed by brief questions from the judges and from the audience. It doesn’t sound easy and it wasn’t, but each pitch put their heart and soul into it. After all the pitches were over, the judges retired to a back room for plenty of coffee and some animated voting and discussions. After 45 minutes we emerged, blinking, into the light, still friends and still talking to each other.
In first place and walking away with £10K were MaxiMap, a large scale education floor map of the British Isles which helps children understand the geography of where they live.
In second place, accompanied by a fetching gorilla suit, and loping away with £7K were Mission: Explore London, a team of geography addicted teachers, designers and artists who wanted to help children explore the city.
And in third place with £3K was London Blue Plaque Search, dedicated to making the iconic GLC/GLA/LCC/English Heritage blue plaques open to everyone.
After almost 6 months of meeting, discussing, debating and geopontificating GeoVation was finally over. At least for 2010. The challenge and awards will be returning in 2011 with even less Ordnance Survey involvement, though hopefully they’ll still contribute towards the seed fund. And as I seem to be quoted as saying in several places …
“One of the judges, Gary Gale, Director of Engineering for Yahoo! Geo Technologies, said: ‘The standard of entries was fantastic and the scope of them far-reaching and varied. Each of the finalists can and should be proud of getting to the finals and being able to showcase their geo-vision. But in the end, the judges decided that MaxiMap was the one idea that could make the most impact and had the greatest potential.’”
… and I can’t really sum it up better than that.
Photo credit: pomphorhynchus on Flickr
Written and posted from home (51.427051, -0.333344)

19
Jan 10

Is it Great Britain, the United Kingdom, the British Isles or what exactly?

In February 2009 I wrote a post for the Yahoo! Geo Technologies blog about how people outside of the United Kingdom are sometimes confused by the vagaries of how to correctly write street addresses in the UK and by the question of how, if the United Kingdom is a country and England is a country, England can be part of the United Kingdom. Some pointed comments to the original post ensued from the likes of Ed Parsons from Google and Andrew Larcombe from the British Computer Society’s Geospatial Specialist Group.

And so almost a year later I went back and started to research exactly how the United Kingdom, Great Britain and the British Isles are actually put together. It was an educational journey because, even with being born and bred in London, it turned out that even I didn’t fully understand this subject. So I tried to codify it with a variation on The Great British Venn Diagram, which looks something like this:

Let’s start with the easy bit. England, Scotland, Wales and Northern Ireland are constituent countries at an administrative level; they’re shown in yellow on the diagram above.

Great Britain, so named as to distinguish itself from Brittany, is a geographic island which comprises the countries of England, Scotland and Wales.

The United Kingdom is a sovereign state, shown in red, which comprises England, Scotland, Wales and Northern Ireland.

Ireland, also a geographic island, contains the administrative country of Northern Ireland and the sovereign state of the Republic of Ireland or Eire.

So far so good, but what about the Isle of Man and the Channel Islands? Both of these are not part of the United Kingdom, instead they are both Crown Dependencies, shown in purple, and are part of a federacy with the United Kingdom. And a federacy? That’s a type of government where one or more of the member administrative units have more independence than the majority of the member administrative units.

Finally, there’s everything else; those remnants of the British Empire scattered across the globe which enjoy the slightly nondescript appellation of British Overseas Territories (or British Dependent Territories prior to 2002 or Crown Colonies prior to 1981).

To be more precise, these are parts of the British Empire that did not gain independence and that the United Kingdom asserts sovereignty over. They take in Anguilla, Bermuda, British Antarctic Territory, British Indian Ocean Territory, British Virgin Islands, Cayman Islands, Falkland Islands, Gibraltar, Montserrat, Pitcairn Islands, St Helena, Ascension Island and Tristan da Cunha, the Sovereign Base Areas of Akrotiri and Dhekelia and the Turks and Caicos Islands.
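The groupings described above can be modelled as plain sets. What follows is a minimal Python sketch that assumes only the memberships named in this post; the variable and function names are my own for illustration, the Channel Islands are treated as a single entry as the post does, and none of this is an official gazetteer.

```python
from typing import Dict, FrozenSet, List

# Constituent countries, grouped as described in the post.
GREAT_BRITAIN: FrozenSet[str] = frozenset({"England", "Scotland", "Wales"})
UNITED_KINGDOM: FrozenSet[str] = GREAT_BRITAIN | {"Northern Ireland"}

# The island of Ireland holds one UK country and one sovereign state.
IRELAND_ISLAND: FrozenSet[str] = frozenset(
    {"Northern Ireland", "Republic of Ireland"}
)

# Crown Dependencies are not part of the United Kingdom.
CROWN_DEPENDENCIES: FrozenSet[str] = frozenset(
    {"Isle of Man", "Channel Islands"}
)

# Illustrative labels only; insertion order is preserved when iterating.
GROUPINGS: Dict[str, FrozenSet[str]] = {
    "Great Britain": GREAT_BRITAIN,
    "United Kingdom": UNITED_KINGDOM,
    "Ireland (island)": IRELAND_ISLAND,
    "Crown Dependencies": CROWN_DEPENDENCIES,
}


def describe(place: str) -> List[str]:
    """Return the names of every grouping that contains the given place."""
    return [name for name, members in GROUPINGS.items() if place in members]


if __name__ == "__main__":
    for place in ("England", "Northern Ireland", "Isle of Man"):
        print(place, "->", describe(place))
```

Using sets makes the overlaps explicit: Northern Ireland falls inside both the United Kingdom and the island of Ireland, while the Crown Dependencies share no members with the United Kingdom at all.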

Written and posted from the Kempinski Hotel Bristol in Berlin (52.5052405, 13.3280218)

Posted via email from Gary’s Posterous


11
Dec 09

Geographic and Transport Data; a Tale of Capriciousness, Whimsy and Downright Insanity

The industry I work in thrives on data; we consume loads of the stuff and in turn we generate petabytes of it. I’m talking about data in general, not the geographic, mapping or place data that I usually write about.

But the longer I work in the Internet industry the more convinced I become that, as an industry, we need to get our act together. How else to explain the bizarre, rapidly changing and capricious nature of how we gain access to, use, pay, don’t pay and disseminate data?

We’re socially conditioned to assume that free does not equate to good, hence the adage “there’s no such thing as a free lunch”. So stuff that costs is good and stuff that’s free isn’t. But normal rules don’t apply here.

Let’s take geographic data; I’m on home ground here so this should be relatively straightforward.

The proprietary data vendors, Navteq, Tele Atlas and others, charge for their data and limit what you can and can’t do with it. OpenStreetMap on the other hand charges nothing for its data and only places limits on the data to protect the data by way of the Creative Commons Attribution Share Alike license.

So naturally the data you pay for should be good and the data you don’t pay for should be … less than good. Naturally.

Except OpenStreetMap data isn’t less than good. UCL’s Muki Haklay summed this up neatly as “How good is OpenStreetMap? Good enough” at the OpenStreetMap conference in Amsterdam this year. Conversely, the proprietary data vendors don’t always get it right. One data vendor, who will remain anonymous, shipped a release of data with wildly incorrect centroids, the lat/long coordinate which represents the nominal centre of a place, which meant that amongst others, Covent Garden ended up being centred on Holborn Underground Station.

This isn’t an isolated incident.

On the one hand, the City of Vancouver in British Columbia makes its data, all of its data, free and open. On the other hand, the City of Tempe in Arizona decides to charge a “fair approximation of market value” for its data, which as James Fee recently discovered means that you’ll need to cough up $100,000 to use it commercially.

In San Francisco, BART, the Bay Area Rapid Transit, makes its data, which includes train times, freely available and takes a refreshingly prosaic approach to accessibility and licensing.

Getting an API key: Psyche: you don’t need one. We’re opting for “open” without a lot of strings attached. Just follow our simple License Agreement, give our customers good information and don’t hog resources. If that doesn’t work for you, we can certainly manage usage with keys and write more terms and conditions. But who wants that?

Here in the UK, TfL, Transport for London, gives you some data for free but not the train times, and for overground trains the Association of Train Operating Companies values this data at a staggering £27,430 per year.

And elsewhere in the world, other operators are closing down people who want to use this data, in New York, in Berlin, in New South Wales and we can’t really seem to work out who owns the data and whether there’s intellectual property being infringed or a public service being undertaken.

… and don’t even talk about the British postal code data, which was closed, was then going to be opened up but now isn’t. Apparently.

With all the data we consume and emit, we spend a lot of time and effort evangelising APIs and web services that use it. But as an industry we really need to start to act clearly and consistently in order to be taken seriously and in order for the Internet industry to realise the potential that we all think it’s capable of.



19
Nov 09

Location Privacy Issue? I See No Location Privacy Issue

Telematics, the use of GPS and mobile technology within the automotive business, and the Web 2.0, neo and paleo aspects of location have traditionally carved parallel paths, always looking as if they would converge but somehow never quite making enough contact to cross over.

But not any more.

The combination of 3G mobile communications and GPS enabled smart-phones such as the iPhone and the BlackBerry means that one way or another, the Internet and the Web are coming into the car, either in your pocket or into the car itself.

With this in mind, last week I was at the Telematics Munich 2009 conference, which was coincidentally in Munich, giving a talk on some of the challenges we face with location and how the world of telematics can benefit by starting to look at location technologies on the Web.

One of the sessions I sat in on prior to my talk was on the eCall initiative. This is a pan European project to help motorists involved in a collision. A combination of onboard sensors, a GPS unit and a cellular unit detects when an accident has occurred and sends this information to the local emergency services. The idea is that in circumstances where a vehicle’s occupants are unable to call for help, the car can do it for them.

So far, so public spirited and well meaning. But several things immediately stood out.

Firstly, while pitched as a pan European initiative, each member state has an opt out and naturally not all states have signed up to the initiative, including the United Kingdom.

Secondly, eCall is designed to be a secure black box system, but all the talk in Munich was of “monetize eCall offerings by integrating contactless card transactions like road-tolling, eco-tax and easy parking payment” or “how to geo-locate data messages to offer ubiquitous solutions”. In other words, adding value added services on top of a system which is actively able to track you at all times and which you, as the vehicle owner, have limited access to or control over.

But what really stood out was that there was not a single mention of location tracking and of the privacy aspects that this carries with it. Not a single mention. Not from the panel, not from the chair and not from the audience. Once rolled out, eCall as currently designed is pretty much mandatory in all new vehicles. Compare and contrast this with the outraged Daily Mail style diatribe that other, opt in, systems such as Yahoo’s Fire Eagle and Google’s Latitude have attracted.

The convergence of the internet, the web and telematics hasn’t yet happened but it will. It’s also evident that when this happens, the telematics industry may have a painful awakening as the impact of location technologies and the privacy issues they carry pervade into an industry which hasn’t needed to deal with this historically.



16
Nov 09

The (Geo) Data Dichotomy Dilemma

Before Web 2.0, before mashups, before FreeOurData.org.uk and other pleas, before the Internet itself, things used to be so much simpler for geo data. You were either an end user and accessed the data as a map or you were a GIS Professional and accessed the data via a (frequently very expensive and very specialised) Geographical Information System. But now we have geo data, lots of geo data, some of it free, some of it far from free, both in terms of usage and cost and a fundamental problem has replaced the paucity of data.

Everyone wants free, open, high quality geo data and no one wants to pay for it. But it’s not quite that simple.
The recent acquisitions of Tele Atlas and Navteq, the two big global geo data providers, by TomTom and Nokia respectively show the inherent value in owning data. But owning the data isn’t enough any more as the market for licensing the data is a shrinking one, despite the phenomenal growth of the satnav market, both in car and on mobile handsets. Why is the market shrinking? Because no one wants to pay for it, at least directly.
TomTom, primarily a hardware vendor, is diversifying into the software and data market and seems to be concentrating on the PND usage of the data, although we’ve yet to see how the outlay necessary to acquire Tele Atlas, coupled with the overall economic downturn, will affect their overall 2009 earnings. Their Q1 2009 report somewhat dryly notes that “market conditions were challenging” and that “we are making clear progress with the transformation of Tele Atlas into a focused business to business digital content and services production company”. There may be other aspirations at play here but for now at least, the company is keeping quiet.
Nokia, also primarily a hardware vendor in the form of mobile and cellular handsets, are also moving away from their roots and into a wider market, hopefully in an attempt to stop the encroachment of upstarts such as HTC, Apple and RIM into Nokia’s traditionally strong smartphone heartland. Again, Nokia has yet to make a public play into this arena but all the composite elements are in place to enable this to happen.
Taking the opposite route, Google, which started off as a software player, is now moving to being a player in the data market by gathering high quality geo and mapping data under the smokescreen of gathering Street View. This has allowed them to gather sufficient data to supplant Tele Atlas as a data provider, at least in the Continental United States.

All three companies are either making or have the prospect of making determined plays in the location space but all three of them have ways of leveraging the value inherent in their data. Google has their unique users, their search index and a vast amount of advertising inventory; TomTom their satnav customers; Nokia their handset customers, albeit one level removed with the Mobile Network Operators as an uneasy partner and intermediary.
So what of the open data providers? It’s important to remember here that open doesn’t always mean free, it means the ability to create derived works and to use the data in ways that the originator may not have immediately foreseen. True, a lot of open data is free, but even then it’s the Free Software Foundation’s definition of the word.
“Free (software) is a matter of liberty, not price. To understand the concept, you should think of free as in free speech, not as in free beer.”
The poster child of open geo data is OpenStreetMap, the “free editable map of the world”. Founded in 2004 by Steve Coast, OSM has enjoyed phenomenal growth in users and in contributions of data that can be used anywhere and by anyone and which espouses the values of free as in speech and as in beer. As with all community or crowd sourced collaborative projects, OSM’s challenge is to sustain that growth and, once complete coverage of a region is reached, to keep that coverage fresh, current and valid. We’ll leave aside the fact that complete coverage is an extremely subjective concept and means many things to many people.
Traditionally strongest in urban regions, one of OSM’s other key challenges is to match the expectations of their user community who consume that data rather than those who create it. Both internationalisation of the data and expansion out of the urban conurbations will potentially prove challenging in the years to come. That’s not to say OSM isn’t a significant player in this space and the quality of the data, though varying and in some places duplicated, is for the majority of use cases, good enough. This was backed up by research undertaken by Muki Haklay of UCL which answered the perennial question of “how good is OSM data” with a pithy “good enough”.
Attempts to capitalise on and monetize the success and data corpus of OSM through the Venture Capital funded Cloudmade have yet to deliver on the promise, with the exception of a set of APIs, and Cloudmade has announced the loss of their OpenStreetMap Community Ambassadors and the closure of their London office. All of which lends credence to the idea that simply owning the data isn’t enough.
So how to solve the dichotomy of geo data? Everyone wants it but no one’s willing to pay for it with the exception of the big players, the Googles, the Yahoos and the Microsofts of the world and control of the proprietary data sources has centralised into TomTom and Nokia, both of whom are well placed to capitalise on their data assets but who haven’t yet delivered on that promise.
Maybe the answer is twofold. Firstly, develop an open attribution model whereby the provenance of an atom of data can be tagged and preserved; this would remove a lot of the prohibitions on creating derived works, as the original data provenance could still be maintained. Secondly, allow limited usage of proprietary data at varying levels of granularity, accuracy and currency, thus creating a freemium model for the data and stimulating developer involvement in donating data to the community as a whole.
It’s too early to see whether this will come to pass or whether an already tight hold on the data will become tighter still.



12
Nov 09

Forget the Credit Crunch; it’s the Geo Crunch in London

It was a particularly cruel piece of coincidental timing; quite a few of the usual suspects of the London geo scene congregated in Harrogate earlier this week for the one day Where 2.0 Now? conference. I was there representing Yahoo! Geo Technologies, alongside Chris Osborne from Ito World, Ed Parsons from Google, John Fagan from MultiMap/Bing Maps, Harry Wood from Cloudmade, John McKerrell from mapme.at, Steven Feldman from knowwhere and a host of others.

At the same time, back in London, the Credit Crunch was biting hard with the news that Cloudmade were to close their London office. While not officially announced by Cloudmade, both Russ Nelson and Richard Fairhurst reported this on Twitter and several other sources have corroborated this.
As if this wasn’t enough, news also filtered out that the Microsoft-acquired MultiMap was also shedding staff, with Chris Darby and Burak Gürsoy providing the unofficial news on Twitter; again, several others have corroborated this.
Bleak times for the geo scene in London; an observation that Steven Feldman, chair of this year’s AGI GeoCommunity conference noted wryly, whilst in Harrogate.
It’s an oft touted aphorism that a recession is the best time to found a startup; there’s certainly a cadre of very talented, passionate engineers in and around London. Here’s hoping that this is not the end of the story but the beginning of one or more geo startups that were founded in the depths of the Credit Crunch.



11
Nov 09

Have You Noticed That noticin.gs Have Noticed WOEIDs?

While everyone, well almost everyone, was fast asleep in London, Twitter quietly dropped a bombshell into their API announcements mailing list. Their new Trends API will help the service’s users answer the perennial question “what’s going on where I am”.

So far, so geo but Twitter has noticed what I’ve been saying in my talks and accompanying decks for the last two years or so.

We’re using Yahoo!’s Where on Earth IDs (WOEIDs) to name each location that we have information for — we’re doing so because those IDs give not only language-agnostic, but also permanent, stable, and unique identifiers for geographic locations.  For example, San Francisco has a permanent and unique WOEID of 2487956, London has 44418, and the Earth has WOEID 1.

Whilst there have been other uses of WOEIDs in the wild, including Alex Housley’s Total Hotspots, Twitter choosing WOEIDs rather than another of the competing geo-identifiers is a massive credibility boost for the WOEID as a geographic standard for identifying and describing place.
Using WOEIDs to geotag your content, be it Twitter status messages, blog posts or photos, automagically gives you access to an ever increasing range of data and web services that understand WOEIDs as well as those that still only understand longitude and latitude. Long/lat coordinates are an attribute of WOEIDs in case you were wondering. Proof of this is visible in the elegant and oddly addictive game of Noticings.
Noticings is “a game of noticing things about you” jointly created by Tom Taylor. Tom was responsible for Boundaries, the amazing visualisation of Aaron Cope’s Flickr Alpha shapes which allows geographies, such as neighbourhoods, for which no formal definition exists, to be represented and viewed.
Basically you tag Flickr photos with the “noticings” tag and the photo’s location, either from an onboard GPS or on Flickr and then you score points for your photo of something you noticed. Which doesn’t do it justice. The rules are in a constant state of flux but all to the better making it a Mornington Crescent for geotagged photos.
Using WOEIDs as a stable and consistent geoidentifier is the glue that allows such a super-web-mash-up to be created. Flickr uses WOEIDs as a geotagging mechanism, either from the EXIF data embedded in a photo or by dragging and dropping the photo on a Map; these WOEIDs are then exposed via the Flickr API. The same Flickr API can be used to look for photos meeting certain criteria, such as the noticings tag and to discover photos taken in the same location, a fundamental part of Noticings. As Tom puts it …

(WOEIDs and GeoPlanet) gives us the opportunity to use colloquial geography rather than bounding boxes and radial searches and the like. I banged on about this in my talk at the AGI conference recently. I am such a geography bore. Anyway, we couldn’t have built Noticings without it.

For those who like the technical gory details, Tom’s put up an excellent blog post to explain it all.
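To make the Flickr side of this concrete, here is a minimal Python sketch that builds a `flickr.photos.search` request for photos tagged `noticings` within a given WOEID. It is only an illustration of the kind of query described above and not how Noticings itself is implemented; the function name and the placeholder API key are my own assumptions, and the sketch only constructs the URL rather than calling the service.

```python
from urllib.parse import urlencode

# Flickr's REST endpoint; all API methods are selected via the "method" parameter.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def noticings_search_url(api_key: str, woe_id: int) -> str:
    """Build a flickr.photos.search URL for 'noticings' photos in one place.

    The function name is hypothetical; the query parameters are standard
    arguments of Flickr's photo search method.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,       # placeholder; supply your own key
        "tags": "noticings",      # the tag players add to their photos
        "woe_id": woe_id,         # restrict results to a single WOEID
        "format": "json",
        "nojsoncallback": 1,      # ask for plain JSON rather than JSONP
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # 44418 is London's WOEID, as quoted from Twitter's announcement.
    print(noticings_search_url("YOUR_API_KEY", 44418))
```

Because the place is expressed as a WOEID rather than a bounding box, the same call works unchanged for a neighbourhood, a city or a country.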

But it doesn’t stop at photos and Flickr; once you have a WOEID you can pass it to any of the ever growing number of web APIs that know how to handle WOEIDs, such as Yahoo’s GeoPlanet, Placemaker, Fire Eagle and YQL, as well as services that speak long/lat. That’s a lot of services, and the number’s growing. Plus you get access to the horizontal and vertical relationships, parents, children and neighbours that a WOEID has as well as more obscure colloquial geographies, all in multiple languages.

All of which is somewhat apt as I’m writing this in Munich at the back of the Telematics 2009 conference. While Munich is fine for the English speaking world, it’s München in Germany and Monaco di Baviera to the Italians. But it may also be spelt as Muenchen and Munchen if special characters or accents aren’t used. All of these names are simply multiple versions of the same place, and so are mapped to a single WOEID, 676757.
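The many-names-to-one-identifier idea can be sketched in a few lines of Python. This is a toy lookup table that assumes only the five spellings and the WOEID mentioned above; a real service such as GeoPlanet holds millions of names, and the function names here are hypothetical.

```python
import unicodedata
from typing import Optional

MUNICH_WOEID = 676757  # the WOEID quoted in the post


def normalise(name: str) -> str:
    """Fold case, compose accents and collapse whitespace before lookup."""
    composed = unicodedata.normalize("NFC", name)
    return " ".join(composed.casefold().split())


# Toy table: every spelling from the post points at the same identifier.
NAME_TO_WOEID = {
    normalise(name): MUNICH_WOEID
    for name in (
        "Munich",
        "München",
        "Muenchen",
        "Munchen",
        "Monaco di Baviera",
    )
}


def lookup_woeid(name: str) -> Optional[int]:
    """Return the WOEID for a known place name, or None if it is unknown."""
    return NAME_TO_WOEID.get(normalise(name))


if __name__ == "__main__":
    for spelling in ("MÜNCHEN", "muenchen", "Monaco  di Baviera", "Berlin"):
        print(spelling, "->", lookup_woeid(spelling))
```

Whichever language or spelling the content arrives in, everything downstream only has to deal with one stable identifier.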

Now go and notice something.



6
Oct 09

The Future of Web Apps? Bad Wifi, Booth Mobbing, Geo and Lots of Schwag

(This post was originally written for the Yahoo! Developer Network blog and was published there on October 5th; it’s duplicated here for posterity.)

You’re stuck in a room on the first floor of a venue with no natural light, people keep expressing surprise that you’re there, there’s a bizarre voucher system operating for getting a cup of coffee and the free public wifi is holding up far better than the venue’s net connectivity which is buckling under the strain of multiple laptops, iPhones and Androids.

It can only be a tech conference; this one is in London and it’s called FOWA, or the Future of Web Applications to give it its full name and it was held in the rather grand sounding Kensington and Chelsea Town Hall, near High Street Kensington tube station.

There’s a booth with some strangely comfortable sofas and chairs, a purple orchid, loads of purple swag, “geoballs” and a free wifi point called yahooligans. Sitting cozily between the PayPal and Vodafone booths, this has been the home of the Yahoo! Developer Network and Yahoo! Geo Technologies teams for the last 48 hours.

I presented on both days as part of the University Sessions track. On Thursday I talked about “Place not Space; Geo without Maps”, which was somewhat incorrect given that it featured a guest appearance by Google Earth. Using Yahoo! Placemaker, I showed how you could extract places from web content and sanitise the content with YQL. Whilst it would be great if all the web used Yahoo! web services, we need to work with the rest of the world, so I showed how you could use the long/lat metadata returned by Placemaker to drive Google Earth.
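For a flavour of that last step, here is a minimal Python sketch that turns a list of named coordinates into KML, the format Google Earth reads. It is not the code from the talk: the input is assumed to be `(name, latitude, longitude)` tuples that some geoparsing step has already produced, the function name is hypothetical, and the sample coordinates are simply borrowed from the sign-offs of other posts on this page.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

KML_NS = "http://www.opengis.net/kml/2.2"

# Serialise KML elements without a prefix, using a default xmlns instead.
ET.register_namespace("", KML_NS)


def places_to_kml(places: Iterable[Tuple[str, float, float]]) -> str:
    """Render (name, latitude, longitude) tuples as a KML document string."""
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lat, lon in places:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        coords.text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    sample = [
        ("Home", 51.427051, -0.333344),
        ("Kempinski Hotel Bristol, Berlin", 52.5052405, 13.3280218),
    ]
    print(places_to_kml(sample))
```

The one trap worth noting is the coordinate order: KML wants longitude before latitude, the reverse of how most people say it.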

Then on Friday I talked about how “Geocoding and Geoparsing are Easy”; I may have been somewhat economical with the truth. Geocoding isn’t easy and Geoparsing is even less so. This talk showed some of the pitfalls that frustrate us and how we need to model geography in real and colloquial terms and not simply structured and formal terms. Or to put it another way “we can make the internet work better by making it understand how we speak in the real world”.

Both sessions were really well attended, with people standing at the back during the Friday talk, which is a great thing for a speaker to see. FOWA attendees are a very geo-savvy crowd who asked lots of intelligent, challenging and pretty direct questions. There’s nothing I like more than an audience that “gets” a topic.

Back at the booth we were gently but firmly mobbed during break sessions which was pleasantly surprising, given that we were on the first floor. An entirely non-statistical review of the questions we came across on the booth showed three main trends:

  • Tell me about YQL and YUI - they’re really cool
  • Tell me more about this “geo” stuff
  • Is the wifi really this bad?

As an industry we thrive on a strange barter system based around the acquisition and donation of items of branded schwag. We continued this fine tradition with loads of “geoballs” and some much prized YDN screwdrivers. We also thrive on vast amounts of caffeine so it seemed only fair to run a competition with the prize of a coffee machine which resembles the robots that were used in the Fiat “designed by humans, built by machines” ad campaign. To win, all you had to do was guess the number of unique users that hit the Yahoo! UK network on Tuesday September 1st 2009.

Answers ranged from the hugely optimistic “a lot”, to some very precise, yet very wrong, figures, ranging from 20 thousand all the way up to an insane 2.3 billion. The real answer was 24,452,863 users and we were able to unite Raymond Tamblyn of Visa Worldwide with the coffee machine for his answer of 23 million.

And then after 2 days of no natural light, slightly crazed from too much caffeine and throats croaking from too much talking, the booth was dismantled, the purple orchid found a home and we stepped back into the fading daylight and hip shopping area of High Street Kensington and headed home for the weekend and to an internet connection that works.

Lousy wifi seems to be the hallmark of a great web event. Oh the irony.



1
Oct 09

NYC Beware : The Trinity of Geo Is Coming

Ever noticed how you never see some people in the same room together? Various conspiracy theories abound on this theme; that they’re really the same person or that they’re mortal enemies. All complete rubbish of course but maybe there’s some truth in this after all … I’ve never been publicly seen in the same room as Aaron Cope and Tom Coates before.

Benedictus de Spinoza said that nature abhors a vacuum and Heisenberg calculated the critical mass needed for a nuclear reaction so maybe there’s a halfway stage between these two extremes, a geocritical mass if you will.

I really should explain …

When people ask me what it is that I do for Yahoo!, I explain that I help use geography to describe people, places and things.

A rather jovial looking Tom.

People are knowing where users are and the things that are important to them. Fire Eagle, Yahoo’s location brokerage platform allows users to share their location on the web, to update anywhere and to choose what you share and don’t share. Tom is the man behind the creation of Fire Eagle and was responsible for leading the (now defunct) Yahoo! Brickhouse team to produce the best location service there is on the ‘net.

Wherecamp ‘09 in Palo Alto; that’s “geotechnologist and ATM user” Tyler Bell on the left, myself in the middle and  Aaron on the right.

Places are knowing geographic locations and the names of places. That’s the remit of Geo Technologies, my group at Yahoo! and you can see this in the public web service platforms we produce such as GeoPlanet and Placemaker, all linked using the geoidentifier we call WOEIDs.

Things are knowing the geographic context of content. Flickr allows you to geotag your photos using my group’s technology, and in February of this year it broke the amazing 100 million geotagged photo mark. If you’ve seen him speak at Where 2.0, Wherecamp or previous Hack Days, you’ll know that Aaron knows the power of geo and has used it to produce something rather unique and special at Flickr.

Geocritical mass (which doesn’t currently show up in any search engine, so you saw it first here) may well be reached next week in the Millennium Broadway hotel in Times Square, New York, when all three of us will be in the same place, at the same time, for Open Hack NYC: 48 hours of hacking goodness with a generous helping of geo. Who knows what will happen; all I can say is that a trinity of geopeople are coming to NYC and that it’ll be geotastic.



25
Sep 09

Know Your Place; Adding Geographic Intelligence to your Content

Day two of the AGI GeoCommunity conference and the conference as a whole has ended. We discussed neogeography, paleogeography and pretty much all points in between, finally agreeing that labels such as these get in the way of the geography itself. I was fortunate enough to have my paper submission accepted and presented a talk on how to Know Your Place at the end of the morning’s geoweb track. The paper is reproduced below and the deck that accompanies it is on SlideShare.


Know Your Place; Adding Geographic Intelligence to your Content

Abstract

Yahoo! GeoPlanet exposes a geographic ontology of over six million named places, enabling technologies that join users with the most geographically relevant information possible, and forms the heart of the Yahoo! Geo Technologies group’s technology platform.

GeoPlanet uses a unique, language neutral identifier for (nearly) all named places around the world. Each place exists within a graph of other places; the relationships between places are categorised by administrative hierarchy, geographical scope and place type, amongst others.

GeoPlanet’s geodata repository is exposed by publicly available web service platforms that allow places to be identified within content (Yahoo! Placemaker) and investigated by place name or identifier (Yahoo! GeoPlanet). Users are able to navigate rich metadata associated with a place including the place hierarchies and obtain parent, child, belong-to and neighbouring relationships.

For example, a list of first level administrative entities in a given country may be obtained by requesting the list of the children of that country. In a similar manner the surrounding postal codes of a given postal code may be obtained via a request for its neighbours.

The framework for this is uniform and consistent across the globe and facilitates geo-enrichment and geo-identification in a wide range of content, both structured and unstructured.

Place-based Thinking

Traditionally geography has been treated as a purely spatial exercise; this is certainly the case on the internet. Places are specified in terms of their longitude and latitude, and so cities or towns are referenced by the co-ordinate pair that identifies the theoretical or arbitrary centre of the place.

From this it can be seen that everything on the internet which is location related is referenced by a co-ordinate pair that has little relevance to a human but much relevance to a geographer or software which can algorithmically undertake a radius search from a point. Instead of a spatially based approach to location, Yahoo! Geo Technologies take a place based approach.

The map above shows a spatially correct map of the central area of the London Underground network similar to those produced up until the early 1930s; in the central area of London the map is compressed due to the close proximity of the lines and their stations.

In 1932 the familiar Tube map, shown below, was produced by Harry Beck in the form of a non geographic linear diagram. Whilst not geographically or spatially correct it is far more accessible and information rich due to Beck’s assumption that people are less concerned with the exact location of a station and more interested in how to change between lines and get to their destination.

We have taken a not dissimilar approach with our repository of named places, where a place can be a monument, a park, a colloquial region such as the Home Counties, a continent or even the Earth. We have taken each of these different place names at all of their differing granularities and given them unique identifiers, called Where On Earth IDs.

WOEIDs

The Where On Earth ID is a unique and permanent global identifier, shared publicly via the GeoPlanet and Placemaker API platforms.

They are language neutral, thus the WOEID for London is the same as for Londres, for Londra and for ロンドン, whilst recognising, for the London in the United Kingdom, that London, Central London, Greater London and the City of London are geographically related though separate places.

Their usage ensures that all Yahoo! APIs have the ability to employ geography consistently and globally.
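The language neutrality described above can be illustrated with a minimal Python sketch. This is not the GeoPlanet service, only a tiny in-memory name index; the identifier 44418 for London and the helper name `woeid_for` are assumptions made for the illustration.

```python
from typing import Dict, Optional

# Illustrative only: a hand-built name index, not a call to GeoPlanet.
# Assumption: 44418 stands in for the WOEID of London in the United Kingdom.
LONDON = 44418

NAME_INDEX: Dict[str, int] = {
    "london": LONDON,
    "londres": LONDON,
    "londra": LONDON,
    "ロンドン": LONDON,
}


def woeid_for(name: str) -> Optional[int]:
    """Return the identifier for a place name, whichever language it is written in."""
    return NAME_INDEX.get(name.strip().casefold())
```

Whichever spelling a user types, an application holding only the identifier can join content about the same place without caring about the language of the name.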

A Global Geographic Ontology

Within our geodata repository we know not only where a place is geographically located, via its centroid, but also how these places relate to each other. This is more than an index of places, it is a geographic ontology of named places, each of which is referenced by a WOEID.

Using the postal town of Stratford-upon-Avon as an example, we can determine the children of a place, its parent, its adjacent places and non administrative or colloquial areas that a place belongs to or is contained within, at the following granularities. 

  • Supernames
  • Continents
  • Countries
  • Counties
  • Regions
  • Neighbourhoods
  • ZIP and Postal Codes
  • Custom Geographies


Joining People with Content and Content with People

We can use Placemaker to parse structured and unstructured content and to identify the places referenced, each of which is represented by a WOEID. Where more than one potential place exists for each name, a ranked list of disambiguated names is presented.

Each of the WOEIDs returned by Placemaker has the notional centroid and the bounding box, described by the South West and North East coordinates, as attributes. This allows the concept of a place to be displayed, such as that for the postal town of Stratford-upon-Avon, as shown below.
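As a rough sketch of how a bounding box described by its South West and North East corners might be used, the Python below tests whether a point falls inside a place's extent. The class name, its methods and the coordinates are hypothetical; the coordinates only loosely approximate Stratford-upon-Avon and are not values returned by Placemaker.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Extent of a place given by its south-west and north-east corners (degrees)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        # Simple containment test; assumes the box does not cross the antimeridian.
        return self.south <= lat <= self.north and self.west <= lon <= self.east

    def midpoint(self) -> Tuple[float, float]:
        # Geometric middle of the box; a notional centroid need not coincide with it.
        return ((self.south + self.north) / 2, (self.west + self.east) / 2)


# Hypothetical extent, invented for this example.
stratford_box = BoundingBox(south=52.15, west=-1.78, north=52.23, east=-1.64)
```

A box like this conveys the extent of a place rather than a single point, which is what lets the concept of a place be drawn on a map.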

For each WOEID, we can use GeoPlanet to determine the vertical relationships of the place, such as which cities are in a country or which postal codes are within a city. We can determine the states, provinces or districts within a country and which countries are on a continent. This powerful vertical hierarchy can be easily navigated from any WOEID.

GeoPlanet also contains a horizontal-like hierarchy, which frequently overlaps. If searching against a specific place such as a postal code, we can determine the surrounding postal codes as well; if searching for a town, we can determine the surrounding postal towns, as shown below.

GeoPlanet contains a rich ontology of named places, which allows us to look up places and where these places are. But more powerful is the relationship between places, which allows users of GeoPlanet to add geographic intelligence to their use cases and applications, browsing the horizontal and vertical hierarchies with ease to discover geographic detail that a point and radius search alone could not reveal.
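The vertical and horizontal relationships described above can be pictured as a small graph. The Python sketch below is an in-memory model, not the GeoPlanet API: WOEIDs 36424 and 12696101 come from this paper, while the 9xxxxx identifiers, the placeholder place names, the neighbour links and the simplified parent chain are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Place:
    woeid: int
    name: str
    place_type: str
    parent: Optional[int] = None                          # vertical link, one step up
    neighbours: List[int] = field(default_factory=list)   # horizontal links


# Only 36424 and 12696101 are real identifiers quoted in the text;
# everything else here is hypothetical and the hierarchy is simplified.
PLACES: Dict[int, Place] = {p.woeid: p for p in [
    Place(900000, "Example Country", "Country"),
    Place(12696101, "Stratford-on-Avon", "District", parent=900000),
    Place(36424, "Stratford-upon-Avon", "Town", parent=12696101,
          neighbours=[900001, 900002]),
    Place(900001, "Example Town A", "Town", parent=12696101, neighbours=[36424]),
    Place(900002, "Example Town B", "Town", parent=12696101, neighbours=[36424]),
]}


def ancestors(woeid: int) -> List[int]:
    """Walk the vertical hierarchy upwards, nearest parent first."""
    chain: List[int] = []
    current = PLACES[woeid].parent
    while current is not None:
        chain.append(current)
        current = PLACES[current].parent
    return chain


def children(woeid: int) -> List[int]:
    """Return the places one level below the given place, sorted by identifier."""
    return sorted(p.woeid for p in PLACES.values() if p.parent == woeid)


def neighbours(woeid: int) -> List[int]:
    """Return the places adjacent to the given place."""
    return list(PLACES[woeid].neighbours)
```

Asking for `children(12696101)` lists the towns in the district, while `neighbours(36424)` lists the surrounding towns; neither question can be answered from a centroid alone.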

Capturing the World’s Geography as it is Used by the World’s People 

The Oxford English Dictionary, often criticised for capturing transient or contentious terms, states its goal as “to capture the English language as it is used at this time” and not to impose how things are called. In the same vein, our goal is to capture the world’s geography as it is used by the world’s people.

We aim to follow the United Nations and ISO 3166-1 guidelines on the official name for a place but we strive to know the informal, the ethnic and the colloquial. We are less concerned with imposing a formal geography than we are with describing how a place is described today and what its relationship is with its parent, its children and its neighbours.

Thus we recognise that MOMA NYC (WOEIDs 23617044 and 2459115) is used to refer to the Museum of Modern Art in New York, that San Francisco (2487956) is the more commonly used form of The City and County of San Francisco and that the London Eye and the Millennium Wheel are synonymous (WOEID 22475381).

A Tale of Two Stratfords

Stratford is an important tourist destination, due to the town being William Shakespeare’s birthplace, with both the “on-Avon” and “upon-Avon” suffixes being used to refer to the town. GeoPlanet recognises both Stratford-on-Avon and Stratford-upon-Avon (WOEID 36424) when referring to the postal town and further recognises Stratford-on-Avon (WOEID 12696101) as the administrative District which is the parent for Stratford-upon-Avon.

“the Council often gets asked why there is a difference in using the terms ‘Stratford-on-Avon’ and ‘Stratford-upon-Avon’. Anything to do with the town of Stratford is always referred to as Stratford-upon-Avon. However, as a district council, we cover a much larger area than the town itself, but did not want to lose the instantly recognised tag of Stratford, so anything to do with the district is referred to as Stratford-on-Avon.” 
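The two Stratfords make a compact example of disambiguation. The Python sketch below uses the two WOEIDs quoted above; the index structure, the ordering of candidates and the function name `resolve` are assumptions for illustration and do not describe how GeoPlanet or Placemaker rank results.

```python
from typing import Dict, List, Optional, Tuple

Candidate = Tuple[int, str]  # (WOEID, place type)

# WOEIDs are those given in the text; the candidate ordering is an assumption.
ALIASES: Dict[str, List[Candidate]] = {
    "stratford-upon-avon": [(36424, "Town")],
    "stratford-on-avon": [(36424, "Town"), (12696101, "District")],
}


def resolve(name: str, place_type: Optional[str] = None) -> List[Candidate]:
    """Return candidate places for a name, optionally narrowed to one place type."""
    candidates = ALIASES.get(name.strip().casefold(), [])
    if place_type is None:
        return list(candidates)
    return [c for c in candidates if c[1] == place_type]
```

Here `resolve("Stratford-on-Avon")` yields both the town and the district, and passing `place_type="District"` narrows the answer to the administrative area alone.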

Appendix A – Data Background

The GeoPlanet geodata repository is derived from a variety of sources: spatial data vendors, openly available sources and Yahoo!’s own data. In raw form, it occupies 25 GB of storage; after automated topology generation and semi automated processing to clean the data and to remove duplicates, the final data footprint is around 9.5 GB. A specialised Editorial team assesses overall data quality and integrity, areas of ambiguity and challenging geographies, such as disputed territories and colloquial areas.

Appendix B – Further Reading

  1. Yahoo! Developer Network – Yahoo! Placemaker
  2. Yahoo! Developer Network – Yahoo! GeoPlanet
  3. The London Tube Map Archive
  4. Transport for London – Design Classic
  5. Yahoo! Developer Network – Where On Earth Identifiers
  6. Oxford English Dictionary – Preface to the Second Edition (1989)
  7. Yahoo! Developer Network – On Naming and Representation
  8. Stratford-on-Avon District Council – Community and Living
