Last weekend, myself and the rest of the OpenCage team were in Karlsruhe in Germany for the second annual OpenStreetMap State of the Map Europe conference. It was probably one of the best run and most diverse OSM conferences I’ve been to.
For a start the key essential elements for a conference were there; there was plentiful coffee and the wifi was both fast and more importantly, it didn’t die horribly during the conference.
I’d submitted a talk called Geocoding – The Missing Link For OSM? and had been asked to actually give that talk. That was my reason for being at SOTM-EU. But we were also going to soft launch OpenCage Data’s latest offering, a geocoding API that’s powered by OSM and other Open Data and which is built using open source commodity components. That’s the reason Ed and Marc Tobias were also in Karlsruhe.
The first day of the conference was spent in the lobby, drinking lots of the aforementioned coffee and using lots of the aforementioned wifi, while we made last minute tweaks to the API and the accompanying website. By the end of the afternoon, the API was ready, the website worked and my slide deck was finished.
Of all which meant I could enjoy the second day of the conference and actually listen to the talks until 4.30 in the afternoon when I took to the stage and gave this talk, which was filmed and put up on YouTube.
If you prefer to read an account of the talk and the launch of the OpenCage Geocoder, you’ll find my slides and commentary below.
So, hello, I’m Gary and I’m from the Internet. I’m a self-confessed map addict, a geo-technologist and a geographer. I’m Geotechnologist in Residence for Lokku in London. I used to be Director of Global Community Programs for Nokia’s HERE maps and before that I led Yahoo’s Geotechnologies group in the United Kingdom. I’m a founder of the Location Forum, a co-founder of WhereCamp EU, I sit on the Council for the AGI, the UK’s Association for Geographic Information, I’m the chair of the W3G conference and I’m also a Fellow of the Royal Geographical Society.
Most people in this room, I hope, understand that in today’s geospatial world a geocoder is critically important. Most people outside of this room and outside of this industry probably don’t. They just expect stuff on the interwebs and on their phone to work, for their devices to understand not only what they meant to say or type or tap but also where they meant. So I think it’s worth noting why people need to geocode …
They might have data with geospatial context but without coordinates or have data where the coordinates are questionable.
They might want to show their data on a map or store the coordinates in their data and do more than just cache them.
They might have coordinates but not know where those coordinates actually refer to or to easily cluster their data into whatever geographical grouping makes sense for their use case.
All of this … and more … needs a geocoder that works, works well and probably works globally
But enough about why people want to geocode … why do we want to geocode and by “we” I mean Lokku, the company behind OpenCage Data and behind Nestoria
We need a geocoder because Nestoria gets real estate listings. That means properties with, hopefully, a valid address. That data needs to be cleansed, sanitised and shown on a map, either precisely or in the general area if we can’t get a precise, street level geocode.
Nestoria has been doing this for over 8 years, geocoding and indexing up to 10M properties, every day. That’s a lot of geocoding and it needs to happen in areas of the world which aren’t always served by the commercial geocoders that the proprietary map providers offer.
You’d be forgiven for saying “but that screen shot is for Karlsruhe, that’s not difficult, Germany is well mapped and has a sane addressing system”. And you’d be right.
But we also do this for countries like India, which aren’t well mapped and which have a much more … fluid … approach to addressing. In January of this year we were in India and asked some people in Bangalore how would they geocode a batch of a thousand or so addresses.
The answer we got was simple … “Geocode that many addresses? We wouldn’t”. There’s a long running joke in India to effect that the country does has GPS, but it doesn’t stand for Global Positioning System, instead it stands for General Populace System. You look at an address, get to the nearest spot and then ask someone, repeating the process until you reach your destination.
Yet we’re geocoding in Bangalore and in India to the best of our ability to do so.
When it comes to choosing a geocoder, there’s a lot of choices for you to make and the choice you make has to be the right one for you
This is just a small selection of the geocoding services on offer. Some open. Some proprietary. Some free, some paid and some freemium.
All existing geocoding services have weaknesses and limitations
Most offer very limited coverage in emerging markets
Some allow caching or persistence (storing) of geocodes; some don’t
Almost all services severely rate limit or throttle over a 24 hour period
Commercial and/or proprietary services offer paid for plans ranging from $0.001 per query to €31,250.00 per month!
Not all providers allow for commercial use of geocodes
Some services don’t even offer a useable service, but instead permit hosting your own instances
Almost all proprietary services restrict the map canvas you can display geocodes on, forbid commercial use of geocodes or assert ownership of the geocodes or all of these
For Nestoria we had to make a decision and the one we made is why I’m here speaking to you now. We decided not to go with a proprietary geocoder
We decided to build our own geocoder. Or to be more precise our own geocoders. One for each country Nestoria operates in. This was a hard decision to make but the right one. No other geocoding service offered the right combination of coverage, depth, usage rights and many other factors.
So build our own geocoders we did. With open data. From OSM. From Yahoo! GeoPlanet and from other open data sources. These geocoders are running right now, 24×7, geocoding property listings and making Nestoria work.
When I joined Lokku and OpenCage Data in January of this year I took a long hard look at the back end geo technology that Nestoria has and immediately had a lightbulb moment. We should launch a geocoder. And not just a geocoder that uses the Nestoria geocoders, one that uses many open source and open data geocoders and one that offers global coverage, not just in the countries that Nestoria operates in.
For most people geocoding and OSM mean Nominatim. There’s also other geocoding services, including MapQuest’s Open Geocoder which is powered by Nominatim as well as other services such as geocod.io, geo.io and Photon to name but a few. But all of these services are standalone. It’s one geocoder, behind a single API. There should be more than one.
Because if you look at what’s behind the API for the large proprietary geocoders, there is more than one geocoder. There’s many. This is certainly true for the companies I’ve worked for that offer geocoding services … both Yahoo’s and NAVTEQ’s geocoder is really many country and/or language specific geocoders. You hit a single API and it’s fired off to many geocoders based on country or language so the user gets the best answer they can. This isn’t an easy task to achieve and it’s probably one of the reasons why commercial geocoders cost and cost a lot.
But while the proprietary map and geocoding providers battle it out between themselves OSM and open data are being overlooked and ignored. While the proprietary players have now grudgingly admitted that the map in OSM is a competitor to their offerings, the same cannot be said for their view of OSM being a viable opponent in geocoding.
There is a classic gap in the market and this is one that at OpenCage we’re trying to exploit and one which we home the OSM community as a whole will help exploit.
So we’ve taken the Nestoria geocoders, we’ve added in our own instances of Nominatim, of DSTK and of Two Fishes and we’re wrapped this all into a single API which does just what the proprietary players do, we look at the query, fire it off to all of these geocoders and confidence rank the results which we then return.
This is just the start, we plan on adding more open geocoders and more open data in the future as the service grows. If you have a geocoder you think we could or should be using, come and find us and tell us. As well as myself, there’s Ed Freyfogle and Marc Tobias Metten here at SOTM-EU.
It’s called the OpenCage Geocoder and you’ll find it online here. Right now.
OSM isn’t just for the US or for Europe and neither is this geocoder
Reverse geocoding is just as important as forward geocoding. Indeed with the continuing rise in smartphone use, it’s probably not unfair to say that reverse geocoding is just as important as forward geocoding is, if not more so
For now, we’re launching as a beta service, which means this is a free service. After the beta period, there will be pricing levels introduced, but there will always be a free tier and our pricing will be clear, transparent and above all flexible.
In 2011 at WhereCamp EU in Berlin I codified Gary’s Law of Conference failure. Never work with children, animals or live demos. Now it’s time to put that law to the test. This is what the geocoder looks like and does.
The API runs over HTTP or HTTPS. Here’s the API geocoding Karlsruhe and with a format parameter than gives you back JSON.
Or if you prefer XML we can do that too.
Or if you’re already using Google’s geocoder we can return JSON in the format that v3 of the Google API uses.
Or maybe you’d like to see what the return values from the geocoder looks like on a map? We can do that too.
So feel free to sign up and to try the OpenCage Geocoder out
If you find a problem, want help or have suggestions or want to talk to us … you’ll find us on Twitter
Thank you for listening