Posts about ip

You May Have Helped Map The Internet Without Knowing It


According to the Internet Systems Consortium, there are somewhere over 900 million things connected to the internet. This isn't the number of things (computers, mobile phones, tablets) that use the internet, but the number of things that have a public IP address. Maybe by correlating the locations of these public IP addresses you could make a map of the internet?


Almost anything is possible, but the devil's in the details. Firstly you'd need to find all those internet connected things which respond to an ICMP Ping request, which is a technical way of asking something on the internet "are you there?". That's a really big number of things to ask this question of, and it would take a lot of time for just one computer to do.
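To put a rough number on "a lot of time", here's a back-of-the-envelope sketch. The 900 million figure is the one quoted above; the probe rates are purely hypothetical assumptions of mine, as is the function name `scan_days`, and the sum ignores retries, timeouts and rate limits.

```python
# Back-of-the-envelope estimate: how long would one machine need to ask
# every host "are you there?" once, at a steady rate?
HOSTS = 900_000_000          # figure quoted in the post
SECONDS_PER_DAY = 86_400


def scan_days(hosts: int, probes_per_second: float) -> float:
    """Days needed to probe every host once at a constant probe rate."""
    return hosts / probes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical probe rates, chosen only for illustration.
    for rate in (1, 100, 10_000):
        print(f"{rate:>6} probes/s -> {scan_days(HOSTS, rate):,.1f} days")
```

At one probe per second a single machine would be at it for decades; even at thousands of probes per second it's still a matter of days rather than minutes.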

But a researcher tried to do this and in preliminary research found out that an awfully large number of these internet connected things were servers running some version of UNIX, and a scarily large number of these also had a root account with a password of root or admin, or even no password at all. The root account is a superuser or administrator account on a UNIX system; if you can log in with this account you have total control of a UNIX machine.

This is where things get technically interesting, legally dubious and morally questionable in pretty much equal measure.

The, so far anonymous, researcher wrote a small piece of code that could do three things. Firstly, run a scan of a very small subset of those 900 million odd connected things. Secondly, make a copy of itself on another of those connected things which was running UNIX and which had a wide open root account. Thirdly, keep that copy small and unnoticeable, so that it didn't consume too much system resource or bandwidth, and have it delete itself after it had finished.

This is what's known as a botnet, and this botnet mapped the internet and vanished once it was done. At its peak, there were over 420,000 servers unwittingly participating in this map making endeavour. You may even have contributed to the map without being aware of it. If you know that you have a wide open UNIX server you probably did, and you should run, not walk, and lock down your server right now.
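As a rough illustration of why spreading the work out matters, here's the arithmetic for an even split. The two headline numbers are the ones quoted in this post; the even split, the rate of one probe per second and the function names are my own assumptions, not a description of how the census actually divided its work.

```python
# Rough arithmetic only: an even split of the quoted host count across the
# quoted number of machines. The real division of work is not described here.
HOSTS = 900_000_000   # figure quoted in the post
WORKERS = 420_000     # peak number of participating servers quoted in the post


def share_per_worker(hosts: int, workers: int) -> int:
    """Largest number of hosts any one worker handles in an even split."""
    return -(-hosts // workers)  # integer ceiling division


def minutes_per_worker(hosts: int, workers: int, probes_per_second: float) -> float:
    """Minutes for the busiest worker to probe its share once."""
    return share_per_worker(hosts, workers) / probes_per_second / 60


if __name__ == "__main__":
    share = share_per_worker(HOSTS, WORKERS)
    # One probe per second is a hypothetical, deliberately gentle rate.
    print(f"{share:,} hosts each, about "
          f"{minutes_per_worker(HOSTS, WORKERS, 1):.0f} minutes at 1 probe/s")
```

Each machine ends up with a couple of thousand hosts to ask, which is why each copy could stay small and unnoticeable.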

As a map, the Internet Census 2012 map is interesting. As a piece of technology, the map's origins are fascinating. You can also see why the researcher who did this chose to remain utterly anonymous, though I have to wonder how long his anonymity will last.

Geographic and Transport Data; a Tale of Capriciousness, Whimsy and Downright Insanity


The industry I work in thrives on data; we consume loads of the stuff and in turn we generate petabytes of it. I'm talking about data in general, not the geographic, mapping or place data that I usually write about.

But the longer I work in the Internet industry the more convinced I become that, as an industry, we need to get our act together. How else to explain the bizarre, rapidly changing and capricious nature of how we gain access to, use, pay, don't pay and disseminate data?

We're socially conditioned to assume that free does not equate to good, hence the adage "there's no such thing as a free lunch". So stuff that costs is good and stuff that's free isn't. But normal rules don't apply here.

Let's take geographic data; I'm on home ground here so this should be relatively straightforward.

The proprietary data vendors, Navteq, TeleAtlas and others, charge for their data and limit what you can and can't do with it. OpenStreetMap, on the other hand, charges nothing for its data and only places limits on the data to protect it, by way of the Creative Commons Attribution Share Alike license.

So naturally the data you pay for should be good and the data you don't pay for should be ... less than good. Naturally.

Except OpenStreetMap data isn't less than good. UCL's Muki Haklay summed this up neatly as "How good is OpenStreetMap? Good enough" at the OpenStreetMap conference in Amsterdam this year. Conversely, the proprietary data vendors don't always get it right. One data vendor, who will remain anonymous, shipped a release of data with wildly incorrect centroids (the lat/long coordinate which represents the nominal centre of a place), which meant that, amongst others, Covent Garden ended up being centred on Holborn Underground Station. This isn't an isolated incident.

On the one hand, the City of Vancouver in British Columbia makes its data, all of its data, free and open. On the other hand, the City of Tempe in Arizona decides to charge a "fair approximation of market value" for its data, which as James Fee recently discovered means that you'll need to cough up $100,000 to use it commercially.

In San Francisco, BART, the Bay Area Rapid Transit, makes its data, which includes train times, freely available and takes a refreshingly prosaic approach to accessibility and licensing:

"Getting an API key: Psyche: you don't need one. We're opting for "open" without a lot of strings attached. Just follow our simple License Agreement, give our customers good information and don't hog resources. If that doesn't work for you, we can certainly manage usage with keys and write more terms and conditions. But who wants that?"

Here in the UK, TfL, Transport for London, gives you some data for free but not the train times, and for overground trains the Association of Train Operating Companies values this data at a staggering £27,430 per year.

And elsewhere in the world, other operators are closing down people who want to use this data, in New York, in Berlin, in New South Wales, and we can't really seem to work out who owns the data and whether there's intellectual property being infringed or a public service being undertaken.

... and don't even talk about the British postal code data, which was closed, was then going to be opened up but now isn't. Apparently.

With all the data we consume and emit, we spend a lot of time and effort evangelising APIs and web services that use it. But as an industry we really need to start to act clearly and consistently in order to be taken seriously and in order for the Internet industry to realise the potential that we all think it's capable of.

Posted via email from Gary's Posterous