Posts about robots

Of Robots And Teapots; Web Geeks Are Not Without A Sense Of Humour

There's a line from the first Matrix movie, the only really good one out of the trilogy, where Morpheus says earnestly to Neo ... fate, it seems, is not without a sense of irony. It's time to add a corollary to this quote, along the lines of web geeks, it seems, are not without a sense of humour.

Last year, it was the web geeks who run the web servers for Yelp and Last.fm sticking Asimov's Three Laws of Robotics into their respective sites' robots.txt files. Sadly, it looks like Yelp's robots.txt is now unfunny and businesslike, but Last.fm's subversion of this file is still there.

$ curl www.last.fm/robots.txt
User-Agent: *
Disallow: /music?
Disallow: /widgets/radio?

Disallow: /harming/humans
Disallow: /ignoring/human/orders
Disallow: /harm/to/self

But it seems like the BBC's web geeks are also not without a sense of humour. Earlier today, I happened across one of the more bizarre HTTP status codes out there on the interwebs. Not your usual HTTP 200 OK or HTTP 404 Not Found, but HTTP 418 I'm A Teapot ...

418 I'm a teapot (RFC 2324) This code was defined in 1998 as one of the traditional IETF April Fools' jokes, in RFC 2324, Hyper Text Coffee Pot Control Protocol, and is not expected to be implemented by actual HTTP servers. However, known implementations do exist.

No, really. Take a look at the BBC's CBeebies site if you don't believe me.

The page even returns an HTTP 418 status.
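If you want to see what a 418 looks like from the client's side without poking the BBC, here's a minimal sketch in Python. Everything in it is my own invention for illustration: the `TeapotHandler` class, the `fetch_status` and `demo` helpers and the throwaway local server are assumptions, not anything the BBC runs. It starts a tiny server on a random local port that answers every GET with 418, then fetches the page and reports the status code.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class TeapotHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with 418."""

    def do_GET(self):
        body = b"I'm a teapot\n"
        self.send_response(418, "I'm a teapot")
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status(url):
    """Return the HTTP status code for url, including error statuses."""
    # An empty ProxyHandler makes sure the local request bypasses any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx and 5xx responses; the code is on the exception.
        return error.code


def demo():
    """Start the teapot server on a free local port and fetch one page."""
    server = HTTPServer(("127.0.0.1", 0), TeapotHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch_status(f"http://127.0.0.1:{port}/")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Note that `urllib` treats any 4xx response as an error, which is why the status has to be read from the exception and not from a normal response object.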

Thanks are due to fellow ex-Yahoo! David Overton for pointing this gem out. Honourable second place mention is also due to HTTP 420 Enhance Your Calm.

420 Enhance Your Calm (Twitter) Returned by the Twitter Search and Trends API when the client is being rate limited. Likely a reference to this number's association with marijuana. Other services may wish to implement the 429 Too Many Requests response code instead. The phrase "Enhance Your Calm" is a reference to Demolition Man. In the film, Sylvester Stallone's character John Spartan is a hot-head in a generally more subdued future, and is regularly told to "Enhance your calm" rather than a more common phrase like "calm down".

... web geeks, it seems, are not without a sense of humour.

Creative Use Of Robots

I'm not talking about vaguely human looking machines here, the sort that crop up in Forbidden Planet and Lost In Space, waving their metal arms and saying things like "Danger Will Robinson". What I'm talking about is a small file called robots.txt.

T is for Tofu Robot

If you run your own web server you probably have one of these. It tells the web robots sent out by the search engines, such as Google, Yahoo! and Bing, which pages on your web site they should and shouldn't crawl, and so which pages end up indexed and searchable. This doesn't stop anyone from viewing those pages; it just asks that they be kept out of search results.
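To see how a well-behaved robot might read such a file, here's a small sketch using the `urllib.robotparser` module from Python's standard library. The rules are the Last.fm ones quoted on this page; the `ExampleBot` user agent, the `build_parser` helper and the `/about` path are made up for illustration, and real crawlers may interpret the rules differently.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the Last.fm excerpt quoted on this page.
LAST_FM_RULES = """\
User-Agent: *
Disallow: /music?
Disallow: /harming/humans
Disallow: /ignoring/human/orders
Disallow: /harm/to/self
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(LAST_FM_RULES)

# "ExampleBot" is an invented user agent; the "*" rules apply to it.
print(parser.can_fetch("ExampleBot", "http://www.last.fm/harming/humans"))  # False
print(parser.can_fetch("ExampleBot", "http://www.last.fm/about"))  # True
```

The file is purely advisory: nothing stops a robot, or a person, from requesting a disallowed path anyway.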

Most of the time, a web site's robots.txt file contains stuff that is only of interest to the owner of the site and to people who specialise in getting the content of your web site to figure prominently in search engines. But sometimes, if you're willing to poke around a bit, they contain hidden gems, like a job advert for one of those aforementioned web search specialists, hidden in the UK Daily Mail's robots.txt file.

User-agent: *
Disallow: /tvshowbiz/tvlistings/
Disallow: /home/ireland/
Disallow: /home/scotland/

# August 12th, MailOnline are looking for a talented
# SEO Manager so if you found this then you're the kind
# of techie we need!
# Send your CV to holly dot ward at
# mailonline dot co dot uk
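Gems like this live in the comment lines, which robots ignore. Here's a rough sketch of how you might pull just the comments out of a robots.txt file; the `extract_comments` function is my own hypothetical helper, and the sample text is the Daily Mail excerpt above.

```python
from typing import List

# The Daily Mail excerpt quoted above.
DAILY_MAIL_EXCERPT = """\
User-agent: *
Disallow: /tvshowbiz/tvlistings/
Disallow: /home/ireland/
Disallow: /home/scotland/

# August 12th, MailOnline are looking for a talented
# SEO Manager so if you found this then you're the kind
# of techie we need!
# Send your CV to holly dot ward at
# mailonline dot co dot uk
"""


def extract_comments(robots_txt: str) -> List[str]:
    """Return the text of every non-empty '#' comment, in file order."""
    comments = []
    for line in robots_txt.splitlines():
        # Everything after the first '#' on a line is a comment.
        _, marker, comment = line.partition("#")
        if marker and comment.strip():
            comments.append(comment.strip())
    return comments


if __name__ == "__main__":
    for comment in extract_comments(DAILY_MAIL_EXCERPT):
        print(comment)
```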

Sometimes the line between a metal robot and the robots.txt file gets blurred; yelp.com's robots.txt file starts with the famous Three Laws Of Robotics that Isaac Asimov wrote about back in 1942.

#
# 1. A robot may not injure a human being or, through
#     inaction, allow a human being to come to harm.
#
# 2. A robot must obey orders given it by human beings
#     except where such orders would conflict with the
#     First Law.
#
# 3. A robot must protect its own existence as long as
#     such protection does not conflict with the First or
#     Second Law.

User-agent: *
Disallow: /advertise?
Disallow: /biz_share?

Social music site last.fm has taken this one step further and represents Asimov's laws as faked URLs on their web site.

User-Agent: *
Disallow: /music?
Disallow: /harming/humans
Disallow: /ignoring/human/orders
Disallow: /harm/to/self

And finally, in a particularly creative use of robots.txt, Scottish whisky brand Whyte & Mackay have hidden a giveaway promotion in their file.

An SEO tip of the hat must go to Tyler Bell for spotting the robots.txt file on yelp.com and Malcolm Coles for the Daily Mail and Whyte & Mackay robots.txt gems. Photo Credits: Don Solo on Flickr.