Grepping And Grokking The Etymology Of Grep
I've been thinking a lot about the etymology of place names recently. That's a slightly verbose way of saying that I've been thinking about the origin of place names and where they come from. Take London for example. That's pretty easy as most sources of information seem to agree that London derives from Londinium, the name of the Roman settlement from which the modern metropolis of London grew.
Then there's Teddington, the town on the River Thames at the upstream limit of the Tideway, where I currently live. Some people believe that the name derives from Tide's End Town; Rudyard Kipling was one of the people who subscribed to this version of the name's origin. Scholars though tend to believe that the town was named after a Saxon leader, called either Todyngton or Tutington, which morphed into the modern day name over the centuries.
All well and good, but this sort of debate over the origin of a name continues even today, and in a much geekier vein. To paraphrase John Cleese in Monty Python's Cheese Shop sketch, I was perusing the internet the other day and came across a discussion of the origins of the UNIX command grep. If you know your UNIX command line, you'll probably know that grep is the tool you use to search inside text files. Indeed, just as Robert Heinlein's grok has become part of today's technical culture as a synonym for understand, so grep has become a synonym for search ... I'm just grepping for the time the restaurant opens.
If you'd asked me last week how grep got its name, I'd have said with high confidence that it's an acronym for General Regular Expression Parser, G .. R .. E .. P, grep. But Mike Burns over at Giant Robot offers up an alternate etymology, albeit a rather contrived one to my mind: that the name originated from the command used to search for text within the ed text editor. Thus, when looking for the regular expression "re", you'd issue the command g/re/p. All of which looks nice and convenient but only works when you're looking for the string "re", which isn't that common an event when you think about it.
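For the curious, here's a rough Python sketch of what an ed command of the form g/re/p does: g applies the command to every line in the buffer, the text between the slashes is the pattern to look for, and p prints each line that matches. This is only an illustration, assuming the re between the slashes can stand for any regular expression; the function name global_regex_print and the sample buffer are made up for the example, and Python's regular expression syntax is not identical to ed's.

```python
import re


def global_regex_print(pattern, lines):
    """Return every line containing a match for pattern, roughly like ed's g/re/p.

    Hypothetical helper for illustration only; it uses Python's re syntax,
    which differs in places from the regular expressions ed understands.
    """
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# A made-up three-line "buffer" standing in for the file being edited.
buffer = [
    "London derives from Londinium",
    "Teddington sits on the Thames",
    "grep searches inside text files",
]

# The "p" step: print each matching line.
for line in global_regex_print(r"T[a-z]+ton", buffer):
    print(line)
```

Swap in the literal pattern "re" and you get the exact command from the etymology above; with this buffer it would pick out only the line that mentions grep.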
A bit of background research yields even more versions of how grep got its name. John Barry's book Technobabble offers up a whole slew of alternatives.
- The November 1990 issue of the SunTech Journal states that grep is an acronym for Get Regular Expression and Print.
- The December 1985 issue of UNIX World thinks that it's really Globally search for a Regular Expression and Print.
- A technical writer at Hewlett-Packard offers the alternative of Generalized Regular Expression Parser.
- An Introduction to Berkeley UNIX disagrees; it's Generalized Regular Expression Pattern.
- Don Libes and Sandy Ressler in Life With UNIX think it's Global Regular Expression Print.
- And finally, the authors of UNIX For People prefer the definition as Global Regular Expression or Pattern.
Add those six to the two above and that's eight conflicting definitions.
And the point of all of this etymological meandering? Well, today's internet community prides itself on being the ultimate source of information in today's society. Yet I find it deliciously ironic that we can pretty much agree on the origins of place names dating from Roman and Saxon times but can't agree on the origin of a UNIX command that was created on March 3rd, 1973. The irony becomes even deeper when you consider that UNIX systems formed the backbone of the early internet and World Wide Web, and that a substantial proportion of the servers on the net today still run UNIX, and thus still run the grep command.
Photo Credits: Danny Howard on Flickr.