Harvesting Your Digital Dandruff, Crumbs and Footprints for Fun and Profit

"I'm just a face in the crowd, Nothing to worry about, Not even tryin' to stand out, And I have nothing to say, It's all been taken away, I just behave and obey"

Trent Reznor, Nine Inch Nails, Getting Smaller

Ten years ago our online identity, if we had one at all, was a simple affair to manage, comprising an email address and perhaps an avatar name or two. Fast forward to the close of the first decade of the 21st century and it's altogether more complex. You've probably got several email addresses, possibly some domain names, and then there's the plethora of social networking sites that you frequent: Twitter, Facebook, LinkedIn, Bebo, MySpace and so on. All of these define the online version of "you" in much the same way as your passport, driving licence and bank account define the offline "you".

The key difference is that the online version of "you" is much more subtle, complex and diffuse. We leave scraps of our path through the internet behind us. At the Being Digital conference in London earlier this year, I tried to explain this with the clumsy phrase "digital dandruff"; in the soon-to-be-published book, "My Digital Footprint", Tony Fish far more elegantly describes it as our digital footprint, which is "the digital 'cookie crumbs' that we all leave when we use some form of digital service, application, appliance, object or device, or in some cases as we pass through or by".

Managing our digital identity through those sources we know about is a challenge for a significant percentage of the online population. It is achievable if you're willing to put enough time and effort into it, but most of us don't have the time or are unwilling to put in the effort. So our digital cookie crumbs and the varying online versions of "us" stay online, ready for someone with the time and inclination to search for, find and put together with profit in mind.

Some people take an active role in managing their digital footprint and try to exploit it. Some people also try to exploit other people's digital footprints. Let's look at a concrete example of this.

Not Your Average Star Trek Reference

garygale.com Screen Grab

My site at garygale.com pulls together a subset of my digital footprint into one place, drawing on my blog, my social bookmarks on Delicious, articles I've written, photos from Flickr and presentation decks from talks I've given. Inspired by an article written by the Yahoo! Developer Network's Christian Heilmann, garygale.com uses PHP and YQL to dynamically pull in the latest version of all my content, so my site is always up to date.
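
As a rough illustration of the aggregation idea (not the site's actual PHP and YQL code), here is a minimal Python sketch that merges items from several sources and keeps the newest ones. The `Item` class, the `merge_latest` function and the sample data are all hypothetical; a real site would fetch these items from each service rather than hard-code them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    source: str      # hypothetical label such as "blog" or "flickr"
    title: str
    published: date


def merge_latest(feeds, limit=5):
    """Flatten the per-source item lists and return the newest items first."""
    items = [item for feed in feeds.values() for item in feed]
    return sorted(items, key=lambda i: i.published, reverse=True)[:limit]


# Made-up sample data standing in for what each service would return.
feeds = {
    "blog": [Item("blog", "Example post", date(2009, 9, 1))],
    "delicious": [Item("delicious", "Example bookmark", date(2009, 9, 3))],
    "flickr": [Item("flickr", "Example photo", date(2009, 8, 20))],
}

latest = merge_latest(feeds, limit=2)
for item in latest:
    print(f"{item.published} [{item.source}] {item.title}")
```

The point of the sketch is that the owner decides which sources go into `feeds` and how the result is presented, which is exactly the control a third-party crawler does not offer.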

Spock.com Screen Grab

Now compare and contrast this information with that available on Spock.com, "the first search engine for finding people on the web". It's not as complete as my version, nor as coherently formatted, but the key facets of my digital footprint are there. If I wanted to, I could add to this digital portrait, supplying tags, biographical information, pictures, quotes and so on.

Spock has crawled the web for my data and it's created a profile on me, without my permission and without my control. It encourages me to enrich the data held but then requires payment for me to access that information. Now would be a good time to point out that in April 2009, Spock was acquired by Intelius, a company that provides background checks and identity theft protection.

Those Who Fail to Learn from History Are Doomed to Repeat It?

Can I stop Spock finding and presenting this information about me, without my request or, more importantly, without my control? Spock's help page says the following:

"Before requesting removal, please make sure the original source of the information Spock found for you has been removed or made private (MySpace, blog, Friendster, etc). This will prevent you from being re-indexed on the site."

This means that unless I contact every source that Spock crawls (and not all sources are identified on Spock's site) and have each one take down its content about me or make it private, Spock will crawl these sources again, find my content and republish it. An evident parallel to this Web 2.0 behaviour is the Web 1.0 problem of large-scale harvesting of email addresses for subsequent resale to commercial spammers.

My site speaks for me because I control the information and the way in which it's presented; Spock's version of me is out of my control and doesn't speak for me.

"What I do know is that neither the privacy advocates nor the aggressive marketers who want to know all about me - let alone the government that thinks my life should be an open book - can speak for me. I want to make my own decisions about what I disclose, knowing all the while that I cannot control what others say about me."

Esther Dyson

In "My Digital Footprint", Tony Fish describes a Rainbow of Trust, which places people's online behaviour into one of five categories: Untrusting and Stupid, Untrusting and Wise, Accepting Authority, One Way or My Way.

Untrusting and Stupid give up data without any thought as to the consequences; their online participation is passive and they will click on anything, including banners and search ads.

Untrusting and Wise are the opposite of Untrusting and Stupid; they are extremely selective about the information they reveal, concerned about privacy and frequently hide their identity behind multiple digital personas.

Accepting Authority still have their computer's default home page set (Yahoo!, MSN, AOL, etc.) and are either happy with a portal approach to their online experience or are unwilling or unable to change it. Their digital experience has to work first time, be simple and work with one click.

One Way experiment with one thing at a time, continuing until they're happy with it and then moving on to another online service.

My Way want it their way, un-tethered, un-filtered and unadulterated, trusting no one until they have mastered it and push the boundaries of what's possible online.

The readers of this article will (hopefully) fall within a combination of Untrusting and Wise and My Way, but the reality is that we are only a small percentage of the global population with access to the Internet, which, as of March 2009, numbered around 1,500,000,000.

Two Cultures; Those Who Understand Tech and The Rest of Us

Mentoring programs such as DigitAll go some way to help inform people about their usage of the internet: not only how to use it, but how to use it responsibly and knowledgeably. At this year's OpenTech in July at the University of London Union, technology critic Bill Thompson lamented the Two Cultures problem: people who understand technology and everyone else. As an illustration of this, he highlighted how the UK education syllabus places more emphasis on "the ability to format text in Microsoft Word" than on understanding how to use the net and how to identify and protect your digital identity. Until digital dandruff, crumbs and footprints become an integral part of our children's education, we all have a responsibility to understand what is being done with our personal data and to pass this on to our colleagues, our friends and our family.

Gary Gale

I'm Gary ... a Husband, Father, CTO at Kamma, geotechnologist, map geek, coffee addict, Sci-fi fan, UNIX and Mac user