Deleting Content on the Web

Filed as Guides on January 26, 2009 10:05 am

There’s an old saying that once something has been uploaded to the Internet, it cannot be truly deleted. The nature of the Web, where content is copied and pasted constantly, makes it impossible, at least in theory, to actually remove any work added to it, no matter how hard one may try.

As true as that may be, what happens if you decide you want to pack up and leave the Web altogether? What if you aren’t comfortable having a Web site, blog, Flickr account or anything else in your name? Perhaps it’s privacy concerns that bother you, a change of heart about what’s important in life or just a wish to have a fresh start. Either way, what happens after you hit “delete” and say goodbye?

As it turns out, the answer isn’t as simple as many think. Removing your content from the Web is not as easy as canceling your accounts, nor is it completely impossible. Much of it depends on the type of content you’ve produced, where you’ve placed it and how the public has responded to it.

There are a lot of questions about where your content goes after you delete it, questions well worth considering just in case one day you do decide to pull the plug.

Is It Possible?

Whether it is possible to truly delete anything on the Web is often debated. Most likely, the answer is no. Too many copies of a work exist on the Web to reliably remove them all. However, whether copies exist that can be easily found is a different matter.

This issue has caught the attention of the head of the British Library, Lynne Brindley, who said that “If websites continue to disappear in the same way as those on President Bush and the Sydney Olympics… the memory of the nation disappears too.”

Though the examples Brindley gave, the old White House Web site and various sites for the Sydney Olympics, are likely poor ones, the point remains that sites often go down with no clear source for locating their content. The information could most likely be found if there were some dire need, such as a police investigation, but most people are not going to attract enough attention to warrant such steps.

However, the difficulty involved in removing content from the Web is proportional to how interesting others find it. A celebrity sex tape, for example, is virtually impossible to remove from the Web. On the other hand, a 100-page diatribe about plant biology in the South Pacific will likely be much easier to (effectively) remove.

This raises the question, for those of us who are not celebrities, what happens after we hit the “delete” key on our Internet lives? As it turns out, most of the echoes and shadows we cast begin to disappear within just a few moments.

Scrubbing the Web

The good news is that, the moment you delete something from the Web, its presence begins to deteriorate quickly. Though all content we post on the Web leaves shadows, they begin to fade pretty shortly after the work is gone.

Both locally-cached and ISP-cached copies of pages typically disappear within a few hours (though many users will have their local copies preserved for some time, until they purge their local cache). Other temporary cache resources such as Google Cache and Coral Cache typically fade away within a few days.

All of this is automated and requires no action on the part of the content owner. Most sites and services have no interest in permanently caching deleted information.

However, there are exceptions to the rule, most notably the Internet Archive. The Internet Archive not only stores previous versions of Web sites, but keeps the stored versions up after they have been removed. For example, if you want to see what the original pets.com looked like, even though the company behind it is gone, you can check it out.

If you want to be thorough in deleting your content and remove it from the Internet Archive, you need to either use robots.txt exclusion before deleting the site, something that you have to plan for in advance, or file a notice and have them remove it. The former strategy is faster and better, but doesn’t help if you’ve already shut your site down.
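
As a rough illustration of the robots.txt approach, the sketch below uses Python's standard urllib.robotparser module to check that a set of exclusion rules would turn away one crawler while leaving others alone. The user-agent name ia_archiver is an assumption here (it is the name commonly associated with the Internet Archive's crawler, but confirm it against the archive's current documentation), and the example.com URL and the is_blocked helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. "ia_archiver" is an ASSUMED user-agent
# name for the archive's crawler; verify it before relying on it.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/post.html"  # placeholder URL
    print("ia_archiver blocked:", is_blocked(ROBOTS_TXT, "ia_archiver", page))
    print("other bots blocked:", is_blocked(ROBOTS_TXT, "SomeOtherBot", page))
```

Running the rules through a parser before shutting down is a cheap way to catch a typo while the robots.txt file can still be served from the site.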

However, this just covers the incidental and well-known archives. Works are not only copied by search engine spiders and caching systems. Many times, they are copied by people and other bots. In the long run, those are the copies that may be hardest to remove.

Copy Artists

In addition to the copies of a work that get made just by posting a work to the Web, there are many others that humans and bots make that help distribute a work well beyond its original location.

For one, humans, sometimes legitimately, sometimes not, republish work on other Web sites. Whether it is a quote for a review or outright plagiarism, it’s a common activity. Second, scrapers, spammers and aggregators copy and paste content wholesale, usually without permission.

Though one certainly still holds copyright over any work they create, whether it is online or not, filing takedown notices or taking action against every copy is likely impractical. Not only are many of the uses likely to be within the bounds of fair use, but others will be in countries with no takedown system.

The result is that, while you can definitely file notices for many such uses, you can’t for all. Furthermore, the time and energy needed to remove every possible copy would be enormous. This is a large part of why even the most aggressive copyright holders prioritize the infringers they go after.

On the other hand, those sites too have a limited shelf life. Spam blogs are typically shut down within a few weeks or months and most other sites reach a point where they too are shuttered. As a work gets older, interest in it wanes and copying becomes less frequent. Over time, copies of the work may exist, but they are so few and far between that one searching for it would be unlikely to find it.

How long that takes will vary wildly, but within a few years, barring any kind of extreme public interest, most easily located copies of a work will likely be gone.

Conclusions

The bottom line is simple: though there is no way to truly delete something on the Web, it is possible, and even probable, that with time no one will be able to find the work in question. It may require some pre-planning and some work after the fact, but it can be done.

However, don’t expect it to be so easy, or even possible, if your creation is of some heightened public interest. At that point, the adage that it is impossible to disappear from the Web is likely true.

But for the rest of us, whose works attract limited interest and who aren’t celebrities, it is a very different story. We might not be able to disappear completely, but we likely can do so well enough that no one will easily find us.

It is small comfort, but for those who want to disappear to a villa in Key West without thinking of the Web, it is all they can realistically hope for.




  1. By Melissa Donovan posted on January 26, 2009 at 6:06 pm

    This is good information to have – not necessarily for myself, but for my clients. Asking about deleted content is exactly the kind of question I can expect, and now I can just refer folks to this article. Thanks!

  2. By Nan Hoekstra posted on January 26, 2009 at 8:04 pm

    I’ve wondered about all that, and you filled in the blanks with answers…eloquently too. Ah, a post that is almost a poem…
    “As it turns out, most of the echoes and shadows we cast begin to disappear within just a few moments.”

  3. By Miles Technologies posted on January 27, 2009 at 11:13 am

    With the increasing popularity of search engine use, it is true that content on the web can have lasting effects on one’s online and offline reputation. One way to take control over online content is through Online Reputation Management, which works to increase one’s positive internet visibility.

  4. By Steve posted on January 28, 2009 at 10:27 pm

    This is one of those things that you don’t really consider until it’s a little late. Today’s sophisticated data mining engines are a double edged sword if you are trying to get rid of anything completely on the net. Gemalto has a lot of info about personal data security etc., among other digital security topics.

  5. By Scott posted on May 29, 2009 at 3:13 am

    I’ve always felt that a great way to “effectively” delete unwanted content about something that you’re not able to take offline is to add content on the same subject that is portrayed in a more favorable light.
