The 6 Steps to Stop Content Theft

Filed as Features on November 26, 2007 12:35 pm

With spammers and plagiarists becoming more prolific and more aggressive than ever, content theft is no longer a matter of “if”, but “when”.

Where once protecting content was the realm of lawyers and billion-dollar industries, it is now important for Webmasters, large and small, to be familiar with both the laws and the tools available for dealing with content theft.

Fortunately, the steps for fighting plagiarism are easy to follow and, for the most part, the tools are free and readily available.

If you take a few moments to familiarize yourself with the process and technology, you can become a champion plagiarism fighter in short order and get back to the business of running your site before you realize how effective you’ve become.

Step One: Detection

The Internet is vast and detecting content theft can feel like searching for a needle in a haystack. Fortunately, we have tools that are designed to wade through the Web and find exactly what we’re looking for. Though some scrapers and plagiarists are kind enough to leave you trackbacks that lead you straight to their infringement, for those who aren’t that nice, the following tools can make life a lot easier.

  • Copyscape: Punch in a URL, see a list of potential matches. It can’t be any easier. Though the free service might be too limited for many Webmasters, the paid service starts at pennies a search and is well worth the money. Drastic improvements to the service have made it a force to be reckoned with.
  • Google Alerts: Why search for content yourself when Google can do it for you, every single day? Simply punch in a unique phrase from your work, put it in quote marks and Google Alerts will email you suspect sites every day. Great for static content or keystone pieces that are frequently stolen. Also works great with the Digital Fingerprint Plugin for WordPress.
  • Mahalo’s Plagiarism Detection Tool: Add this applet to your browser’s toolbar, highlight some text and click the button. You’ll be whisked to a Google result with all suspect matches. Though really just a simple JavaScript trick, it is great for random plagiarism checks and quick comparisons.
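
For those who prefer to script their spot checks, the quoted-phrase search behind the Google Alerts and Mahalo approaches can be sketched in a few lines of Python. This is only a rough illustration: the helper names `pick_phrase` and `exact_match_url` are hypothetical, and the URL is the ordinary public Google query string, not something either tool documents.

```python
from urllib.parse import quote_plus


def pick_phrase(text: str, words: int = 8) -> str:
    """Take the first `words` words of the longest sentence as a candidate phrase.

    Assumption: a longer sentence is more likely to be unique to your work.
    """
    normalized = text.replace("?", ".").replace("!", ".")
    sentences = [s.strip() for s in normalized.split(".")]
    longest = max(sentences, key=lambda s: len(s.split()))
    return " ".join(longest.split()[:words])


def exact_match_url(phrase: str) -> str:
    """Build a search URL with the phrase wrapped in quote marks."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


if __name__ == "__main__":
    sample = "Short one. This sentence is clearly the longest of them all here. Tiny."
    print(exact_match_url(pick_phrase(sample, words=5)))
```

The same quoted phrase can be pasted into Google Alerts, so the script and the alert are checking for the same thing.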

Step Two: Preserving the Evidence

Once you’ve discovered the misuse of your content, you next need to preserve what you have found. Since the later steps usually result in the infringing site either being altered or taken down, having a personal copy of the site both for your records and to verify what was there previously can be very important in the event that a dispute arises later.

Fortunately, there are several great services that help preserve Web pages and offer some third-party non-repudiation of the results.

  • WebCite: Originally intended for academics, WebCite creates on-demand caches of Web pages and stores them at a simple URL that you can easily access later or offer others as evidence. See a sample snapshot of my site here.
  • Furl: This LookSmart service functions both as an archiving tool and as a bookmarking service. Not only does it create caches of bookmarked pages, but also allows you to organize bookmarks with tags. Also check out MyWeb by Yahoo!.
  • The Internet Archive: The grandfather of all Web archives, the Internet Archive caches pages automatically and keeps a regular archive of most of the Web. Great for situations where you don’t have a personal archive.
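
A personal copy is easier to rely on later if it records when it was captured and a fingerprint of exactly what was captured. The sketch below assumes you have already downloaded the page as raw bytes (for example with `urllib.request.urlopen(url).read()`); the function name `evidence_record` and its field names are hypothetical, chosen only for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def evidence_record(url: str, page_bytes: bytes, captured_at: datetime) -> dict:
    """Describe a saved page: where it came from, when, and a SHA-256 of its bytes."""
    return {
        "url": url,
        # Store the capture time in UTC so it is unambiguous later.
        "captured_at": captured_at.astimezone(timezone.utc).isoformat(),
        # The hash lets you show the saved copy has not changed since capture.
        "sha256": hashlib.sha256(page_bytes).hexdigest(),
        "size_bytes": len(page_bytes),
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice this is the downloaded page.
    page = b"<html><body>copied article text</body></html>"
    record = evidence_record(
        "http://example.com/copied-post", page, datetime.now(timezone.utc)
    )
    print(json.dumps(record, indent=2))
```

Keeping the record next to the saved file complements, rather than replaces, a third-party snapshot from one of the services above.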

Step Three: Contact the Plagiarist (if Practical)

Once the plagiarism has been discovered and the evidence preserved, the next step is to try and resolve the situation. For some, this involves first contacting the plagiarist directly. Though not practical with most spammers and scrapers, it may be possible with human plagiarists and, while results may vary, generally leads to the most amicable solutions.

However, contacting a plagiarist isn’t always as simple as sending an email. Sometimes getting in touch takes a little extra work.

  • Domain Tools: If the plagiarist/spammer uses their own domain, you can perform a Whois search at Domain Tools to locate the contact information for them. If you punch in the domain, click submit and scroll to the bottom, you’ll see all of the contact information for the site. Remember, even if the site uses an anonymous service, such as DomainsByProxy, the email address often forwards to the real account so you can still use that address to contact them.
  • 10 Minute Mail: If the plagiarist doesn’t have their own domain and, instead, is on a service such as a social network, you may have to register for a new account before you can contact them. If that is so, you may want to keep your personal email private when registering for this one-time use account. You can do that easily by using 10 Minute Mail to create a temporary email account you can use to receive registration emails.
  • Commentful: Though leaving a comment on a site is far from the best way to contact a plagiarist, especially considering that it could lead to other legal issues such as defamation and create unnecessary drama, it sometimes is the only approach available. If that is the case, keeping track of comments and replies can be tricky. Rather than checking back with the site regularly, use Commentful to notify you when a reply is posted.
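
The Whois lookup described above can also be done without a Web form, since WHOIS is a plain-text protocol on TCP port 43. The following is a minimal sketch, not how Domain Tools itself works: `query_whois` and `contact_emails` are hypothetical helper names, and you have to supply the right WHOIS server for the domain's registry yourself (`whois.verisign-grs.com` for .com is used here only as an example).

```python
import re
import socket

# Loose pattern for e-mail addresses; good enough for scanning WHOIS text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def query_whois(domain: str, server: str, port: int = 43, timeout: float = 10.0) -> str:
    """Send a WHOIS query (the domain followed by CRLF) and return the raw reply."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def contact_emails(whois_text: str) -> list:
    """Return the unique e-mail addresses in the text, in order of appearance."""
    seen = []
    for match in EMAIL_RE.findall(whois_text):
        address = match.lower()
        if address not in seen:
            seen.append(address)
    return seen


if __name__ == "__main__":
    # Example server for .com domains; other registries use other servers.
    reply = query_whois("example.com", "whois.verisign-grs.com")
    print(contact_emails(reply))
```

Addresses belonging to a privacy or proxy service will show up in the list too, which fits the point above that such addresses often forward to the real account.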

Step Four: Contacting the Advertisers (optional)

Though many find it faster to just demand takedown of the work and be done with it, others want to strike at the heart of the plagiarist: their profit motive. Since one advertising account can serve hundreds, if not thousands, of spam sites, targeting the advertisers is an obvious choice for dealing with spammers. Though not always a good solution or a practical one, it can be a powerful tool.

To aid with that, here are several resources to help make the process easier.

  • Adsense DMCA Policy: Far and away the most common advertising network on the Web, Adsense comes up more often than any other service on plagiarists’ sites. Having their DMCA policy handy is absolutely essential if you plan to take this route.
  • Plagiarism Today’s Stock Letters: Since most advertising networks require a DMCA notice to take action, you are going to need a template. Fortunately I provide one on my site that has worked well for me over the years.
  • FaxZERO: Many advertising networks, including Adsense, will only accept faxed or mailed communications. In this age of email and Internet that can be extremely frustrating. Fortunately, FaxZERO provides a means to send quick faxes via the Internet at no cost.
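
Whichever template you start from, it can help to check a draft against the elements the DMCA lists for a valid notice (17 U.S.C. § 512(c)(3)) before faxing it. The sketch below is a simple checklist, not a stand-in for the stock letters mentioned above; the dictionary keys and the function name `missing_elements` are hypothetical labels chosen for illustration.

```python
# Elements of a notice as listed in 17 U.S.C. 512(c)(3)(A), paraphrased.
# The keys are hypothetical labels, not official terms.
REQUIRED_ELEMENTS = {
    "signature": "Physical or electronic signature of the owner or their agent",
    "original_work": "Identification of the copyrighted work being infringed",
    "infringing_material": "Identification and location (URL) of the infringing material",
    "contact_info": "Address, phone number or e-mail of the complaining party",
    "good_faith_statement": "Statement of good-faith belief that the use is unauthorized",
    "accuracy_statement": "Statement, under penalty of perjury, that the notice is "
                          "accurate and the sender is authorized to act",
}


def missing_elements(notice: dict) -> list:
    """Return the keys of required elements that are absent or blank in the draft."""
    return [
        key for key in REQUIRED_ELEMENTS
        if not str(notice.get(key, "")).strip()
    ]


if __name__ == "__main__":
    draft = {
        "original_work": "http://example.com/my-original-post",
        "infringing_material": "http://example.net/copied-post",
        "contact_info": "me@example.com",
    }
    for key in missing_elements(draft):
        print("Missing:", REQUIRED_ELEMENTS[key])
```

A draft that passes the check is only complete in form. Each network's own policy page still decides where and how the notice has to be sent.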

Step Five: Contacting the Host

Of all the methods of resolving plagiarism issues, contacting the host to get the offending site/page removed is almost always the fastest and most reliable. Laws, such as the Digital Millennium Copyright Act (PDF), require hosts to remove infringing materials once they have been properly notified.

For those interested only in a quick, clean resolution to the matter, this route is almost certainly the first, and final, cessation step.

  • Domain Tools (linked above): If the plagiarist is hosting their own domain, then Domain Tools comes to the rescue again. In addition to providing Whois information, Domain Tools also provides information about the host. Simply punch in the domain you want to learn more about and scroll down to the “Server Info” heading. You can see where the domain is hosted and who it is hosted by in the “IP Location” line. You can also click the Red “W” on the line above it to perform an IP Whois and obtain even more information.
  • DMCA Contact Information: If you need to know who the DMCA contact for a particular host is, the DMCA Contact Information Page on Plagiarism Today may be a good place to start. With over 100 major hosts listed, there is a very good chance that the host you are looking for is already included.
  • Copyright Office’s Directory of Agents: If you can’t find the DMCA contact information on the list above or by searching the host’s site, usually under their terms of service or “legal” page, then check with the United States Copyright Office and see if they have registered there. Since the DMCA requires hosts to register in order to obtain the protections that the DMCA provides them, there is a very good chance that they have.

Step Six: Contacting the Search Engines

The last step, if all else has failed, is generally to contact the search engines and get the offending content removed. Though it doesn’t actually remove the content from the Web, it prevents others from finding it, stops the plagiarist from gaining any benefit from it and keeps the misuse from damaging your rankings for shared search terms.

Fortunately, the DMCA also requires search engines to remove infringing URLs once properly notified and, even if the other techniques fail, this one works very reliably.

  • Google Site Status Wizard: Before you can report a site to Google for infringement, you have to be sure that it is indexed. If you didn’t discover the plagiarism through the search engine initially, use their Site Status tool to see if the site is indexed already. You can also use it to ensure that the site is delisted from the search engine after your complaint is sent.
  • PrimoPDF: Google’s DMCA policy requires a handwritten signature before they will act on a complaint. You can either fax the complaint in, perhaps using FaxZERO above, or you can scan your signature, place it into a document and print it to a PDF. From there, you can email it to their DMCA agent (PDF). PrimoPDF makes it easy to print to a PDF from any application. Also, you can use OpenOffice.org to create the letter and export it to a PDF directly.
  • Plagiarism Today’s Stock Letters (Linked Above): Dealing with search engines requires a special DMCA notice. However, just such a notice is provided at my site. You can use it in conjunction with OpenOffice.org or PrimoPDF to create your letters to send to the various search engines.

Conclusions

Though plagiarism fighting has not traditionally been the realm of your everyday author or artist, the Internet has forever changed the game. Fortunately, the technology has risen to the challenge and empowered us to protect our content in ways that the bad guys never could have envisioned. Better still, it continues to improve, promising us new tools in the coming months and years to protect our content even more effectively.

In the meantime, it is important that we learn the laws, procedures and tools that are at our disposal and do the best with what we have. Though the approach may be somewhat hodgepodge, it is very effective and has worked for me in over 600 cases.

However, what is important about me is that there is nothing special about me in this matter. Six years ago I had no interest in copyright law. I learned the techniques the same way other Webmasters do today, by hitting the books and learning from others.

Though it sounds like a great deal of work, I cannot think of anything else that has been so easy for me to learn or had so many wonderful people there to help me.

Considering how important this issue is and how little time and energy is truly required, there is no reason not to familiarize yourself with the procedure and make use of the resources available.

This post was written by Jonathan Bailey.

  1. PlagiarismToday » Copyright 2.0 Show - Episode 34 - Internation Incidents (November 26, 2007 at 1:30 pm)
  2. By TechDune posted on November 26, 2007 at 8:31 pm

    Thanks a million Jonathan,
    I have been going through some serious problems. Many bloggers started copying articles from my blog and posted them in their blogs, and some even got stumbled.
    I have made some arrangements already.
    That was a healthy list.

  3. By syaifudin zuhri posted on November 26, 2007 at 8:54 pm

    Fantastic, I need this

  4. By Tony Hung posted on November 26, 2007 at 10:40 pm

    Wow … what a fantastic post. Thanks for breaking it down, Jonathan! :)

  5. Linker Barn: Tuesday, November 27 (November 27, 2007 at 12:31 am)
  6. By Vienna posted on November 27, 2007 at 1:22 am

    An excellent and timely post because indeed, this internet plagiarism is a growing problem for many bloggers.

    It’s really astounding how fast the internet is growing and how easy it is for plagiarists to copy one’s work, so thanks for the advice on how to handle plagiarism. I hope that the tools we use and laws that are implemented now are continuously developed to keep up with the growing needs of internet writers, bloggers, etc. to be protected from plagiarism.

  7. By erisian posted on November 27, 2007 at 11:20 am

    phenomenal job ! Thanks to both you and Lorell for the great content.

    you have an error in your “about” section..
    Check the word “ago”

    Jonathan Bailey writes at Plagiarism Today, a site about plagiarism, content theft and copyright issues on the Web. He started Plagiarism Today about in 2005 [ago] as a way

  8. By Jonathan Bailey posted on November 27, 2007 at 11:58 am

    TechDune: I’m glad that the article was helpful but I am sorry to hear about your troubles. If I can help in any way, please let me know! You can also post to the Performancing Legal Issues Forum if needed.

    Syaifudin: Very welcome!

    Tony: My pleasure, it is still frustrating that I had to pass over some great tools. Another post though.

    Vienna: I hope so too. However, it appears that the laws are starting to lag behind the needs on the Web, especially when you look at it from an international stance. It may be time for another broad international treaty, just to keep pace.

    Erisian: Thanks for the correction, I’m fixing it now. That’s what I get for hammering that out in about five minutes the day before my first post…

  9. By Jonathan Dingman posted on November 27, 2007 at 2:45 pm

    Wow guys, awesome write up! This is an article I’ve been looking for, for a while now.

    *scratches head and wonders how else he can socialize this article…*

  10. Stopping Content Theft at Burst Blog (November 27, 2007 at 4:37 pm)
  11. By Dean Rieck posted on November 27, 2007 at 11:27 pm

    I’ve found a number of my articles reprinted without my permission. But interestingly, many provide a link or at least a Web address leading people to my site.

    One reason could be that I encourage people to repost my content and all I ask for is a byline and a link to my site. I provide a “reprint policy” link with every article that launches a popup which asks people to not be a jerk and please just ask me before they use my content. This is on my main site and not on my blog, since my blog posts are not as thorough as the articles.

    The real thieves online are print publishers who think they own all the content they publish. Virtually all of my articles have been abducted by print publishers who actually sell my articles online with no compensation for me. I don’t make a fuss, though, since it’s free promotion.

  12. By Jonathan Bailey posted on November 28, 2007 at 1:08 am

    Dean: Like you, I am fine with a lot of reuse of my content. I have a Creative Commons License and encourage others to take advantage of it. The problem is that too many people are now violating the simple “don’t be a jerk” premise.

    A lot of them are spammers and scrapers that do it automatically, others are just idiots that don’t want to give a link back.

    I am a bit confused when you talk about the print publishers. Are you saying that you’ve had work you’ve put on the Web taken by print publishers and sold or is there something else going on? If that is the case, you might want to reconsider making a fuss, it would be very easy to do, if you can prove it, and depending on the situation, there could be a decent amount of money involved.

    That would be something for you to discuss with your attorney. However, I don’t think there is anything wrong with getting what you are owed.

  13. links for 2007-11-28 at Jacob Christensen (November 28, 2007 at 7:24 am)
  14. By Guardian Angel posted on November 28, 2007 at 7:47 am

    Hi! It has been quite a while since I have been looking for a post about this on-line theft prevention, and I am glad to see your site through google alerts. Now I can safely say that somehow I am protecting myself. Another thing, though I have only less than a thousand visitors per day, I think I should have this copyscape widget, it makes sense. Being new here, I really want to thank you for such great tips!

  15. By Kelly Jad'on posted on November 28, 2007 at 8:27 am

    Great article! I myself have been a victim of splogging. Here is another article on the subject: http://desicritics.org/2007/09/04/062516.php

    Kelly Jad’on
    http://www.BasilAndSpice.com

  16. By Jonathan Bailey posted on November 28, 2007 at 10:53 am

    Kelly: Thanks for the compliments and for the link. Great article!

  17. How to stop content theft–Blog Magazine (November 28, 2007 at 12:00 pm)
  18. By Paul Egges posted on November 28, 2007 at 9:35 pm

    What about images? I’ve found cases where images from one company were used by another.

  19. By Jonathan Bailey posted on November 29, 2007 at 2:00 am

    Paul: Detecting images is tricky, but the rest of the steps pretty much remain the same. I’m working on a new method to detect image plagiarism and I should have something about that on this site and mine soon.

    However, since you already found the plagiarism, steps 2-6 still apply pretty much the same as if it were text. If you have any specific questions, feel free to ask!

  20. Cristina will help you. » Before you copy and paste… (November 30, 2007 at 7:20 pm)
  21. BlogBuzz December 1, 2007 » Webmaster-Source (December 1, 2007 at 5:15 am)
  22. By Jonathan Dingman posted on December 1, 2007 at 9:46 pm

    I actually fly on the other side of the spectrum.

    I do not like anyone re-printing — scrapping it — because it doesn’t actually add any value to the web.

    If someone wants to take an excerpt from my article, great! I would happily allow that.

    But do not, by any means, just copy text from my article or re-print it with some random name as the link back — or not even linking back at all.

    I simply do not like sploggers. It’s just cluttering up the Internet more and more everyday with more worthless crap.

  23. By Jonathan Bailey posted on December 1, 2007 at 10:14 pm

    Jonathan: Though I agree that I don’t like any and all spam bloggers, I have also grown to realize that limited reuse can have a purpose, especially if it is targeted at different audiences.

    However, I have to say that the amount of reuse I consider to be acceptable has gone down in recent years, mainly due to spam blogging. Sure, some people use my content in ways that I would gladly permit under my CC licenses, but it seems the bulk of reuse I’m seeing comes from spammers and junk purveyors.

    I guess you hit at a good point, we should stop and ask not so much if this is good for us, but good for the Web. No easy answers there that I see.

    Thank you for your input.

  24. This Week’s Bookmarks at Not So Relevant (December 2, 2007 at 3:04 am)
  25. By Jonathan Dingman posted on December 3, 2007 at 12:29 am

    Jonathan,

    I agree with you to a certain degree about re-producing content, but from a search engine’s point of view, it’s just a bunch of crap.

    That’s why I take such a firm stance on copying any of my content. I don’t show full feeds RSS anymore, only partial, because so many people were stealing my content.

    My subscriber rate went up and so did my traffic, so the switch was good for me.

    But aside from that, I always want to keep the web’s best interest at heart. The web only exists because people like us exist. If webmasters didn’t care, Google would just have a search engine full of pay per click ads and make money on every single click — and they wouldn’t even need organic search (and they wouldn’t care whether someone re-produced your content or not).

    Just another note from my side of things.

    Thanks for the great article again, it’s a good write-up.

  26. links for 2007-12-07 « AB’s reflections (December 7, 2007 at 3:27 pm)
  27. Sunday Links 1 | Earn Blogger (December 9, 2007 at 1:04 am)
  28. Blog Plagiarism @ Jupiter Media? | Scott Clark - Finding the Sweet Spot (December 9, 2007 at 9:16 am)
  29. Improve Your Blog: Usability : The Blog Herald (December 12, 2007 at 1:08 am)
  30. bonq.net/flipp » Blog Archive » daily del.icio.us (December 13, 2007 at 9:59 am)
  31. Protecting your work from content theft online « Brainripples (December 18, 2007 at 10:47 am)
  32. What Do You Want Gone From the Web in 2008? : The Blog Herald (December 20, 2007 at 10:47 am)
  33. By Senuf posted on December 30, 2007 at 8:34 pm

    Hello. Your article is NECESSARY stuff, indeed. I bookmarked it, and I’ll add a link to it from my blog, if you don’t mind (anyway, nobody reads my blog, heheh –and don’t even try, Jonathan, it’s in spanish).

  34. I Need to Comment More. Don’t You? : The Blog Herald (January 10, 2008 at 6:20 am)
  35. links for 2008-01-12 » eWhisper.net (January 11, 2008 at 7:37 pm)
  36. (EMP) E-Marketing Performance » : » Team Reading List 1.14.08 (January 14, 2008 at 3:27 pm)
  37. By Solutions posted on January 17, 2008 at 12:44 am

    Great information:

    As your writing skills increase, you are more open to these types of flattery/attack. Sometimes you just want to shake them down so you can confront them with a “Dude, write your own”! Warning, but oftentimes, the webmasters just use harvesting tools and fake aliases for the domain info. Great tips, thanks.

  38. By Bud posted on January 18, 2008 at 5:13 pm

    Great article, I have been having some issues with my blog getting scraped by a few sites. I will try some of these things and hopefully I can get it resolved. I really hate how you work on writing something and people steal it to try and benefit themselves.

    Bud
    http://www.budcalabrese.com

  39. The LC on the DL » Blog Archive » links for 2008-02-07 (February 6, 2008 at 8:25 pm)
  40. Protect Your Most Valuable Blog Resource, Stop Content Scraping and Plagiarism | WordPress Philippines (February 27, 2008 at 7:02 am)
  41. Breaking Trust: How Not To Link to a Plagiarist : The Blog Herald (March 4, 2008 at 1:06 pm)
  42. Blogs Are Public Documents - Bloggers and Commenters Beware : The Blog Herald (March 17, 2008 at 7:48 pm)
  43. Blogging is About Writing - and Not : The Blog Herald (March 25, 2008 at 11:27 am)
  44. Cleaning Blogspot Spam: Is Google Responding to Public Pressure? : The Blog Herald (April 1, 2008 at 8:30 am)
  45. Don’t get stressed about blog scrapers stealing your content (May 15, 2008 at 12:15 pm)
  46. By Mia Tyler posted on June 19, 2008 at 10:20 am

    Hey!…Thanks for the nice read, keep up the interesting posts about scan ip address..what a nice Thursday .

  47. By Most important step - Prevention posted on June 22, 2008 at 3:38 am

    The most important is to prevent plagiarism, and you don’t even mention it. It is time for us to be “more aggressive” and stop them before they steal.
    I had a Google PR5 site with good, unique content, and they come, and they take it away. My page rank drop to 1 , my earnings on ad sense simply melted, i was in despair, then I have found: http://donotcopy.org to be honest, it looks stupid, but it works! Last six months i had 0 copy attempts, my page rank is 5 again, ad sense earnings are back to normal. Thank you do not copy , thank you (sorry for my bad English but i had to spread the word).
    S.V. my site is: http://uradi.com

  48. By LukeM posted on July 1, 2008 at 1:11 am

    Very handy info, many thanks. Here’s an article on secure PDF’s which may also be helpful for preventing content theft.

  49. Blogging Jobs: How Much Are Bloggers Paid to Blog? : The Blog Herald (July 15, 2008 at 12:35 pm)
  50. Prepare Yourself for the Blog Bullies | The Blog Herald (August 2, 2008 at 5:19 am)