With spammers and plagiarists becoming more prolific and more aggressive than ever, content theft is no longer a matter of “if”, but “when”.
Where once protecting content was the realm of lawyers and billion-dollar industries, it is now important for Webmasters, large and small, to be familiar with both the laws and the tools available for dealing with content theft.
Fortunately, the steps for fighting plagiarism are easy to follow and, for the most part, the tools are free and readily available.
If you take a few moments to familiarize yourself with the process and technology, you can become a champion plagiarism fighter in short order and get back to the business of running your site before you realize how effective you’ve become.
Step One: Detection
The Internet is vast and detecting content theft can feel like searching for a needle in a haystack. Fortunately, we have tools that are designed to wade through the Web and find exactly what we’re looking for. Though some scrapers and plagiarists are kind enough to leave you trackbacks that lead you straight to their infringement, for those who aren’t that nice, the following tools can make life a lot easier.
- Copyscape: Punch in a URL, see a list of potential matches. It can’t be any easier. Though the free service might be too limited for many Webmasters, the paid service starts at pennies a search and is well worth the money. Drastic improvements to the service have made it a force to be reckoned with.
- Google Alerts: Why search for content yourself when Google can do it for you, every single day? Simply punch in a unique phrase from your work, put it in quote marks and Google Alerts will email you suspect sites every day. Great for static content or keystone pieces that are frequently stolen. Also works great with the Digital Fingerprint Plugin for WordPress.
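Choosing the phrase to feed into Google Alerts can itself be automated. What follows is only a minimal sketch in Python, not a feature of any of the tools above: the function name, the tiny list of common words and the scoring rule are all assumptions made for illustration. It slides a window across the text and keeps the run of words that looks least generic, then wraps it in quote marks.

```python
import re

# A tiny, illustrative list of common words (an assumption, not a standard list).
COMMON = frozenset(
    "a an and are as at be but by for from has have in is it of on or "
    "that the this to was with".split()
)


def pick_alert_phrase(text: str, phrase_len: int = 8) -> str:
    """Return a quoted run of `phrase_len` consecutive words that looks distinctive.

    Score for a window = total length of its words that are not in COMMON.
    Punctuation between words is dropped, so the phrase may span a comma.
    """
    words = re.findall(r"[A-Za-z0-9'’-]+", text)
    if len(words) <= phrase_len:
        return '"' + " ".join(words) + '"'

    best_start, best_score = 0, -1
    for start in range(len(words) - phrase_len + 1):
        window = words[start:start + phrase_len]
        score = sum(len(w) for w in window if w.lower() not in COMMON)
        if score > best_score:  # strict comparison keeps the earliest best window
            best_start, best_score = start, score

    return '"' + " ".join(words[best_start:best_start + phrase_len]) + '"'
```

Paste the returned string, quote marks included, into the alert. A longer window gives fewer false matches but is easier for a plagiarist to break by changing a single word, so it may be worth setting up two or three alerts from different parts of the same piece.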
Step Two: Preserving the Evidence
Once you’ve discovered the misuse of your content, you next need to preserve what you have found. Since the later steps usually result in the infringing site either being altered or taken down, having a personal copy of the site both for your records and to verify what was there previously can be very important in the event that a dispute arises later.
Fortunately, there are several great services to help preserve Web pages on the Web and offer some third-party non-repudiation of the results.
- WebCite: Originally intended for academics, WebCite creates on-demand caches of Web pages and stores them at a simple URL that you can easily access later or offer others as evidence. See a sample snapshot of my site here.
- Furl: This LookSmart service functions both as an archiving tool and as a bookmarking service. Not only does it create caches of bookmarked pages, but also allows you to organize bookmarks with tags. Also check out MyWeb by Yahoo!.
- The Internet Archive: The grandfather of all Web archives, the Internet Archive caches pages automatically and keeps a regular archive of most of the Web. Great for situations where you don’t have a personal archive.
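For the personal copy mentioned at the start of this step, a small script can stamp each saved page with the time of capture and a checksum. This is a rough sketch under stated assumptions: the function name, the "evidence" folder and the record layout are hypothetical, and a file on your own disk does not carry the third-party weight of the services above, so treat it as a supplement to them.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def preserve_evidence(url: str, page_bytes: bytes, out_dir: str = "evidence") -> dict:
    """Save a copy of a page plus a small JSON record describing the capture."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)

    # UTC timestamp and SHA-256 digest identify when and what was captured.
    captured_at = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    digest = hashlib.sha256(page_bytes).hexdigest()
    stem = f"{captured_at}_{digest[:12]}"

    (folder / f"{stem}.html").write_bytes(page_bytes)
    record = {
        "url": url,
        "captured_at_utc": captured_at,
        "sha256": digest,
        "file": f"{stem}.html",
    }
    (folder / f"{stem}.json").write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

The page bytes can come from anywhere, for example `urllib.request.urlopen(url).read()` from the standard library. The checksum lets you show later that the saved file has not been altered since the record was written.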
Step Three: Contact the Plagiarist (if Practical)
Once the plagiarism has been discovered and the evidence preserved, the next step is to try to resolve the situation. For some, this involves first contacting the plagiarist directly. Though not practical with most spammers and scrapers, it may be possible with human plagiarists and, while results may vary, it generally leads to the most amicable solutions.
However, contacting a plagiarist isn’t always as simple as sending an email; sometimes getting in touch takes a little extra work.
- Domain Tools: If the plagiarist/spammer uses their own domain, you can perform a Whois search at Domain Tools to locate the contact information for them. If you punch in the domain, click submit and scroll to the bottom, you’ll see all of the contact information for the site. Remember, even if the site uses an anonymous service, such as DomainsByProxy, the email address often forwards to the real account so you can still use that address to contact them.
- 10 Minute Mail: If the plagiarist doesn’t have their own domain and, instead, is on a service such as a social network, you may have to register for a new account before you can contact them. If that is so, you may want to keep your personal email private when registering for this one-time use account. You can do that easily by using 10 Minute Mail to create a temporary email account you can use to receive registration emails.
- Commentful: Though leaving a comment on a site is far from the best way to contact a plagiarist, especially considering that it could lead to other legal issues such as defamation and create unnecessary drama, it sometimes is the only approach available. If that is the case, keeping track of comments and replies can be tricky. Rather than checking back with the site regularly, use Commentful to notify you when a reply is posted.
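The Whois lookup described above can also be done from a script. The sketch below is an illustration rather than a replacement for Domain Tools: it speaks the plain Whois protocol (a query line sent to port 43) and pulls out anything that looks like an email address. The default server shown is the registry for .com and .net domains, which often returns only a pointer to the registrar's own Whois server, so a second query against that server may be needed, and many records today hide or proxy the contact address. The function names are my own.

```python
import re
import socket
from typing import List

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def whois_query(domain: str, server: str = "whois.verisign-grs.com",
                timeout: float = 10.0) -> str:
    """Send one Whois query (domain + CRLF on port 43) and return the raw reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def contact_emails(whois_text: str) -> List[str]:
    """Return the distinct email addresses in a Whois reply, in order of appearance."""
    seen: List[str] = []
    for match in EMAIL_RE.findall(whois_text):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen
```

A typical use would be `contact_emails(whois_query("example.com"))`. Keeping the parsing separate from the network call means it also works on Whois text copied from any lookup page.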
Step Four: Contacting the Advertisers (optional)
Though many find it faster to just demand takedown of the work and be done with it, others want to strike at the heart of the plagiarist: their profit motive. Since one advertising account can serve hundreds, if not thousands, of spam sites, targeting the advertisers is an obvious choice for dealing with spammers. Though not always a good solution or a practical one, it can be a powerful tool.
To aid with that, here are several resources to help make the process easier.
- Adsense DMCA Policy: By far and away the most common advertising network on the Web, Adsense comes up more often than any other service on plagiarists’ sites. Having their DMCA policy handy is absolutely essential if you plan to take this route.
- Plagiarism Today’s Stock Letters: Since most advertising networks require a DMCA notice to take action, you are going to need a template. Fortunately I provide one on my site that has worked well for me over the years.
- FaxZERO: Many advertising networks, including Adsense, will only accept faxed or mailed communications. In this age of email and Internet that can be extremely frustrating. Fortunately, FaxZERO provides a means to send quick faxes via the Internet at no cost.
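If you send more than a handful of notices, filling in the template by script saves time. The sketch below is not my stock letter and its wording is only a placeholder. It simply assembles the elements the DMCA asks a notice to contain: the original work, the infringing material and where it is, your contact information, a good-faith statement, a statement of accuracy under penalty of perjury, and a signature. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NoticeDetails:
    sender_name: str
    sender_contact: str          # postal address, phone or email
    original_urls: List[str]     # where your own work is published
    infringing_urls: List[str]   # where the copies appear


def draft_notice(details: NoticeDetails) -> str:
    """Assemble a plain-text notice draft; the wording is placeholder text."""
    if not details.original_urls or not details.infringing_urls:
        raise ValueError("List at least one original URL and one infringing URL.")

    lines = [
        "To whom it may concern:",
        "",
        "I am the copyright holder of the work published at:",
        *[f"  {url}" for url in details.original_urls],
        "",
        "The following pages reproduce that work without authorization:",
        *[f"  {url}" for url in details.infringing_urls],
        "",
        "I have a good faith belief that this use is not authorized by the",
        "copyright owner, its agent, or the law.",
        "",
        "The information in this notice is accurate and, under penalty of",
        "perjury, I am the owner of the copyright involved or am authorized",
        "to act on the owner's behalf.",
        "",
        f"Contact: {details.sender_contact}",
        "",
        f"Signed: {details.sender_name}",
    ]
    return "\n".join(lines)
```

The output is plain text, so it can be pasted into whatever you use to print, sign and fax the letter. Swap the placeholder sentences for the wording of the template you actually rely on.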
Step Five: Contacting the Host
Of all the methods of resolving plagiarism issues, contacting the host to get the offending site/page removed is almost always the fastest and most reliable. Laws, such as the Digital Millennium Copyright Act (PDF), require hosts to remove infringing materials once they have been properly notified.
For those interested only in a quick, clean resolution to the matter, this route is almost certainly the first, and final, cessation step.
- Domain Tools (linked above): If the plagiarist is hosting their own domain, then Domain Tools comes to the rescue again. In addition to providing Whois information, Domain Tools also provides information about the host. Simply punch in the domain you want to learn more about and scroll down to the “Server Info” heading. You can see where the domain is hosted and who it is hosted by in the “IP Location” line. You can also click the Red “W” on the line above it to perform an IP Whois and obtain even more information.
- DMCA Contact Information: If you need to know who the DMCA contact for a particular host is, the DMCA Contact Information Page on Plagiarism Today may be a good place to start. With over 100 major hosts listed, there is a very good chance that the host you are looking for is already included.
- Copyright Office’s Directory of Agents: If you can’t find the DMCA contact information on the list above or by searching the host’s site, usually under their terms of service or “legal” page, then check with the United States Copyright Office and see if they have registered there. Since the DMCA requires hosts to register in order to obtain the protections that the DMCA provides them, there is a very good chance that they have.
Step Six: Contacting the Search Engines
The last step, if all else has failed, is generally to contact the search engines and get the offending content removed. Though it doesn’t actually remove the content from the Web, it prevents others from finding it, stops the plagiarist from gaining any benefit from it and keeps the misuse from damaging your rankings for shared search terms.
Fortunately, the DMCA also requires search engines to remove infringing URLs once properly notified and, even if the other techniques fail, this one works very reliably.
- Google Site Status Wizard: Before you can report a site to Google for infringement, you have to be sure that it is indexed. If you didn’t discover the plagiarism through the search engine initially, use their Site Status tool to see if the site is indexed already. You can also use it to ensure that the site is delisted from the search engine after your complaint is sent.
- PrimoPDF: Google’s DMCA policy requires a handwritten signature before they will act on a complaint. You can either fax the complaint in, perhaps using FaxZERO above, or you can scan your signature, place it into a document and print it to a PDF. From there, you can email it to their DMCA agent (PDF). PrimoPDF makes it easy to print to a PDF from any application. Also, you can use OpenOffice.org to create the letter and export it to a PDF directly.
- Plagiarism Today’s Stock Letters (Linked Above): Dealing with search engines requires a special DMCA notice. However, just such a notice is provided at my site. You can use it in conjunction with OpenOffice.org or PrimoPDF to create your letters to send to the various search engines.
Though plagiarism fighting has not traditionally been the realm of your everyday author or artist, the Internet has forever changed the game. Fortunately, the technology has risen to the challenge and empowered us to protect our content in ways that the bad guys never could have envisioned. Even better, it continues to rise even higher, promising us new tools in the coming months and years to even better protect our content.
In the meantime, it is important that we learn the laws, procedures and tools that are at our disposal and do the best with what we have. Though the approach may be somewhat hodgepodge, it is very effective and has worked for me in over 600 cases.
However, what is important about me is that there is nothing special about me in this matter. Six years ago I had no interest in copyright law. I learned the techniques the same way other Webmasters do today, by hitting the books and learning from others.
Though it sounds like a great deal of work, I cannot think of anything else that has been so easy for me to learn or had so many wonderful people there to help me.
Considering how important this issue is and how little time and energy is truly required, there is no reason not to familiarize yourself with the procedure and make use of the resources available.