Technology has been very kind to the plagiarist.
Where once the plagiarist would have to re-type the paper or repaint the portrait, content theft now is just a mouse click or a keyboard shortcut away. Worse yet, whole technologies have been built around content theft. For example, RSS scraping applications can steal the content from thousands of feeds in a single hour, creating countless spam blogs.
However, technology is a double-edged sword. At the same time it has made content theft easier than ever, it has also empowered content producers with new, more powerful means of monitoring and enforcing their content rights.
No longer does a copyright holder have to wait to accidentally discover plagiarism or hope that a bystander will alert them. No longer is enforcement a long, arduous process. Every Webmaster, no matter how small, has the tools they need to track and stop theft of their content.
It is simply a matter of knowing where to look.
Preventing content theft is something of a holy grail. It is the perfect solution, but also the least practical. The tools needed to prevent copying of work generally do more to annoy legitimate users than to stop plagiarists. That being said, there are a few prevention tools worth taking a look at.
Pictureshark – A hard-to-remove translucent watermark is by far the most effective method of preventing image theft. Pictureshark is a fast, free and powerful batch image watermarking tool that can process hundreds of images with a variety of effects.
Devpapers .htaccess Hotlink Protection – For Webmasters that pay their bandwidth bills, image hotlinking is a double problem. Not only is it a form of content theft, but it is also bandwidth theft, as every load of the plagiarist’s page requires the image to be pulled from the original server. Webmasters should test to see if their images can be hotlinked and, if they can, consider editing the .htaccess file or using a PHP script to prevent hotlinking.
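The core of any hotlink check, whether it lives in .htaccess or in a script, is a look at the Referer header sent with each image request. The following is a minimal sketch of that logic in Python, not the Devpapers code itself. The function name and the ALLOWED_HOSTS list are hypothetical, and the choice to let requests with no Referer through is an assumption made so that visitors whose browsers strip the header are not blocked.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: the hosts that are permitted to embed your images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True when an image request appears to come from another site."""
    if not referer:
        # Assumption: a missing Referer is treated as legitimate, because
        # many browsers and privacy tools omit the header entirely.
        return False
    host = urlparse(referer).hostname
    if host is None:
        # The header is present but is not a usable URL; treat it as foreign.
        return True
    return host not in allowed_hosts


if __name__ == "__main__":
    print(is_hotlink("http://www.example.com/gallery.html"))
    print(is_hotlink("http://spamblog.example.net/stolen-post"))
```

Because the Referer header is easy to forge or omit, a check like this discourages casual hotlinking rather than stopping a determined scraper.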
Bad Behavior – A PHP script available for most CMS platforms, Bad Behavior is an anti-spam tool that can also be used to stop some forms of automated content theft. Though not necessarily useful against RSS scraping, any “evil” bots that visit your sites, no matter for what reason, are likely to be caught in Bad Behavior’s net. This can stop malicious spidering and automated saving of content.
Watermark.Ws – Don’t have time to download software to watermark your images? Use Watermark.ws and add your overlays on the Web. Watermark.Ws lets you add text or an image over your copyrighted work and set the alpha level, enabling centrally-located and more powerful watermarks.
Detecting content theft, though not as desirable as preventing it, is much easier. There are many tools that can easily detect content theft and, from there, one can easily follow up on it. Best of all, this has no impact on the legitimate readers of your site, just those that abuse your content.
Google Alerts – Rather than searching for your own content by hand from time to time, let Google Alerts do it for you. Punch in a few unique phrases from your work, set Google Alerts to inform you when those phrases appear on the Web and relax. Best of all, it can be combined with other tools below for an even more powerful experience.
Copyscape – Based upon the Google API, Copyscape enables you to search for plagiarism of an entire page. It looks for content theft that traditional Google searches and Google Alerts may miss, including sites that take only a part of your work. The free version is very limited and will only display the top ten results. Thus, it may not be practical for sites that allow some reuse of their content.
Digital Fingerprint Plugin – Maxpower’s Digital Fingerprint Plugin for WordPress appends a unique phrase or key to the end of every post in your RSS feed. It then offers tools to help you search for that fingerprint on the Web. The plugin also works well with Google Alerts.
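For those not on WordPress, the fingerprinting idea is simple enough to sketch by hand. The example below is an illustration only and does not reflect how the plugin is actually written. The function names and the secret phrase are hypothetical, and deriving the key from a SHA-256 hash is just one convenient way to get a string that is unlikely to appear anywhere else on the Web.

```python
import hashlib


def make_fingerprint(site_secret, length=16):
    """Derive a short, unusual string that can later be searched for."""
    digest = hashlib.sha256(site_secret.encode("utf-8")).hexdigest()
    # The "fp" prefix is an arbitrary, hypothetical marker.
    return "fp" + digest[:length]


def append_fingerprint(post_text, fingerprint):
    """Return the feed version of a post with the fingerprint at the end."""
    return post_text.rstrip() + "\n\n" + fingerprint


if __name__ == "__main__":
    key = make_fingerprint("my blog's private phrase")
    print(append_fingerprint("Technology has been very kind to the plagiarist.", key))
```

The resulting key can then be entered into a search engine or an alert service, so that any feed scraper republishing the post verbatim also republishes the fingerprint.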
Technorati Watchlists – Much like Google Alerts, Technorati watchlists can be used to inform you instantly when unique keywords or a fingerprint appears on another blog. A very powerful tool for blogs.
Google Image Search – Detecting image plagiarism is very difficult. However, if you give your images unique file names, you can search for those names in Google Image Search and locate duplicates that way. Most plagiarists do not bother to change image names when putting them up on their sites, making it very easy to spot such infringements.
Once plagiarism has been detected, it has to be stopped before the detection is of any use. Fortunately, there are several tools to help.
Copyfeed – A veritable Swiss Army knife of content protection, Copyfeed not only adds a digital fingerprint to detect infringement, but can also be used to embed the IP address of RSS scrapers in the posts and then, in turn, ban them from accessing the feed. For WordPress users, this plugin is practically a must-have.
Ebay VeRO Program – If your content regularly appears on Ebay, it might be worth your time to sign up for Ebay’s Verified Rights Owner Program to enable you to easily close auctions that infringe upon your rights. VeRO is easily the most powerful program of its kind on the Web and worthwhile for any Webmaster that finds a great deal of their work on Ebay.
Sometimes, when stopping plagiarism or content theft, you cannot take action yourself and, instead, have to report it to someone else. In those cases, there are many different tools and resources to help.
Domain Tools – Need to quickly find out who the host is of a dot com? Domain Tools can help. Just type in the domain and you’ll get all of the information you need about the site. Under “Server Data” you can easily locate all of the information about the server, including who operates it.
DMCA Templates – If you’re going to report a site to a U.S.-based host, you are going to need to file a DMCA notice. To do that, you’ll need a DMCA template. Fortunately, Ian McAnerin has posted templates of DMCA notices on his site, including one for each of the major search engines and a generic ISP one.
Plagiarism Today’s DMCA Contact Information – Once you know who the host is, the question becomes who to contact there. On my site, I’ve compiled a list of links to over 100 of the largest hosts, advertising networks and search engines. If you notice infringing content on a site hosted by one of these companies, just follow the link to report it. Odds are the company you need is somewhere on the list.
U.S. Copyright Office DMCA List – Similar to the list on my site, the United States Copyright Office maintains a list of DMCA contact information for various hosts. Though their list has many more companies, many major hosts have not registered and others have let their information fall out of date. However, it remains an excellent backup. This site requires Acrobat Reader or another PDF viewer to use.
Signature Extension – Instead of copying and pasting the template in every time it is needed, it is much easier to use the Signature Firefox extension and drop it in. Works great with shorter blocks of text and any template you might have use for. Functions well with Webmail systems as well as online reporting systems such as what is found at LiveJournal.
Finally, in the event of a dispute regarding the ownership of the work, it may be important to have some evidence that the work is truly yours. With that in mind, there are some great services to help you verify the creation of your work.
Numly – Numly’s ESN system enables users to register their content, which is then fingerprinted and timestamped, and receive a special number that can be used to retrieve all of the pertinent information on it. Free accounts offer three ESNs per month. A WordPress plugin is available.
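To make the idea of a fingerprint plus a timestamp concrete, here is a rough sketch of the kind of record such a service might keep. It is not Numly’s actual method or API; the function name and the record layout are assumptions made for illustration. The fingerprint here is a SHA-256 hash of the text, and the timestamp is the current time in UTC unless one is supplied.

```python
import hashlib
from datetime import datetime, timezone


def registration_record(content, when=None):
    """Build a hypothetical registration record for a piece of text."""
    if when is None:
        when = datetime.now(timezone.utc)
    return {
        # Any change to the text, however small, produces a different hash.
        "sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        # ISO 8601 timestamp, e.g. 2006-01-02T03:04:05+00:00
        "timestamp": when.isoformat(),
    }


if __name__ == "__main__":
    print(registration_record("The full text of my article."))
```

A record like this carries little weight if it sits only on your own machine, since you could have created it at any time. The value of a registration service is that an independent third party holds the record.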
Registered Commons – From the Creative Commons team comes Registered Commons. Like Numly, RC lets you register your work, receive a certificate and an identification number and gives you a timestamp plus a fingerprint of the work. Both Numly and RC allow you to embed Creative Commons Licenses into your work. RC is completely free to use.
Archive.org – The Web Archive, which famously indexes and preserves old versions of Web pages, makes it possible to backtrack and see roughly how long a page was up. Though not as exact as an ESN or a Registered Commons registration, it can be useful in cases where the work was not registered and only a rough answer is needed.
Furl – Though not a non-repudiation service, Furl can be useful in preserving evidence against a plagiarist. A social bookmarking site, Furl also saves a cached copy of every page saved to it. This can be very useful if the plagiarist changes the page or removes the content. It is also valuable for your own records to have a file of what you did and why, just in case the issue comes up again later.
While technology has been kind to the plagiarist, it has been at least as kind to the author. For the first time in history, an individual, without any great expense, can reach a worldwide audience and get his message out in numbers never before dreamed of.
Yes, with it comes a risk of plagiarism and content theft, but solutions are being created to mitigate that risk and streamline the process of protecting content and securing authors’ rights.
It is and will continue to be a bumpy road, but if one knows how to navigate it, the ride can be more than tolerable.