Plagiarism-Fighting Network Tools: Part Three

Filed as Features, Guides on August 6, 2007 11:15 am

In the first part of this series, we took a look at the basic networking tools needed to track down a plagiarist or a scraper site. In the second part, we looked at the tools needed to definitively locate the host of that site and determine who to contact about the theft.

Often, however, it is not just your work that is at risk. Many spammers create entire networks of sites, sometimes numbering into the hundreds of thousands, all of which scrape and spam content in hopes of making a quick buck.

Fortunately, there are tools that make it possible to peer into the world of spammers and at least get some idea of how their operations work. They also enable you to backtrack through those operations and, possibly, shut down all, or at least a large portion, of their sites.

Doing so just requires a little bit of sleuthing.

Reverse IP

On the Internet, every machine, be it a home computer, server or your Wii, has an IP address. However, not every IP address is connected to one machine. Similarly, every domain name has an IP address, but not every IP connects to just one domain.

Many spammers, in order to save money, will host many domains on one server and one IP address. A reverse IP check simply takes a look at an IP address and determines what domains are hosted there. If the domain has what is known as a dedicated IP address, then only one site will show up. If many sites are sharing the same IP, thousands might.
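The idea of many domains sharing one IP address can be sketched in a few lines of Python. Note that this is not a true reverse IP lookup, which requires a database like the one Domain Tools maintains. It only works in the forward direction: given a list of domains you already suspect, it groups them by the address they resolve to. The function name group_domains_by_ip and its resolve parameter are made up for this illustration.

```python
import socket
from collections import defaultdict


def group_domains_by_ip(domains, resolve=socket.gethostbyname):
    """Map each IP address to the sorted list of domains that resolve to it.

    `resolve` is an illustrative hook: by default it performs a real DNS
    lookup, but any function taking a hostname and returning an IP works.
    """
    groups = defaultdict(list)
    for domain in domains:
        try:
            ip = resolve(domain)
        except OSError:
            # The domain did not resolve; skip it rather than abort the run.
            continue
        groups[ip].append(domain)
    return {ip: sorted(names) for ip, names in groups.items()}
```

If several of your suspect domains land under the same address, that is a hint they may share a server, though on shared hosting it is not proof of common ownership.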

As with the previous posts on this topic, we will be using Domain Tools as our primary site for these services. However, unlike the previous tools, this one is not entirely free to use. Before you can use the Reverse IP tool, you either need to sign up for a free trial at Domain Tools or pay a small fee for a one-time report.

Fortunately, Domain Tools does offer a preview of the results to let you see if the test is even necessary. If there is only one other domain or the other sites in the preview appear to be legitimate, meaning that the spammer is not the only one on the server, then the tool is not worthwhile. However, if the preview spits back other scraper sites and junk domains, it might be worth the small expense.

To see that preview, simply visit the whois screen we discussed in the previous articles and click the yellow “R” button next to the IP address. The screen it takes you to will tell you the number of domains hosted at that IP address and list the first four. You can check those domains to see if anything seems amiss.

As we discussed above, it is a common tactic among spammers to purchase a dedicated server with its own IP address and host thousands of domains on that server. If that is the case, this tool can expose the network, thus enabling you to present the host with a more thorough abuse report.

I hesitate to even mention a paid tool, and this one should be used only on the rarest of occasions. Since most spammers propagate their networks across free services, such as Blogspot, this tool will not be useful in most cases of spamming. However, the cases where it is useful more than justify the small expense or the hassle of setting up a trial account.

It can, in one fell swoop, expose and close an entire spam blog operation. It is a powerful tool that can help Web hosts close down thousands of infringing domains. At this time, there is no tool like the Domain Tools Reverse IP lookup, and its usefulness in these rare cases is undeniable.

Backlink Check

However, an even more common tactic among spammers is crosslinking among their networks and sites. Since much of search engine ranking is determined by how many sites link to you, by linking within their own network, they can artificially inflate their link count, giving the appearance that their sites are more popular than they are.

Fortunately, these links are easy to follow and, since it is mostly spammers who link to a spam site, they make it easy to determine which other sites are part of the spammer’s network.

Even better, a backlink check does not require a trip to Domain Tools or another networking site. Rather, we can use Google to do the check for us.

To perform the check, simply do a search using the following syntax: “link:domain.com”. By putting the word “link” with a colon before the domain name, you are telling Google to provide you with a list of all pages that link to that address. For example, “link:blogherald.com” will produce a list of sites and pages that link to this site. This method also works with specific pages on a site as well as with subdomains.
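If you have many domains to check, the search address can be built for you. The snippet below is a minimal sketch that simply puts the “link:” query into a Google search URL; the function name backlink_query_url is made up for this example, and it assumes Google still honours the “link:” operator at the time you run the search.

```python
from urllib.parse import urlencode


def backlink_query_url(target):
    """Build a Google search URL for the query "link:<target>".

    `target` can be a bare domain, a subdomain or a full page address.
    """
    query = "link:" + target.strip()
    # urlencode percent-encodes the colon and any other unsafe characters.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(backlink_query_url("blogherald.com"))
# prints https://www.google.com/search?q=link%3Ablogherald.com
```

Paste the printed address into your browser to see the results.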

You can use this tool in one of two ways. First, you can simply do a backlink check on the spam blog itself and see if other spam blogs link to it. Second, and perhaps more importantly, if you see that the scraper site is regularly linking to another Web page, you can put that URL into your search.
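Spotting which pages a scraper site “regularly” links to can be tedious by eye. As a rough sketch, the Python below counts how often each outside host appears in the links of a saved page. The names count_outbound_hosts and own_host are invented for this illustration, and the HTML is assumed to be something you have already downloaded or copied from the scraper site.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def count_outbound_hosts(html, own_host):
    """Count links per external host, ignoring relative and same-site links."""
    collector = _LinkCollector()
    collector.feed(html)
    counts = Counter()
    for href in collector.hrefs:
        host = urlparse(href).hostname  # None for relative links
        if host and host != own_host.lower():
            counts[host] += 1
    return counts
```

Calling counts.most_common(5) on the result lists the five most frequently linked hosts, which are good candidates for a backlink search of their own.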

Either way, if your plagiarist engages in linking between his/her sites, you can track them down and report more of the network for abuse. This method is very similar to what Microsoft proposed for dealing with search engine spam in their Strider Search Defender program and is easily one of the most effective methods for discovering a spam blog network.

Though this system is not perfect in that some plagiarists keep their sites separate and don’t cross-link, many do and this system can expose a large portion of such a spammer’s network almost instantly. This makes it possible to report these abuses both to the host and, if appropriate, to the search engines.
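Following crosslinks from one site to the next is, in effect, a walk across a graph of sites. The sketch below shows that walk in Python under some loud assumptions: expand_network and get_linked_sites are hypothetical names, and get_linked_sites stands in for whatever method you use to list the sites connected to a given one (a backlink search, or reading the links off its pages by hand).

```python
from collections import deque


def expand_network(seed, get_linked_sites, max_sites=100):
    """Breadth-first walk over crosslinks, starting from one known spam site.

    `get_linked_sites` is a caller-supplied function (an assumption of this
    sketch) that returns the sites connected to the one it is given.
    `max_sites` caps the walk so it cannot run away on a large network.
    """
    seen = {seed}
    queue = deque([seed])
    while queue and len(seen) < max_sites:
        site = queue.popleft()
        for neighbor in get_linked_sites(site):
            if neighbor not in seen and len(seen) < max_sites:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen
```

The set it returns is only a list of candidates. Each site should still be checked by hand before it goes into an abuse report, since a spammer may also link to legitimate sites.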

All in all, it is a very simple tool with many powerful uses. If nothing else, it can be used to see who is linking to your site and to measure your own effectiveness on the Web.

Conclusions

Through the course of this series we have looked at some of the networking tools that bloggers and Webmasters need in order to track down plagiarists and stop misuse of their work. Though not a complete or definitive list, the tools mentioned in this series should be adequate to resolve well over 99% of all plagiarism cases.

Also, through this series, we have discussed several difficult and potentially confusing networking concepts. Though unpleasant, it is important to understand how these elements work. Simply put, posting one’s work to the Web without fundamental knowledge of how the Web works is very dangerous. Rest assured that the spammers and scrapers know these tools well and it is important we do too.

This is not a matter of being a technophile or a geek; it is a matter of self-defense. Knowing and understanding these tools is no different than knowing how to escape a dangerous situation on the street: unpleasant to learn, but important to know.

If you learn one thing from this series, learn that these tools are available and that you have them at your disposal. The important thing is to not feel helpless when confronted with plagiarism and/or scraping. You can always learn how to use the tools when needed, but you can’t do anything if you give up before you start.

The important thing is to remember to keep fighting and that the weapons are ready for you when you are ready for them.
