Scraping is one of the most annoying things that bloggers have to deal with. It can hurt their search engine ranking, cause confusion among readers and cause them to unwittingly help spammers line their pockets.
Nobody likes being scraped but it seems that some sites are able to survive it relatively unscathed while others are bumped clean out of the search engines, almost instantly replaced by the spammers that take their content.
So how do you ensure that the damage caused by scrapers is kept to an absolute minimum? There is no secret formula, but there are a few tricks that seem to work very well.
Don’t Be Young
The longer your site has been around, the stronger its natural protection, both in the search engines and in the minds of users. Though everyone has to start out new and wade through a period of uncertainty, it is another good reason not to change your brand or move to a new domain without weighing the move carefully.
Build Incoming Links
Building links is an established part of any good SEO practice but it is especially important here. Spammers often have their own link-building system built into their networks and frequently have a decent amount of inbound links before they touch your entries. Building your own inbound links ensures that they are not able to replace you easily.
However, it is important not to be spammy yourself with your inbound links. Don’t simply engage in link exchanges or purchase links. Search engines can often detect those and may penalize you, causing you to lose ground rather than gain it.
Cross-link Your Posts
When writing about something, if you’ve touched on a related topic before, link to it. Keep the linking natural, but try to link to at least a few of your own posts within each entry. When spammers scrape the feed, so long as they don’t strip out the HTML, they will take those links too, and the copies will point back to your site.
Search engines use these kinds of clues to determine who the original site is.
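One practical wrinkle: a link written as a relative path only points back to your site while the post lives on your domain. As a minimal sketch, assuming your posts are available as HTML strings and using a hypothetical site address (SITE_URL) and helper name (absolutize_links), the following Python rewrites the links in a post to full URLs before it goes into the feed:

```python
import re
from urllib.parse import urljoin

# Hypothetical site address, used only for illustration.
SITE_URL = "https://example.com/blog/"

# Matches the href attribute of an <a> tag, with either quote style.
HREF_RE = re.compile(r'(<a\b[^>]*?\bhref=)(["\'])(.*?)\2', re.IGNORECASE)


def absolutize_links(html, base_url=SITE_URL):
    """Rewrite relative links in a post so they still point to the
    original site if the HTML is copied somewhere else."""

    def repl(match):
        prefix, quote, target = match.groups()
        # urljoin leaves links that are already absolute untouched.
        return f"{prefix}{quote}{urljoin(base_url, target)}{quote}"

    return HREF_RE.sub(repl, html)


post = 'As I wrote <a href="/2007/01/old-post/">earlier</a>, links matter.'
print(absolutize_links(post))
```

A regular expression is enough for a sketch like this; a real blog platform may already do this for you, or you may prefer a proper HTML parser for messy markup.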
Add RSS Footers or Headers
Adding footers or headers to your RSS feed may not be a perfect solution, especially as spammers focus on narrower and narrower swaths of content, but it is a great way to reduce the impact of complete RSS scraping and protect full feeds.
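To show what such a footer amounts to, here is a rough Python sketch, assuming a plain RSS 2.0 feed in which each item has title, link and description elements. The function name (add_feed_footers) and the footer wording are made up for illustration; most blogging platforms have plugins that do this job for you.

```python
import xml.etree.ElementTree as ET
from html import escape


def add_feed_footers(rss_xml, site_name):
    """Append an attribution footer, with a link back to the original
    post, to the description of every item in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        title = item.findtext("title", default="")
        desc = item.find("description")
        if desc is None or not link:
            continue  # nothing to attach a footer to
        footer = (
            f'<p>This post, <a href="{escape(link)}">{escape(title)}</a>, '
            f"originally appeared on {escape(site_name)}.</p>"
        )
        # ElementTree escapes the HTML again when the feed is serialized.
        desc.text = (desc.text or "") + footer
    return ET.tostring(root, encoding="unicode")
```

If a scraper republishes the full feed without stripping the HTML, every copied entry then carries a link and a credit pointing back to the original post.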
Claim Your Site
Both Google and Technorati allow you to claim your blog on their sites. For Google, bloggers should visit Google Webmaster Tools; for Technorati, users should create a profile on the site. Doing so may not have a large impact on your site, but it makes it clear to the search engines that a human is behind it. On Technorati, it also allows you to display an icon next to your blog, clearly distinguishing it from spam for users as well.
Also, consider registering your site on MyBlogLog and similar services, even if you do not plan on participating, just to have further sites vouch for the authenticity of your blog.
Even if you don’t want to take the copyright route and get the spam blogs taken down, report any spammers misusing your content to Google. Be sure to use the form in Google Webmaster Tools since, according to Matt Cutts at Google, reports made there are given more weight.
Even if they do not remove the sites from the search engine, they are at least aware of the problem and can rank accordingly.
Provide Content Outside the Feed
Finally, provide good, useful content that exists outside the RSS feed, typically in static pages. Google loves this kind of content too and it is something that the scrapers won’t have. Inevitably, Google and other search engines will show preference to your site for the large amount of unique content, even if you are being scraped heavily.
Being scraped is never a good thing. Though some talk about making the scrapers work for you, the techniques are not fool-proof and have been known to fail. However, they are often great for mitigating the damage and are good practice for any blogger.
That being said, there are still likely going to be some cases of scraping that require a higher level of action. Inevitably, a scraper, either through luck or skill, may still be able to reach a point where they can steal some of your thunder. When that happens, it is important to be aware of the laws and techniques you can use to protect yourself.
But to those who want to avoid that as much as possible, it is a good idea to work on armoring your site against these kinds of attacks in advance.
A little prevention really can help keep the spammers at bay…