How to Help Immunize Your Site Against Scraping

View Comments (7)

Ryan D. says:

March 31, 2008 at 1:14 pm

I’m dealing with this now. About two or three “people” a week scrape my site, and I’m trying to figure out a way to prevent it, or at least stop them from hammering my server. The issue is that these people write scripts that request every page as fast as they can, which puts a huge load on the server. Last week I banned an IP that was hitting my site once a second for hours on end. I’ve resorted to scanning the logs (via script) and banning IPs, but I need to come up with something easier. I have yet to see any of my content (images) on any other sites, but what else would these people be using it for?
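
For readers in the same position, the log-scanning approach described in this comment might look roughly like the sketch below. It is not Ryan's actual script. It assumes the server writes Apache-style Common or Combined Log Format lines in chronological order, and the function name `find_heavy_hitters` and the default thresholds are illustrative choices, not anything from the original post.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Matches the start of a Common/Combined Log Format line (assumed format):
# 203.0.113.5 - - [31/Mar/2008:13:14:00 -0500] "GET /page HTTP/1.1" 200 1234
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def find_heavy_hitters(lines, max_requests=60, window=timedelta(minutes=1)):
    """Return the set of IPs that made more than max_requests within any window.

    Hypothetical helper: the thresholds are guesses and should be tuned
    against real traffic before anything is banned.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, raw_time = match.groups()
        try:
            stamp = datetime.strptime(raw_time, TIME_FORMAT)
        except ValueError:
            continue  # skip lines with an unparseable timestamp
        timestamps = recent[ip]
        timestamps.append(stamp)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and stamp - timestamps[0] >= window:
            timestamps.popleft()
        if len(timestamps) > max_requests:
            flagged.add(ip)
    return flagged
```

Passing an open log file (for example `open("access.log")`, where the path is a placeholder) yields a set of addresses to review. Checking that list by hand before banning is a good idea, since search engine crawlers and shared proxies can also exceed a simple rate threshold.
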
infmom says:

March 31, 2008 at 4:36 pm

I found a really good WordPress plugin called AntiLeech (recommended by Lorelle here). It seems to do the job nicely and I’ve already got a fairly long list of URLs on its no-no list.
Jonathan Bailey says:

March 31, 2008 at 8:40 pm

Ryan: AntiLeech, as recommended in the second comment, is a good start. But also look into a plugin called Copyfeed, which makes locating and banning IPs much simpler.

Finally, if the sites are too annoying, file DMCA notices or abuse complaints with their hosts and get those sites pulled down.

If you need any specific help, email me at jonathan at plagiarismtoday dot com and I’ll do what I can!

Infmom: Agreed, Antileech is a great place to start. It’s a plugin well worth looking up.

Be sure to check out Copyfeed too as it is a huge help as well.

Thanks for the feedback!
DevTopics says:

April 1, 2008 at 8:30 am

What do you do if original content from your website or blog is stolen and republished in full on another site? You fight back!

A splog or “spam blog” is a blog that steals content from other web sites, then aggregates and republishes the content on its own blog. Splogs are created primarily to make money from ads shown on the splog and/or promote affiliated web sites. Splog owners are too dishonest, lazy or stupid to create their own original content and instead thieve yours.

Splogs are harmful because they effectively steal a portion of your blog’s search engine ranking, traffic and ad revenue.

When someone steals your original content, the best recourse is to file a DMCA complaint.

http://www.devtopics.com/how-to-file-a-dmca-complaint/
martin says:

April 3, 2008 at 5:53 pm

Natural SEO is helpful against scraping sites. Keywords can help you optimize your online advertising strategies by improving your search engine rankings for the terms that will help you most.
victor louis says:

April 6, 2008 at 1:36 am

Anti-spam webinar: “Spammers vs. Today’s Spam Filters”

Today’s spam filters are not accurate, and spam volumes are increasing rapidly. This will cost $42 billion in the US alone. Spammers are using more innovative technology to send spam, and today’s spam filters block only 80% of it.

Register for a complimentary webinar conducted by Abaca and Ferris Research to learn more about the spammers behind the black market. To register, please click the link below:
http://www.surveymonkey.com/s.aspx?sm=LPFKkdkFwOYltiQZtM_2bttw_3d_3d
Michael says:

April 6, 2008 at 2:07 pm

Another great way of stopping scraping is to report the spam blog to AdSense. This cuts off their revenue stream and hurts them a lot more than any of these other options. If they aren’t making any money from scraping, why would they do it?

Don’t Be Young

Build Incoming Links

Cross-link Your Posts

Add RSS Footers or Headers

Claim Your Site

Report Spam

Provide Content Outside the Feed

Conclusions