Negotiating with the RSS content scrapers

Filed as News on January 15, 2007 9:51 pm

Last week I was browsing Technorati to see which blogs had recently linked to one of my own, when I saw something familiar: a copy of one of my articles on a different web site.

Sometimes it’s just a long quote, used either for another site’s commentary, or to link back, and I’ve no problem with this usage.

The main issue is with sites that scrape content and publish it on their own AdSense-riddled layouts in the hope of making some cash from other people’s work.

Though I abhor this type of site, I’ll often let them pass by because it’s not worth my time reporting them or searching around for a contact. If they’ve lifted an item of news I’ve written based on a press release, then I don’t feel quite so hard done by. The article in question, however, was a verbatim copy of an opinion piece I had spent a considerable amount of time working on. I felt aggrieved. The only reason any links pointed back to my site at all was that I had put internal links within the article itself.

Unusually, I found a contact form for the site (I won’t name and shame them here) and sent a polite but firm explanation stating that they had lifted several articles (I searched further) from my site – as well as from a number of others – and that I’d appreciate the content being removed.

I didn’t expect to get a reply, but today an email arrived which (paraphrased) said:

I didn’t copy and paste your content on my site. I just used your RSS feed from your site (and many other sites) and my script imported it to my site. Our blogger writes content about gadgets from all over the world, and you can find it in another section.

Anyway, I’m sorry if you don’t agree that I used your RSS, and I have two options to offer you.

1. I use your RSS feed in my site and at the end of it link to your site for the complete story. In this way you’ll get more visitors to your site.
2. I remove your RSS from my system.

Any way, I think if you choose option 1 I’ll send you more visitors and you can send me visitors.

So, what do we have here?

  1. Someone who believes (as I’m sure many others do), and blatantly acknowledges, that it’s OK to do whatever they want with the contents of an RSS feed. In their view, the very fact that I chose to publish a feed gives them that right.
  2. Someone who thinks that, after I’ve discovered their splog, I might still consider “option 1” and link back to their site.

Would it have been a big deal to link and be linked? Well, quite apart from my own ‘ethical’ viewpoint, I don’t really want to test out Google’s ‘bad neighbourhood’ penalty, its duplicate content penalty, or any other problems that might arise from being associated with dubious sites.

The dilemma is how much time, in a busy blogging schedule, to spend on tracking down and attempting to eradicate these parasites.


  1. By Rhys posted on January 15, 2007 at 10:10 pm

    Why don’t you just add a signature to the bottom of each post with your name and URL. That way you get the credit and, maybe, a little traffic.

  2. By Darnell Clayton posted on January 15, 2007 at 10:12 pm

    These types of people really tick me off!

    After discovering one site from [a certain domain] hosting blogs stealing my content (which I had shut down), I discovered dozens more!

    You’re right about one thing…fighting these types of blogs is time-consuming and irritating as well.

    I have better things to do in life than to send “cease and desist” orders to spam-infested blog host sites.

  3. By Tony posted on January 15, 2007 at 10:49 pm

    One of the biggest reasons to publish only partial feeds is to get around this annoyance. It’s a bit of a debate, but if you do, you’ll guarantee that this won’t happen – at least to the extent that it does.

    It’s funny how sploggers rationalize it by saying that it “sends” “traffic” to your blog – which is patently ridiculous.

    Also, feel free to publish their name if you want … it’s time we put a spotlight on sploggers, many of whom seem not to have an ethical bone in their bodies.

    Cheers
    Tony.

  4. By alan herrell - the head lemur posted on January 15, 2007 at 11:26 pm

  5. By Darren posted on January 16, 2007 at 12:15 am

    I feel your pain, Andy – I find one of these every second day and am at a similar point of trying to work out whether it’s worth my time dealing with it any more.

    My main problem with it previously was duplicate content, but Google recently came out with info on duplicate content that seemed to indicate this wouldn’t be too big an issue in this situation. For me, however, it’s also more of an ethical thing: I put many hours of work into my content, and to have every post I write republished word for word, without them making it more useful in any way, just feels wrong to me.

    So I fight each one and hope that someone will find a way to stop it or at least automate how we track them down and stop it happening….

  6. By Amit Agarwal posted on January 16, 2007 at 6:40 am

    Just send a very short email saying that if the content is not removed in the next 24 hours, you are contacting Adsense and their web host.

    You will meet with success in 99% of the cases.

  7. By Andy Merrett posted on January 16, 2007 at 7:28 am

    The ‘original content’ that this site talks of consists of rewritten articles from other tech sites, which is not a huge problem, and it does give attribution. However, ALL of the articles that are scraped from the site I write for are just pulled verbatim.

    Paranoid? Moi?

    I won’t name and shame them… this time…

  8. By Markku Seguerra posted on January 16, 2007 at 8:25 am

    If it’s blogger/blogspot, you can always use the “mark as spam” button, though I don’t know how fast they address that. Sometimes, it really is a waste of time going through the process. Would you rather be writing new content than catching scrapers? I’m not sure myself.

  9. By Andy Merrett posted on January 16, 2007 at 11:30 am

    It wasn’t, unfortunately. It was a blog with a domain name remarkably similar to one of the BIG tech sites.

  10. By Duncan posted on January 16, 2007 at 11:19 pm

    I love their response: I’ll take it down if you ask but if I keep it up you’ll get more traffic. This is the *EXACT* same argument Robert Scoble is using with his link blog that republishes full posts.

  11. By Ajay posted on January 17, 2007 at 9:35 am

    Andy, I use Antileech for my WordPress feeds. I also use Angsuman’s feed copyrighter to embed a copyright notice below my feeds. So if my feed is scraped, the splog gets a blaring copyright notice and I get a linkback, so I can easily track it.

    And I suggest you directly report them to Adsense and the webhost with the DMCA note. He may take off your feed but there are other webmasters out there who will be suffering right now. We need to work together against splogs.

  12. By Ajay posted on January 17, 2007 at 9:37 am

    @Duncan, Scoble’s got a link blog that republishes full posts? As in, is it a scraper?

  13. By Andy Merrett posted on January 17, 2007 at 11:09 am

    You’re right Ajay. I’ll look into it. The site in question is still taking content from our site.

  14. Deep Jive Interests » Recommended Reads for January 17, 2007 (January 17, 2007 at 2:33 pm)
  15. Socially Driven Today, 01-17-2006 (January 18, 2007 at 5:01 am)
  16. Andy Merrett» Blog Archive » iGizmodo (The Gadget World) shamelessly steals other tech sites content (February 14, 2007 at 10:22 am)
  17. The iGizmodo splogger rant at The Blog Herald (March 28, 2007 at 4:56 pm)
  18. Why You Should Use Full Feeds | Internet Marketing Blog (April 23, 2007 at 4:07 pm)
  19. Article Database » Blog Archive » Why You Should Use Full Feeds (February 13, 2008 at 9:39 am)
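
Two of the suggestions in the comments, adding a signature with your name and URL to every feed item and publishing only partial feeds, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Antileech or the feed copyrighter plugin actually work: the function name `add_attribution`, the author name and the example.com URL are all hypothetical, and it assumes a simple RSS 2.0 feed whose item descriptions are plain text.

```python
import html
import xml.etree.ElementTree as ET
from typing import Optional


def add_attribution(rss_xml: str, author: str,
                    max_chars: Optional[int] = None) -> str:
    """Return a copy of an RSS 2.0 feed with a footer on every item.

    Hypothetical helper for illustration only. If max_chars is given,
    each description is cut to that many characters first, which makes
    the feed a partial (excerpt-only) feed. The cut is a naive character
    slice, so it assumes descriptions are plain text, not HTML.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        desc = item.find("description")
        if desc is None:
            desc = ET.SubElement(item, "description")
        body = desc.text or ""
        if max_chars is not None and len(body) > max_chars:
            body = body[:max_chars].rstrip() + "..."
        # The footer travels with the item, so a scraper that republishes
        # the feed verbatim also republishes the credit and the link back.
        footer = (
            f"<p>Originally published by {html.escape(author)} at "
            f'<a href="{html.escape(link)}">{html.escape(link)}</a>.</p>'
        )
        # ElementTree escapes the markup when serialising, which is the
        # usual way HTML is carried inside an RSS description.
        desc.text = body + footer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<rss version='2.0'><channel><title>Example blog</title>"
        "<item><title>A post</title>"
        "<link>http://example.com/a-post</link>"
        "<description>Some long opinion piece text</description>"
        "</item></channel></rss>"
    )
    print(add_attribution(sample, "Jane Doe", max_chars=9))
```

A footer like this does not stop scraping. It only means that a verbatim copy carries a visible credit and a link back to the original, while the `max_chars` cut limits how much of each post a scraper can lift from the feed.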