April 1, 2008
In Google! Clean Up Blogger! Now!, I wrote about how a simple search for a news story turned into a massive multi-page hunt through Google search results full of Blogger Blogspot spam blogs. I finally found a possibly legitimate blog answer on something like page five, but had to go through even more pages to find a second and third possible answer to my search question. I plowed through page after page of spam blogs and splogs, many containing copyright-violating content and the output of spinning spammers, although most of it was a totally unintelligible collection of keywords.
In the article, I wrote an open letter to Google asking them to clean up Blogspot by removing the tons of spam blogs that litter its surface and abuse our content, and to help us help them by making the process easier:
There has to be a nice way of doing this. Sure, there is always room for abuse, but let us help you. The good white hat wearing web users represent the majority and we are tired of this. We want Google cleaned up. We think starting with cleaning up Blogspot/Blogger is a good place to begin.
We, the bloggers of the world, really like you, Google. We put your ads, search, maps, news, and gadgets on our blogs. We write our post content to meet your needs so you will like us. We design our web designs not just with web standards but Google standards in mind. Our lust for all things Google puts billions in your pockets. We live and breathe through Google, so let us help you help us.
I’m not the only one to complain. In fact, bloggers around the world have been complaining for years about the vast quantity of splogs on Blogspot.
Tags: autoblogging software, Blogging, copyright, Social Media, Spam, Splogs
November 12, 2007
Last week, Tony posted an article about a somewhat different kind of spam blogger.
The spammer had taken an article from this site, scraped it and then modified it before republishing. Though the method of modification remains debatable, it is clear that it was through some automated means as the duplicate version was mangled and borderline unintelligible.
However, the unfortunate truth is that this type of scraping is not as uncommon as we might wish and the technology to do it has been around for several years. Worse still, this type of scraping is growing much more popular as search engines clamp down on duplicate content and ad networks get better at detecting traditional content theft.
Modified scraping is a rising threat that bloggers need to be aware of as it presents a whole new set of challenges for content creators.
Tags: autoblogging software, copyright, Legal
November 7, 2007
So, as a disclaimer, I don’t actually know if that Spanish translation is accurate for “Sploggers are getting trickier”, because I used Google’s Translation service at translate.google.com to illustrate a point.
In my ongoing fascination with sploggers, I’ve found out that there’s a new kind of “autoblogging” software that has been sending out trackbacks. But I presume you know the usual kind of thing I’m talking about: the offenders are blogs that end in .info, scrape your posts and then reproduce the first few paragraphs that end with an ellipsis and “you should read more over here”, or “<insert author name> has the details”, or some such dreck.
Well, in examining the latest splogging garbage to cross my desk, I have found that some new autoblogging software is doing something pretty sneaky to get past *your* defenses.
I know that when I look back at such blogs to verify that they’re in fact auto-generated (to create AdSense income), so that I can add their IP and domain to my blacklist, I usually check “by hand” to see that they’ve scraped a post.
Well, much to my surprise I found that there were some *very* interesting posts that were “tracking back” to the BlogHerald that had very familiar posts — but not quite identical or literal ripoffs of our content.
Tags: autoblogging software, Splogs