Sploggers Get Craftier. Or Should I Say “Sploggers son cada vez más complicado!”
So, as a disclaimer, I don’t actually know if that Spanish translation is accurate for “Sploggers are getting trickier”, because I used Google’s Translation service at translate.google.com to illustrate a point.
In my ongoing fascination with sploggers, I’ve found out that there’s a new kind of “autoblogging” software that has been sending out trackbacks. I presume you know the usual kind of thing I’m talking about: the offenders are blogs on .info domains that scrape your posts and then reproduce the first few paragraphs, ending with an ellipsis and “you should read more over here”, or “<insert author name> has the details”, or some such dreck.
Well, in examining the latest splogging garbage to cross my desk, I have found that some new autoblogging software is doing something pretty sneaky to get past *your* defenses.
I know that when I look back at such blogs to verify that they’re in fact auto-generated (to create AdSense income), so that I can add their IP and domain to my blacklist, I usually check “by hand” to see that they’ve scraped a post.
Well, much to my surprise I found that there were some *very* interesting posts that were “tracking back” to the BlogHerald that had very familiar posts — but not quite identical or literal ripoffs of our content.
But it was really close.
And then I had another look at things sideways and realized that the grammar was bad. Really bad. Almost comically so. Almost as if someone had run it through a translator *twice*. Once out of English, and again into English. Or maybe *four* times, even, as I can’t quite replicate it.
Consider the following passage:
In the time I’ve been blogging personally in the new media side of the blogosphere, there have been some unwritten rules that I’ve taken notice of that some bloggers seem to follow religiously.
And then consider the mimicked one:
In the instance I’ve been blogging personally in the newborn media lateral of the blogosphere, there hit been whatever spoken rules that I’ve condemned attending of that whatever bloggers seem to study religiously.
… yes, “WTF” indeed.
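For the curious, the two passages can be lined up word for word. Here's a quick Python sketch of that comparison. The function name `changed_words` is my own invention for illustration, and it assumes the simplest possible approach: split on whitespace and compare words by position.

```python
def changed_words(original, mimic):
    """Pair up words by position and return the pairs that differ."""
    a, b = original.split(), mimic.split()
    # zip() stops at the shorter passage, so this comparison only makes
    # sense when the rewrite kept the word count and word order intact.
    return [(x, y) for x, y in zip(a, b) if x != y]


ORIGINAL = (
    "In the time I've been blogging personally in the new media side of "
    "the blogosphere, there have been some unwritten rules that I've taken "
    "notice of that some bloggers seem to follow religiously."
)
MIMIC = (
    "In the instance I've been blogging personally in the newborn media "
    "lateral of the blogosphere, there hit been whatever spoken rules that "
    "I've condemned attending of that whatever bloggers seem to study "
    "religiously."
)

if __name__ == "__main__":
    for before, after in changed_words(ORIGINAL, MIMIC):
        print(f"{before} -> {after}")
```

Both passages run to 33 words, and 10 of them differ (“time” became “instance”, “new” became “newborn”, and so on), with the word order untouched.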
Now, in a cursory Google check for autoblogging software that double-translates, I’ve found nothing yet. But auto-translating software certainly *does* exist for sploggers to plug into their autoblogging software, with the intention of grabbing more attention from Google: differently translated auto-scraped content translates into more unique content, which translates into more pages, which translates, perhaps, into more income.
The other possibility is auto-synonymizer software, which already exists and automatically substitutes synonyms for given words in scraped content. One kind is advertised over here, costs $45, and was released near the end of October.
Since I can’t actually find software to doubly-translate scraped text, it may be that it’s synonymizer software that’s doing it, although the grammar doesn’t quite fit.
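To make the synonymizer idea concrete, here is a toy Python sketch of word-for-word synonym substitution. It is only a guess at the general technique, not how the advertised product actually works. The `SWAPS` table and the `synonymize` function are hypothetical names, and the table is filled in by hand from the word pairs in the example passages above.

```python
# Hypothetical hand-made swap table, copied from the example passages.
# A real synonymizer would presumably use a much larger thesaurus.
SWAPS = {
    "time": "instance",
    "new": "newborn",
    "side": "lateral",
    "have": "hit",
    "some": "whatever",
    "unwritten": "spoken",
    "taken": "condemned",
    "notice": "attending",
    "follow": "study",
}


def synonymize(text):
    """Swap each whitespace-separated word that has an entry in SWAPS."""
    return " ".join(SWAPS.get(word, word) for word in text.split())


if __name__ == "__main__":
    print(synonymize("some bloggers seem to follow religiously."))
    # prints: whatever bloggers seem to study religiously.
```

A blind swap like this pays no attention to part of speech or context.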
Have you noticed this kind of trackback spam? If you have, leave a message and let’s have a vote: synonymized or translated?
Tony Hung is the editor of the BlogHerald. He is also a physician finishing his last year of residency in General Internal Medicine, and blogs at Deep Jive Interests, where he rants, occasionally, on new media topics.
I keep getting the “John Doe has written a great article. [Excerpt from my article.] Read more here…” kind.
I do see some Spanish versions too. I don’t think I’ve seen any synonymized versions yet. But, I might be just overlooking it.
I don’t know how widespread things are yet, Justin. But if you see something be sure to let us know. ;)
On my latest post I have the words Ryanair and credit card in the title. It was instantly scraped by two sites: travel and financial. As you describe, they repeat the first few lines, then link to the main content.
You know, it’s such a pain chasing after these people, I just let it go. It could be worse. At least they’re linking back to my blog. Others simply lift it and call it their own.
The Spanish actually says “Sploggers are more complicated every single time” which is a nice illustration of your story.
I’ve received a lot of trackback spam from .info but also from .cn domains which I find rather ironic. Only spammers can reach out from China’s Great (fire)Wall?
I recently received a Russian trackback, and not even Google’s translator or Babelfish could help me. I have no idea in what context my post was mentioned. I took a look at the design, the links and other features of the site and decided it was legitimate. But I’ve no idea actually.
I haven’t noticed synonymized or translated trackback spam yet but maybe because I haven’t noticed it was there.
I’ve seen a few like Justin mentioned, but the latest one, which I’m getting a ton of, is caught by Akismet for moderation but not auto-marked as spam. It just has “news…” in the body, plus a name (usually a drug name) and a URL in the web page entry, but no email. I know I require emails to make comments. They are pretty much all on the same post too.
I tend to agree with your double translation theory. I posted an example recently. Quite bizarre.
Tony, you should add a ‘nofollow’ tag to the auto-synonymizer link.
Noticed it yesterday. I’m not even going to post the exact link of the offending site, but it’s “research hyphen hub dot info”. The stuff on that site appears high in Google search too.