Download Squad profiles Twerp Scan, a nice little application that checks your followers to see if there are any known spammers among them. Personally, I wonder why Twitter is so quiet about the fact that there is probably a great number of spammers perusing Twitter and luring users into their scams.
Twitter, a service created by Evan Williams (who founded Blogger.com and Odeo.com, sold to Google and Sonic Mountain, respectively), has helped hundreds of thousands (if not millions) of users find out the “latest happenings” via the micro updates that the service has become famous for.
While Twitter is a useful tool to help one get out of a dangerous situation (or two), a few less than honest companies/individuals may be trying to use the service to promote their product.
Last weekend at The Next Web Conference in Amsterdam I spoke with Anton Johansson and CEO Martin Källström from the new blog search engine Twingly. They present themselves as a new spam-free blog search engine with a strong focus on the conversational nature of the blogosphere.
Lorelle VanFossen recently addressed the issue of spam in blog search engines, and keeping their index spam-free is one of the main objectives of Twingly. On top of that, they focus on conversational search in the blogosphere by partnering with traditional media. They have closed several deals with major newspapers in Europe, which provide links to the blogs that reference them. This is another step in showing the two-way links between blogs and online newspapers. Their main competitor in this area is of course Sphere, but Twingly focuses on different markets. Read all about their ideas to start another blog search engine in the following interview and grab a special Blog Herald beta invite code while you can!
In Google! Clean Up Blogger! Now!, I wrote about how a simple search for a news story turned into a massive multi-page hunt through Google search results full of Blogger Blogspot spam blogs. I finally found a possibly legitimate answer on something like page five, but had to go through even more pages to find a second and third possible answer to my search question. I plowed through page after page of spam blogs and splogs, many containing copyright-violating content and spinning spammers, although most of it was a totally unintelligible collection of keywords.
In the article, I wrote an open letter to Google asking them to clean up Blogspot by removing the tons of spam blogs that litter its surface and abuse our content, as well as help us help them by making the process easier:
There has to be a nice way of doing this. Sure, there is always room for abuse, but let us help you. The good white hat wearing web users represent the majority and we are tired of this. We want Google cleaned up. We think starting with cleaning up Blogspot/Blogger is a good place to begin.
We, the bloggers of the world, really like you Google. We put your ads, search, maps, news, and gadgets on our blogs. We write our post content to meet your needs so you will like us. We design our web designs not just with web standards but Google standards in mind. Our lust for all things Google puts billions in your pockets. We live and breathe through Google, so let us help you help us.
I’m not the only one to complain. In fact, bloggers around the world have been complaining for years about the vast quantity of splogs on Blogspot.
No one wants to attract spammers to their site. They scrape content, post junk comments and turn search engines off to your site.
Unfortunately, the bitter truth is that all blogs, regardless of age, topic and readership, will attract at least some attention from the purveyors of junk. That is a simple byproduct of having a blog and publishing an RSS feed.
However, to spammers, all blogs are not created equal and some sites are going to attract far more attention from spammers than others. But while many of the elements that will attract spammers may be unpredictable and outside of our control, others are not.
One of the biggest indicators of how much trouble a blog will have with spam is the niche that it is operating in. This is because, by and large, the niche a blog is in will determine the keywords most commonly associated with it and those keywords, in turn, determine which sites the spammers latch on to.
The question then becomes: which niches suffer the most at the hands of spammers?
The Usual Suspects
If you want to know whether your niche is a popular target for spammers, you need to look no further than the spam folder in your email inbox.
Whether or not Web spammers and email spammers are often the same, it is clear that they share many of the same targets. Keywords and topics that are popular targets for email spammers will, oftentimes, be targets for Web ones as well.
As such, blogs in known spam niches such as gambling, prescription drugs, contests, travel, adult content and financing, are going to be frequent targets for spam blogs.
Of course, the catch is that it is not necessarily a matter of your blog promoting the same products or services as spam blogs; it is a matter of it being within the same broad topic. Spam bots, much like search engines, cannot inherently tell the difference between favorable and unfavorable posts. As such, a news report about a crackdown on online gambling is just as likely to be scraped as a blog offering tips for winning at poker.
In short, if your site routinely has keywords that are familiar to email spam, odds are you’ve already seen more than your fair share of trouble from the dark side of the Web. But even if you don’t meet those criteria, there is still a good chance you could, unwittingly, be attracting the attention of spammers.
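If you want a rough way to check your own posts against the usual suspects, a small script can do it. The sketch below is only an illustration: the niche names come from the list above, but the keyword sets, the `SPAM_NICHE_KEYWORDS` table and the `spam_niche_hits` function are hypothetical and would need to be replaced with terms from your own spam folder.

```python
import re

# Niches named in the article as frequent spam targets. The keywords
# listed under each niche are illustrative assumptions, not a real
# spammer word list.
SPAM_NICHE_KEYWORDS = {
    "gambling": {"gambling", "poker", "casino"},
    "prescription drugs": {"prescription", "pharmacy"},
    "contests": {"contest", "sweepstakes"},
    "travel": {"travel", "flights", "hotel"},
    "financing": {"loan", "mortgage", "financing"},
}


def spam_niche_hits(text):
    """Return {niche: sorted matched keywords} for every niche found in text."""
    # Lowercase the text and split it into plain alphabetic words.
    words = set(re.findall(r"[a-z]+", text.lower()))
    hits = {}
    for niche, keywords in SPAM_NICHE_KEYWORDS.items():
        matched = sorted(words & keywords)
        if matched:
            hits[niche] = matched
    return hits


if __name__ == "__main__":
    # A critical news report and a tips post land in the same niche,
    # because the check looks only at keywords, not at sentiment.
    print(spam_niche_hits("Police announce crackdown on online gambling"))
    print(spam_niche_hits("Tips for winning at poker"))
```

Note that the check is deliberately blind to tone, which mirrors the point made above: a spam bot matching on keywords treats the crackdown story and the poker tips the same way.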
Of course, not all Web spam deals with the same topics as email spam. Since Web spam is driven by many different factors, it is inevitable some categories will show up on the Web that don’t in our inboxes.
One such factor is the amount of money a spammer can hope to make off of a single click. When you take a look at the most expensive AdSense keywords, you find that the list is top-heavy not with traditional spam topics, but with legal searches.
Since many spam blogs only earn a few clicks before being shut down, having a keyword that generates a decent amount of revenue is critical. As such, spammers are drawn to topics such as mesothelioma, DWI/DUI, personal injury and insurance simply because they are terms they can hope to make approximately fifty dollars a click from. Though these terms are not as heavily targeted by spammers, since they are less likely to be searched for than the traditional spam workhorses, cost definitely plays a factor.
On the flip side, search frequency also plays a role. Looking at the top search terms gives you an idea of what people are searching for and where the spammers are likely to follow. In that regard, celebrity news is a frequent topic of interest with technology and television shows also making an appearance.
Though these terms might not be as valuable per click, they can make up for that in sheer quantity. Simply put, spammers are guaranteed not just a constant stream of potential viewers, but a ready supply of sites to latch onto. This approach may be better for spam sites less focused on earning clicks on ads and more interested in using spam to pump the rankings of another site.
Still, of all the potential indicators, it appears that search volume is the least helpful. The amount of Britney Spears spam, for example, remains remarkably low for the term and seems likely to stay that way.
But like the other factors, it is worth being aware of as it can give you a clue as to the problems that may be coming down the road.
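To make the trade-off between click value and click volume concrete, here is a back-of-the-envelope sketch. The fifty-dollar figure is the approximate per-click value mentioned above; the three clicks before shutdown and the 25-cent value for a high-volume keyword are made-up numbers for illustration, and both function names are hypothetical.

```python
import math


def expected_revenue(clicks_before_shutdown, revenue_per_click):
    """Revenue a spam blog earns before it is shut down."""
    return clicks_before_shutdown * revenue_per_click


def clicks_to_match(target_revenue, revenue_per_click):
    """Smallest whole number of clicks needed to reach target_revenue."""
    return math.ceil(target_revenue / revenue_per_click)


if __name__ == "__main__":
    # Assumed: a spam blog on a legal keyword gets 3 clicks at about $50 each.
    legal = expected_revenue(3, 50.0)
    # Assumed: a celebrity keyword pays $0.25 per click.
    needed = clicks_to_match(legal, 0.25)
    print(legal, needed)
```

Under these assumed numbers, the low-value keyword needs hundreds of clicks to match what the expensive keyword earns in three, which is why a blog that is shut down quickly favors the expensive terms while volume only pays off for sites that survive long enough to collect it.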
Why It Is Bad
None of this is to say that you should change your niche simply because it is targeted by spammers, just that having a topic targeted by them can create additional problems for your site. All in all, there are at least three reasons you should take note if your site does happen to fall in a spam-friendly niche:
1. Increased Scraping: Perhaps the first repercussion of having a spam-friendly niche is that your content will be scraped much more heavily than it would otherwise. This can even be the result of just sending out one post on a targeted keyword and is only amplified the more often such posts are made.
2. Increased Comment Spam: Though comment spam is more random in nature than scraping, there is an element of it that is keyword based. Posts and sites with popular spam keywords are more frequent targets for comment spam and sites that routinely deal with such topics may want to take extra anti-spam measures. Also note, in conjunction with the increased scraping, there will also be a rise in the amount of trackback/pingback spam.
3. Increased Confusion: If your site is in a spammy niche and users are likely to have seen many spam blogs in that area, you are going to have to work harder to ensure that users realize your blog is genuine. Likewise, there is an increase in the likelihood that search engines will confuse your product with spam or that your site will be dealing with strong search engine competition from its spam counterparts. All in all, setting your site apart from the spammers will be a much greater challenge.
The good news is that, with work and awareness, most of the problems that come from being in a spam-friendly zone can be overcome. By using known anti-scraping tools, taking anti-comment spam measures and clearly distinguishing yourself from the spammers, it is possible to thrive in these niches, as many blogs do.
It is far more important to write what you know and what you love than it is to avoid being in a spam-friendly niche. Spam attacks can be overcome, but there is no overcoming a lack of ambition or love for one’s topic.
But it is still important to be aware of whether your selected niche is a likely target for spammers. Doing so gives you the chance to take counter-measures and prevent the spammers from latching on too deeply. It also gives you the chance to proactively search for and protect your content, block comment spam and work to separate yourself from the junk.
In short, being aware of the spamminess of your niche is the first, and most important, step in overcoming the drawbacks it brings. Fortunately, that is easy information to obtain.
But for those of us involved in content theft and spamming issues, 2007 was something of a bittersweet year. A lot of progress was made in the fight against spam, but a great deal went wrong. It seemed that, for every victory, there were at least two setbacks.
Sadly, it seems that we can expect a very similar year in 2008. However, there are new tools and new possibilities that, with a little bit of luck, might make 2008 a brighter year than 2007 when it comes to spam blogs.
You may have also noticed a surge in trackback spam recently, as autoblogging software is being used by more and more spammers to reach out and cull RSS feeds. This phenomenon has led many bloggers to disable trackbacks, or to raise the “blacklist” level so high that they might never see some trackbacks again. And some newer remotely-hosted commenting technologies, like IntenseDebate and Disqus, simply do not show trackbacks because of the spam problem.
[As an aside, that’s not to say that they will never implement it; I have it on good authority that Disqus will probably implement it as soon as they *can* find a way to clean up the spam-detecting components in the trackback issue.]
The problem is that in my own blogging success, I have found trackbacks to be instrumental.