The Shyftr Saga

Weekends are typically slow times for copyright news. With the courts closed and most Web hosts gone for 48 hours, very little usually happens.

However, this weekend was a definite exception. It saw a veritable blogstorm over the RSS aggregation service Shyftr and its republishing of RSS content. Some bloggers, such as Robert Scoble and Louis Gray, came down in favor of the service while others, including Tony Hung and Raoul Pop, were firmly against it.

In the end, Shyftr backed down and changed its policy, but not before drawing a vast amount of unwanted attention and dozens of angry blog posts.

However, now that things have died down some, we can take a look back at what happened and what it may mean for both bloggers and for other companies that may want to enter a similar market.

[Read more…]

Copyright Infringement: The Most Common Outcomes

It seems that a large number of bloggers run their sites with very little thought about copyright law. Though they don’t plagiarize content or scrape feeds, they grab images, copy large blocks of text and embed media without much thought to the original author or whether their use is truly “fair use”.

It seems that many bloggers simply want to share what they find interesting. But while that is a noble cause, some make the mistake of not merely linking to what they like, but wholesale copying and pasting it.

Though many don’t mind their works being copied, others do. It only takes one angry copyright holder to cause a great many headaches for a site, especially a small one, and many are caught off guard by exactly how much trouble a copyright dispute can be.

“But what is the worst that can happen?” many bloggers ask. The answer, unfortunately, is quite a lot.
[Read more…]

Cleaning Blogspot Spam: Is Google Responding to Public Pressure?

Examples of splogs in Technorati's feed for Blogging Tips tag

In Google! Clean Up Blogger! Now!, I wrote about how a simple search for a news story turned into a massive multi-page hunt through Google search results full of Blogspot spam blogs. I finally found a possibly legitimate blog answer on something like page five, but had to go through even more pages to find a second and third possible answer to my search question. I plowed through page after page of spam blogs and splogs, many containing copyright-violating or spun content, although most of it was totally unintelligible collections of keywords.

In the article, I wrote an open letter to Google asking them to clean up Blogspot by removing the tons of spam blogs that litter its surface and abuse our content, as well as help us help them by making the process easier:

There has to be a nice way of doing this. Sure, there is always room for abuse, but let us help you. The good white hat wearing web users represent the majority and we are tired of this. We want Google cleaned up. We think starting with cleaning up Blogspot/Blogger is a good place to begin.

We, the bloggers of the world, really like you Google. We put your ads, search, maps, news, and gadgets on our blogs. We write our post content to meet your needs so you will like us. We design our web designs not just with web standards but Google standards in mind. Our lust for all things Google puts billions in your pockets. We live and breath through Google, so let us help you help us.

I’m not the only one to complain. In fact, bloggers around the world have been complaining for years about the vast quantity of splogs on Blogspot.
[Read more…]

How Spam-Friendly is Your Niche?

No one wants to attract spammers to their site. They scrape content, post junk comments and turn search engines off to your site.

Unfortunately, the bitter truth is that all blogs, regardless of age, topic and readership, will attract at least some attention from the purveyors of junk. That is a simple byproduct of having a blog and publishing an RSS feed.

However, to spammers, all blogs are not created equal and some sites are going to attract far more attention from spammers than others. But while many of the elements that will attract spammers may be unpredictable and outside of our control, others are not.

One of the biggest indicators of how much trouble a blog will have with spam is the niche that it is operating in. This is because, by and large, the niche a blog is in will determine the keywords most commonly associated with it and those keywords, in turn, determine which sites the spammers latch on to.

The question then becomes: which niches suffer the most at the hands of spammers?

The Usual Suspects

If you want to know whether your niche is a popular target for spammers, you need to look no further than the spam folder in your email inbox.

Whether or not Web spammers and email spammers are often the same, it is clear that they share many of the same targets. Keywords and topics that are popular targets for email spammers will often be targets for Web ones as well.

As such, blogs in known spam niches such as gambling, prescription drugs, contests, travel, adult content and financing are going to be frequent targets for spam blogs.

Of course, the catch is that it is not necessarily a matter of your blog promoting the same products or services as spam blogs; it is a matter of it being within the same broad topic. Spam bots, much like search engines, cannot inherently tell the difference between favorable and unfavorable posts. As such, a news report about a crackdown on online gambling is just as likely to be scraped as a blog offering tips for winning at poker.

In short, if your site routinely has keywords that are familiar to email spam, odds are you’ve already seen more than your fair share of trouble from the dark side of the Web. But even if you don’t meet those criteria, there is still a good chance you could, unwittingly, be attracting the attention of spammers.
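As a rough illustration of why tone does not protect a post, here is a minimal Python sketch of keyword matching. The keyword list and the two sample sentences are hypothetical, chosen only to echo the niches named above; how any real scraper picks its targets is not known and will differ.

```python
import re

# Hypothetical keyword list based on the niches named above.
# This is an assumption for illustration, not any real spam bot's list.
SPAM_PRONE_KEYWORDS = {
    "gambling", "poker", "prescription", "contest", "travel", "financing",
}


def matched_keywords(text):
    """Return the spam-prone keywords found in the text, ignoring its tone."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & SPAM_PRONE_KEYWORDS)


# Hypothetical sample posts: one critical of gambling, one promoting it.
news_report = "Regulators announce a crackdown on online gambling sites."
tips_post = "Five tips for winning at poker and other gambling games."

print(matched_keywords(news_report))
print(matched_keywords(tips_post))
```

Both posts match on "gambling" even though one reports on a crackdown and the other offers tips, which is the sense in which a simple keyword-driven bot cannot tell favorable from unfavorable coverage.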

Unexpected Surprises

Of course, not all Web spam deals with the same topics as email spam. Since Web spam is driven by many different factors, it is inevitable some categories will show up on the Web that don’t in our inboxes.

One such factor is the amount of money a spammer can hope to make off of a single click. A look at the most expensive AdSense keywords shows that the list is top-heavy not with traditional spam topics, but with legal searches.

Since many spam blogs only earn a few clicks before being shut down, having a keyword that generates a decent amount of revenue is critical. As such, spammers are drawn to topics such as mesothelioma, DWI/DUI, personal injury and insurance simply because they can hope to make approximately fifty dollars a click from those terms. Though these terms are not as heavily targeted by spammers, since they are less likely to be searched for than the traditional spam workhorses, cost is definitely a factor.

On the flip side, search frequency also plays a role. Looking at the top search terms gives you an idea of what people are searching for and where the spammers are likely to follow. In that regard, celebrity news is a frequent topic of interest with technology and television shows also making an appearance.

Though these terms might not be as valuable per click, they can make up for that in sheer quantity. Simply put, spammers are guaranteed not just a constant stream of potential viewers, but a ready supply of sites to latch onto. This approach may be better for spam sites less focused on earning clicks on ads and more interested in using spam to pump the rankings of another site.

Still, of all the potential indicators, it appears that search volume is the least helpful. The amount of Britney Spears spam, for example, remains remarkably low for the term and seems likely to stay that way.

But like the other factors, it is worth being aware of as it can give you a clue as to the problems that may be coming down the road.

Why It Is Bad

None of this is to say that you should change your niche simply because it is targeted by spammers, just that having a topic targeted by them can create additional problems for your site. All in all, there are at least three reasons you should take note if your site does happen to fall in a spam-friendly niche:

  1. Increased Scraping: Perhaps the first repercussion of having a spam-friendly niche is that your content will be scraped much more heavily than it would otherwise. This can even be the result of just sending out one post on a targeted keyword and is only amplified the more often such posts are made.
  2. Increased Comment Spam: Though comment spam is more random in nature than scraping, there is an element of it that is keyword based. Posts and sites with popular spam keywords are more frequent targets for comment spam and sites that routinely deal with such topics may want to take extra anti-spam measures. Also note, in conjunction with the increased scraping, there will also be a rise in the amount of trackback/pingback spam.
  3. Increased Confusion: If your site is in a spammy niche and users are likely to have seen many spam blogs in that area, you are going to have to work harder to ensure that users realize your blog is genuine. Likewise, there is an increase in the likelihood that search engines will confuse your product with spam or that your site will be dealing with strong search engine competition from its spam counterparts. All in all, setting your site apart from the spammers will be a much greater challenge.

The good news is that, with work and awareness, most of the problems that come from being in a spam-friendly zone can be overcome. By using known anti-scraping tools, taking anti-comment spam measures and clearly distinguishing yourself from the spammers, it is possible to thrive in these niches, as many blogs do.


It is far more important to write what you know and what you love than it is to avoid being in a spam-friendly niche. Spam attacks can be overcome, but there is no overcoming a lack of ambition or love for one’s topic.

But it is still important to be aware if your selected niche is a likely target for spammers. Doing so gives you the chance to take counter-measures and prevent the spammers from latching on too deeply. It also gives you the chance to proactively search for and protect your content, block comment spam and work to separate yourself from the junk.

In short, being aware of the spamminess of your niche is the first, and most important, step in overcoming the drawbacks it brings. Fortunately, that is easy information to obtain.

2008: The Year Ahead for Spam Blogs

As the year draws to a close, the blogging community has a great deal to reflect on and look ahead toward. Between the viral videos, blogstorms and major upgrades, it has been a busy year.

But for those of us involved in content theft and spamming issues, 2007 was something of a bittersweet year. A lot of progress was made in the fight against spam, but a great deal went wrong. It seemed that, for every victory, there were at least two setbacks.

Sadly, it seems that we can expect a very similar year in 2008. However, there are new tools and new possibilities that, with a little bit of luck, might make 2008 a brighter year than 2007 when it comes to spam blogs.

[Read more…]

The Five Worst Ideas in Content Theft

When it comes to detecting and stopping content theft, there is a great deal of progress to be seen. New plugins are constantly being developed to stop scrapers, search techniques are constantly being improved and new tracking methods are being explored.

But despite all of the effective ways to monitor your content and protect it from misuse, it seems some of the worst ways never die.

No matter how many times these techniques get shot down, disproved or otherwise defeated, there are still those that preach them as gospel. However, these systems not only provide a false sense of security, but often irritate readers and, in some cases, can actually make the problem worse.

So let us take a moment to look at the five worst methods of dealing with content theft on the Web and analyze why they are so bad.

[Read more…]

The 6 Steps to Stop Content Theft

With spammers and plagiarists becoming more prolific and more aggressive than ever, content theft is no longer a matter of “if”, but “when”.

Where once protecting content was the realm of lawyers and billion-dollar industries, it is now important for Webmasters, large and small, to be familiar with both the laws and the tools available for dealing with content theft.

Fortunately, the steps for fighting plagiarism are easy to follow and, for the most part, the tools are free and readily available.

If you take a few moments to familiarize yourself with the process and technology, you can become a champion plagiarism fighter in short order and get back to the business of running your site before you realize how effective you’ve become.

[Read more…]

5 Content Theft Myths and Why They Are False

When it comes to content theft, there is a great deal of confusion.

Not only is copyright law almost impossible to understand, even by most lawyers’ standards, but the technology used to steal content on the Web is often confusing in and of itself.

This confusion has given rise to a series of myths and misunderstandings about content theft, many of which have very negative implications for Webmasters concerned with the rising tide of scraping and plagiarism.

To help dispel some of those myths I, along with Lorelle from Lorelle on WordPress, have put together a list of the most common myths in content theft and explanations for why they are false.

[Read more…]

Sploggers Get Craftier. Or Should I Say “Sploggers son cada vez más complicado!”

So, as a disclaimer, I don’t actually know if that Spanish translation is accurate for “Sploggers are getting trickier”, because I used Google’s Translation service at to illustrate a point.

In my ongoing fascination with sploggers, I’ve found out that there’s a new kind of “autoblogging” software that has been sending out trackbacks. But I presume you know the usual kind of thing I’m talking about: the offenders are blogs that end in .info, scrape your posts and then reproduce the first few paragraphs that end with an ellipsis and “you should read more over here”, or “<insert author name> has the details”, or some such dreck.

Well, in examining the latest splogging garbage to cross my desk, I have found that some new autoblogging software is doing something pretty sneaky to get past *your* defenses.

I know that when I look back at such blogs to verify that they’re in fact auto-generated (to create AdSense income), so that I can add their IP and domain to my blacklist, I usually check “by hand” to see that they’ve scraped a post.

Well, much to my surprise I found that there were some *very* interesting posts that were “tracking back” to the BlogHerald that had very familiar posts — but not quite identical or literal ripoffs of our content.

[Read more…]

Good News For Splogs! Word Verification (aka CAPTCHAs) May Become Useless In The Future

In an age where spammers choose to promote themselves by harassing others, many bloggers, social networks, etc. have resorted to using CAPTCHAs as an inexpensive way to keep fake machine comments/user names/purchases from flooding their world.

Unfortunately, the days of funny letters (and numbers) may be coming to an end, as it seems that a company has created software capable of “reading” those funky image phrases.

But before we begin to explain how much of an impact this will make upon the blogosphere, we need to address the background story–starting with Hannah Montana.
[Read more…]