The Evil Linking of Sploggers and Scrapers Messing With Your Content
The other day, I found a fascinating post on blogging programs that really held my interest. It explored why someone would stay with a specific program even when it made them unhappy. I noticed a reference to Matt Mullenweg, founder of WordPress, but the link was not on his name. It was on the word “of”.
I thought that was odd, then noticed that other unlikely words were linked too: her, the, in, about, and he. A very odd linking pattern.
Hovering over the links showed that they pointed to porn sites.
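For anyone who wants to check a suspicious page without hovering over every link, the pattern above can be sketched in a few lines of Python. This is only a rough illustration, not a real splog detector: the word list is simply the handful of words I noticed, and the names `SUSPICIOUS_ANCHORS`, `AnchorCollector`, and `suspicious_links` are made up for this example.

```python
from html.parser import HTMLParser

# Assumption: a tiny, hand-picked list of words that rarely make sense as link text.
SUSPICIOUS_ANCHORS = {"of", "her", "the", "in", "about", "he"}


class AnchorCollector(HTMLParser):
    """Collect (href, link text) pairs from a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def suspicious_links(html):
    """Return links whose entire link text is one of the suspicious words."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return [
        (href, text)
        for href, text in parser.links
        if text.lower() in SUSPICIOUS_ANCHORS
    ]
```

Feed it the HTML of a post and it returns the links sitting on words like “of” or “the”, which you can then inspect by hand.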
Suspecting this was a splog, I wondered whether the author had written this interesting article and then stuffed it with nasty porn links, or whether the content had been scraped from someone else. I selected a block of text with unique phrasing and searched Google for the phrase wrapped in quotes. Sure enough, I found the original author, and I informed them of the scraping and copyright violation as well as the nasty links.
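The quoted-phrase search is easy to repeat for your own posts. Here is a minimal sketch, assuming the usual `https://www.google.com/search?q=` address format; the function name `exact_phrase_search_url` is hypothetical. It only builds the search link for you to open in a browser and does not fetch any results itself.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(phrase):
    """Build a search URL that looks for the phrase as an exact match."""
    # Collapse line breaks and extra spaces, and drop stray double quotes
    # so the phrase can be wrapped in a single pair of quotes.
    cleaned = " ".join(phrase.split()).replace('"', "")
    # Assumption: the common Google search URL format with a "q" parameter.
    return "https://www.google.com/search?q=" + quote_plus(f'"{cleaned}"')
```

Pick a sentence with distinctive wording from one of your posts, pass it in, and open the resulting link. Any site other than your own in the results is worth a closer look.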
Getting Scraped is a Compliment
Many feel that having your blog content stolen and scraped by a splogger is a compliment. It means the splogger cares enough about your content to “spread the word”. Or they think that scraping won’t hurt them and may even benefit them through trackbacks and link love.
This is crap. Loads of it. Piled very high.
It’s a load of garbage because few splogs give credit to the original author. In fact, they have programs which strip the HTML links and tags so they are free to insert their own with no good links getting in the way.
Google’s new PageRank algorithm now investigates and considers links and content in many ways. It’s about keyword matching and relative content linking.
If there is a credit link back to your blog, and the links within the blog post are not in line with the blog contents, on the blog and the linking blogs, the discrepancy is noticed and can score against you. If the content from two different sites matches, and the links within don’t, it can score against you.
If you are worried about duplicate content, then be more worried. If the duplicated content is matched up with your blog, then your site may get scored low for such duplication. It isn’t just the duplication on your blog but the duplication of your content off your blog.
Many a blogger’s PageRank has dropped due to splogs scraping their content, so help stop scrapers and sploggers from stealing and abusing your content. If others abusing your content is a compliment, it’s a painful one.
The author of Lorelle on WordPress and the fast-selling book, Blogging Tips: What Bloggers Won't Tell You About Blogging, as well as several other blogs, Lorelle VanFossen has been blogging for over 15 years, covering blogging, WordPress, travel, nature and travel photography, web design, web theory and development extensively as web technologies developed.
I thought Google was able to identify the stronger blog (the original most of the time) and wouldn’t penalize it even though its content is duplicated elsewhere.
The PageRank you point out was published in 2005; I am sure certain things have changed since.
The article continues to be valid today. I’ve updated it and if anything, it is more relevant today due to the increasing ability to compare data in the PageRank algorithm.
Google could identify the “stronger” blog if there was enough history to compare to. I’ve heard many stories from bloggers who had been blogging for a year or so and found their PageRank falling and staying that way until they discovered a splogger scraping their content. Once the splogger was shut down, it took a few months but their PageRank increased.
I truly believe what you’ve written here is rubbish. Google’s PageRank algorithm is private and has not been leaked; no information is roaming around about how it operates. Yet again, this is a story of a blogger who thinks they know how something works when in reality they don’t. Will you stop already…
Teaching bloggers in this fashion is really misleading.
The following are the publicly released patents on the Google PageRank:
Google’s patent for their search engine ranking technique from 2005
Google PageRank patent release from 2007
Could you be a little less offensive in your attack? I thought you’d grown out of this.