October 8, 2007
After much soul searching and internal debate, I’ve decided that I’m done with splogs and feed-driven blogs generating content from my blogs. Aren’t you?
Here is the scenario.
A trackback comes in with the following starting off the “quote”, followed by the start of your blog post content:
- [...]admin wrote an interesting post today onHere’s a quick excerpt[...]
- […] Jim Phillips wrote an interesting post today!.Here’s a quick excerpt…
- […] Novak wrote an interesting post today on 100 bloggers worldwide collaborate to benefit charityHere’s a quick […]
Notice the similarity? These all involve the words “wrote an interesting post today” and “here’s a quick excerpt”.
I’m considering adding these two phrases to my banned commenters list, but it’s a difficult decision, as many people use these words perfectly innocently. I wish there were a way to have the filter act only when both phrases appear together, without kicking out the innocent usages.
It’s that, or teach all bloggers to never introduce a blockquote using those phrases.
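For anyone who wants to experiment with the "both phrases together" idea outside of the blog's own comment filter, here is a minimal Python sketch. It is a standalone illustration, not a WordPress feature: the function names `normalize` and `looks_like_splog_trackback` are hypothetical, and the only data it relies on are the two phrases quoted above. It flags an excerpt only when the first phrase is followed somewhere later by the second.

```python
import re

# The two phrases observed in the trackback samples above.
PHRASES = ("wrote an interesting post today", "here's a quick excerpt")


def normalize(text):
    """Lowercase, turn curly apostrophes into straight ones, collapse whitespace."""
    text = text.lower().replace("\u2019", "'")
    return re.sub(r"\s+", " ", text)


def looks_like_splog_trackback(excerpt):
    """Return True only if both phrases appear, in order, in the excerpt."""
    text = normalize(excerpt)
    first = text.find(PHRASES[0])
    if first == -1:
        return False
    # Search for the second phrase only after the end of the first one.
    return text.find(PHRASES[1], first + len(PHRASES[0])) != -1
```

A post that innocently uses just one of the phrases passes through, while the first two samples above are caught. The third sample would slip by, because its excerpt is cut off before the word "excerpt"; shortening the second phrase would catch it at the cost of more false positives.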
Tags: Blogging, Content Scraping, copyright, Ethics, Legal, plagiarism, Splogs
September 17, 2007
Most blogging applications, including WordPress, are set up by default to let the entire world know when you update your blog. They use either a lengthy list of ping URLs or a pinging service such as Pingomatic to help ensure that every service that wants to know about your new post is notified.
By default, these lists include not just search engines and RSS readers, but also “central” pinging services that provide updates to other tools and applications.
However, as we discussed previously, this use of central servers is also very convenient for spammers. Many spam bots watch some of these services and scout them for content they might find useful to scrape, which makes pinging them a potentially risky move.
Fortunately, most of the major services that we, as humans, rely on offer a direct means of communicating with them. That enables us to bypass these central pinging services altogether and avoid at least some of the spammers completely.
All it requires is a new pinging strategy, one that goes straight to the source.
Tags: Content Scraping, copyright, Legal, Splogs
September 10, 2007
It’s a sad fact that pretty much any content posted to a blog or otherwise passed through an RSS feed is going to get scraped at one point or another. On the Web today, there are so many spammers spitting out junk blogs that almost everyone will be a victim of content theft if they blog long enough.
That being said, some content and some sites are more at risk than others. The type of content you post and the way you present it are both major factors in exactly how at risk your site is.
Even though you cannot mitigate or change many of these factors, being aware of them can help you assess your site’s risk in this area and take appropriate action. Meanwhile, ignoring the factors could leave you seriously unprepared for a very real problem.
Tags: Content Scraping, Legal, Splogs