
November 6, 2007

Blog Writing: You Can Get It Fast, Good, or Cheap

The sign in master luthier Jeff Elliott’s workroom reads “We are slow and we’re expensive.”

A favorite saying of a friend in marketing is “You can get it fast, good, or cheap. Not all three.” You can get it fast and cheap, but not good. You can get it good, but it won’t be cheap or fast. If you want good, it takes time to do it well.

The same applies to your blog and its content. It’s easy to deliver fast content, but is it good? Maybe, maybe not, but it is certainly cheap. If you want good, it takes time.

Developing quality content on your blog takes planning. It takes purpose. It takes a commitment. But most of all, it takes intent.

October 16, 2007

Who’s Talking About You and Your Blog

I haven’t made time recently to find out who’s been talking about me and my blog. It’s one of those tasks that is easily put off. So I thought I’d take you along for the ride to show you how I keep track of who’s been talking about me and my blog, and remind you not to put this important blog task off.

Google Blog Search

Google Blog Search helps you track what others are saying about your blog fairly easily. In the search form, use:

link:http://example.com/

This will result in a list of all the blogs indexed by Google’s Blog Search with a link to your blog. WordPress also uses Google Blog Search to track incoming links on the WordPress Dashboard panel, so you can click through from there to the Google Blog Search results for incoming links to your blog. Luckily, they are the few.

I checked and found a lot of interesting bloggers writing about me or my posts and went visiting to see what they had to say and leave a few comments along the way. I also found inspiration for some blog posts, posts in which I will reference and link back to them, saying something about them and their blogs, too.

For the most part, I found bloggers saying delightful things about my blog, and a lot of positive reinforcement that I’m still blogging down the right path. I also found some splogs and copyright thieves, so it pays to check these out frequently. I also found a few who didn’t have nice things to say about me, which is their right in this world of free speech.

October 8, 2007

Goodbye to Splogs and Feed-Driven Blogs

After much soul searching and internal debate, I’ve decided that I’m done with splogs and feed-driven blogs generating content from my blogs. Aren’t you?

Here is the scenario.

A trackback comes in with the following starting off the “quote”, followed by the start of your blog post content:

  • […]admin wrote an interesting post today onHere’s a quick excerpt[…]
  • […] Jim Phillips wrote an interesting post today!.Here’s a quick excerpt…
  • […] Novak wrote an interesting post today on 100 bloggers worldwide collaborate to benefit charityHere’s a quick […]

Example of trackback splog comment spam

Notice the similarity? These all involve the words “wrote an interesting post today” and “here’s a quick excerpt”.

I’m considering adding these two phrases to my banned commenters list, but it’s a difficult decision as many use these words perfectly innocently. I wish there were a way to put them in the filter so that it only matched when all of the words appear together, without kicking out the innocent usages.

It’s that, or teach all bloggers to never introduce a blockquote using those phrases.
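
To make the "all of the words" idea concrete, here is a minimal sketch in Python of a check that flags a trackback excerpt only when both phrases appear together. It is an illustration under stated assumptions, not a feature of any blog platform's comment filter: the function name and the phrase list are hypothetical, and the phrases are simply the two quoted above.

```python
# Hypothetical phrase list: the two phrases quoted in the post above.
SPLOG_PHRASES = (
    "wrote an interesting post today",
    "here's a quick excerpt",
)


def normalize(text):
    """Lowercase the text and straighten curly apostrophes for matching."""
    return text.lower().replace("\u2019", "'")


def looks_like_splog_trackback(excerpt, phrases=SPLOG_PHRASES):
    """Return True only if every phrase occurs somewhere in the excerpt.

    Requiring all phrases at once is what spares innocent comments
    that happen to use just one of them.
    """
    text = normalize(excerpt)
    return all(phrase in text for phrase in phrases)
```

A comment that only says someone "wrote an interesting post today" passes through untouched. The trade-off is that an excerpt cut off before the second phrase finishes, like the third example above, would also slip past this check.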

September 17, 2007

How To Avoid Spambots By Using Pinging Services

Most blogging applications, including WordPress, are set up by default to let the entire world know when you update your blog. They either use a lengthy list of ping URLs or a pinging service such as Pingomatic to help you ensure that every service that wants to know about your new post does so.

By default, these lists include not just search engines and RSS readers, but also “central” pinging services that pass updates on to other tools and applications.

However, as we discussed previously, this use of central servers is also very convenient for spammers. Many spam bots watch some of these services and scout them for content they might find useful to scrape, which makes pinging them a potentially risky move.

Fortunately, most of the major services that we, as humans, rely on offer a direct means of communicating with them. That enables us to bypass the central pinging services altogether and avoid at least some of the spammers completely.

All it requires is a new pinging strategy, one that goes straight to the source.
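
As a rough sketch of what "going straight to the source" could look like, the Python below sends the conventional XML-RPC `weblogUpdates.ping` call to a short list of individual services instead of one central relay. The endpoint URLs and function names are hypothetical placeholders, not real services; you would substitute the ping addresses published by the services you actually want to notify.

```python
import xmlrpc.client

# Hypothetical endpoints: replace with the ping URLs of services you trust.
DIRECT_PING_ENDPOINTS = [
    "http://rpc.search.example.com/ping",
    "http://rpc.reader.example.net/RPC2",
]


def build_ping_payload(blog_name, blog_url):
    """Build the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (blog_name, blog_url), methodname="weblogUpdates.ping"
    )


def ping_directly(blog_name, blog_url, endpoints=DIRECT_PING_ENDPOINTS):
    """Ping each endpoint in turn and collect its response or error."""
    results = {}
    for endpoint in endpoints:
        try:
            server = xmlrpc.client.ServerProxy(endpoint)
            results[endpoint] = server.weblogUpdates.ping(blog_name, blog_url)
        except Exception as exc:  # a failed ping should not stop the rest
            results[endpoint] = exc
    return results
```

In WordPress itself, the equivalent step is simply editing the list of ping URLs in the blog's settings; the sketch only shows what happens behind each entry in that list.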


September 10, 2007

Content Theft: How At Risk is Your Blog?

It’s a sad fact that pretty much any content posted to a blog or otherwise passed through an RSS feed is going to get scraped at one point or another. On the Web today, there are so many spammers spitting out junk blogs that almost everyone will be a victim of content theft if they blog long enough.

That being said, some content and some sites are more at risk than others. The type of content you post and the way you present it are both major factors in exactly how at risk your site is.

Even though you cannot mitigate or change many of these factors, being aware of them can help you assess your site’s risk in this area and take appropriate action. Meanwhile, ignoring the factors could leave you seriously unprepared for a very real problem.
