Studying Spider Crawl Rate to Find Your Most Trusted Pages

by Patrick Altoft on November 8, 2007

One of the tools I find very valuable on both my own sites and those of certain clients is my Crawl Rate Analysis Program (CRAP, for short). It logs every single visit by a search engine spider into a database so that I can analyse which pages are being spidered the most and, most importantly, why.
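The post doesn't show how the tool is built, so here is only a minimal sketch of the general idea, assuming the server writes access logs in the common "combined" format and that spiders can be recognised by a substring of their user-agent. The table name `crawls`, the helper names and the `SPIDERS` mapping are all hypothetical choices for illustration, not the actual tool.

```python
import re
import sqlite3
from datetime import datetime

# Assumed user-agent substrings for the three engines mentioned in this post.
SPIDERS = {"googlebot": "Google", "slurp": "Yahoo", "msnbot": "MSN"}

# Combined log format:
# IP - - [time] "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def identify_spider(user_agent):
    """Return the engine name for a known spider, or None for other visitors."""
    ua = user_agent.lower()
    for token, engine in SPIDERS.items():
        if token in ua:
            return engine
    return None


def open_db(path=":memory:"):
    """Open (or create) the database that stores one row per spider visit."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crawls "
        "(visited_at TEXT, engine TEXT, path TEXT)"
    )
    return conn


def log_visit(conn, line):
    """Store the visit if the log line comes from a known spider."""
    match = LOG_LINE.match(line)
    if not match:
        return False
    engine = identify_spider(match.group("agent"))
    if engine is None:
        return False
    visited = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    conn.execute(
        "INSERT INTO crawls VALUES (?, ?, ?)",
        (visited.isoformat(), engine, match.group("path")),
    )
    return True


def crawls_per_page(conn):
    """List (path, crawl count) pairs, most-crawled pages first."""
    return conn.execute(
        "SELECT path, COUNT(*) AS n FROM crawls "
        "GROUP BY path ORDER BY n DESC, path"
    ).fetchall()
```

Feeding every line of a log file through `log_visit` and then calling `crawls_per_page` gives the "which pages are spidered the most" view; working out why is still a manual job of looking at the links pointing to those pages.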

Some years ago, when the concept of website trust was still in its infancy, I realised that PageRank, incoming links and TrustRank were all related, but that measuring them was very tricky because Google keeps toolbar PR inaccurate. One day I decided to start tracking spider activity and quickly found that the pages that were spidered the most were the ones with the most incoming link juice.

Most SEOs know that more link equity = deeper crawling & deeper indexing, but what most people don’t mention is that more link equity = more frequent crawling of already indexed content.

Keeping a watch on how spiders hit your pages is hugely valuable for link builders as it is possible to see exactly which links work and which don’t. Imagine you have a page about Green Widgets which has been spidered by Google at a rate of 4 times per week for a year. You can go and buy a text link for a month and analyse the spider activity to see if it increases. No increase in spider activity means the link is not a trusted link and you can cancel the subscription. A dramatic increase in spider activity means you need to buy more links from that site.
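The before-and-after comparison in the Green Widgets example can be sketched as follows. This is only an illustration, assuming you already have the date of each spider visit to the page; the function names are hypothetical, and how large a change counts as "dramatic" is left to your own judgement because no threshold is given here.

```python
from datetime import date, timedelta


def crawls_per_week(crawl_dates, start, end):
    """Average number of crawls per week in the period [start, end)."""
    weeks = (end - start) / timedelta(weeks=1)
    if weeks <= 0:
        raise ValueError("end must be after start")
    hits = sum(1 for day in crawl_dates if start <= day < end)
    return hits / weeks


def compare_link_effect(crawl_dates, baseline_start, link_start, link_end):
    """Return (rate before the link, rate while it was live, relative change)."""
    before = crawls_per_week(crawl_dates, baseline_start, link_start)
    after = crawls_per_week(crawl_dates, link_start, link_end)
    if before > 0:
        change = (after - before) / before
    elif after > 0:
        change = float("inf")  # crawled during the test but never before it
    else:
        change = 0.0
    return before, after, change
```

A relative change near zero corresponds to the "no increase" case, and a large positive value to the "buy more links from that site" case. Other things that changed during the test month, such as new content or other new links, will also move the numbers.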

If you want to buy a bunch of deep links, you can point them at different test pages on your site and measure the spider activity for each page, giving you a picture of which links are working and which are not passing equity.

The log files for spider activity on BlogStorm during October are here and you can see how the most trusted pages (the ones with loads of quality incoming links) are being spidered about 10 times per day. Drill down into the actual spider logs by clicking on the number of crawls and you can see that Yahoo spiders my site a huge amount while Google doesn’t visit quite as often. MSN visits the least often, hitting the homepage about once per day.
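A per-engine breakdown like the one described above could be computed along these lines. This is a rough sketch that assumes each logged visit is available as a `(day, engine, path)` tuple; that layout and both function names are assumptions made for the example.

```python
from collections import Counter
from datetime import date


def average_daily_crawls(visits, path, days_in_period):
    """Average crawls per day of one page, split by search engine."""
    counts = Counter(engine for day, engine, p in visits if p == path)
    return {engine: n / days_in_period for engine, n in counts.items()}


def busiest_day(visits, path, engine):
    """Return (day, crawl count) for the engine's heaviest day, or None."""
    per_day = Counter(
        day for day, e, p in visits if p == path and e == engine
    )
    return per_day.most_common(1)[0] if per_day else None
```

The first function gives the "times per day" figure for each engine, while the second picks out peaks such as an unusually busy day on the homepage.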

Interestingly, Google sometimes visits the BlogStorm homepage 13 times per day; I guess it likes my content.

Some of the pages with a lot of link equity are category pages which have a sitewide link, but the most valuable data comes from looking at the links pointing to the blog posts with the most spider activity.

Patrick Altoft is Director of Search at Leeds-based digital & SEO agency Branded3. Patrick also runs BlogStorm.


{ 9 comments }

Nature Wallpaper 08 Nov 2007 at 7:24 pm

PR ranking fell dramatically last couple days, what happened!

Jeremy Luebke 08 Nov 2007 at 7:25 pm

Now releasing that tool would get you a few links. ;)

Natasha 08 Nov 2007 at 9:02 pm

I think it’s time you released this tool so we can all benefit from it.

JoeTech.com 08 Nov 2007 at 10:20 pm

I agree with Natasha. I’d love to use this tool.

perros 12 Nov 2007 at 2:25 am

What a funny acronym… Definitely one that is not easily forgotten. I’d love to use it.

Greg 12 Nov 2007 at 5:08 am

Hey dude, that’s some nice info. We can get a copy of this tool how?

Polo in yorkshire 13 Dec 2007 at 9:18 am

Please can i have a copy of the tool for crawling my site. Seems really handy for SEO

DIY Forum 10 Aug 2008 at 6:23 pm

I like the idea, but the name seemed too funny to take seriously, at least initially, before reading about it. Thank you, since I am interested both in spiders visiting my forum and in identifying where they are visiting.

thx,

rudy 28 Aug 2008 at 11:05 am

baidu ?

