Studying Spider Crawl Rate to Find Your Most Trusted Pages

One of the tools I find very valuable on both my own sites and those of certain clients is my Crawl Rate Analysis Program (CRAP, for short). It logs every single visit by a search engine spider into a database so that I can analyse which pages are being spidered the most and, most importantly, why.
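The tool itself isn't published here, so the following is only a minimal sketch of the idea: recognise a spider from its User-Agent header and write one row per visit to a database. The table name `crawl_log`, the function names and the User-Agent substrings are all assumptions made for illustration, not the actual program.

```python
import sqlite3
from datetime import datetime, timezone

# Substrings used to recognise each spider's User-Agent header.
# This mapping is an assumption for illustration only.
SPIDER_TOKENS = {
    "googlebot": "Google",
    "slurp": "Yahoo",
    "msnbot": "MSN",
}


def detect_spider(user_agent):
    """Return the spider name for a User-Agent string, or None for other visitors."""
    ua = user_agent.lower()
    for token, name in SPIDER_TOKENS.items():
        if token in ua:
            return name
    return None


def open_log(path=":memory:"):
    """Open (or create) the hypothetical crawl_log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crawl_log ("
        "url TEXT NOT NULL, spider TEXT NOT NULL, visited_at TEXT NOT NULL)"
    )
    return conn


def log_visit(conn, url, user_agent, visited_at=None):
    """Store the visit if it came from a known spider; return True if stored."""
    spider = detect_spider(user_agent)
    if spider is None:
        return False
    visited_at = visited_at or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO crawl_log (url, spider, visited_at) VALUES (?, ?, ?)",
        (url, spider, visited_at.isoformat()),
    )
    conn.commit()
    return True


def crawls_per_page(conn):
    """List (url, crawl count) pairs, most-crawled pages first."""
    return conn.execute(
        "SELECT url, COUNT(*) AS crawls FROM crawl_log "
        "GROUP BY url ORDER BY crawls DESC, url"
    ).fetchall()
```

In practice `log_visit` would be called on every page request, or fed from the server access logs. Matching on the User-Agent alone can be fooled by anyone faking a spider's header, so treat the counts as approximate.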

Some years ago, when the concept of website trust was still in its infancy, I realised that PageRank, incoming links and TrustRank were all related, but that measuring them was very tricky because Google keeps toolbar PR inaccurate. One day I decided to start tracking spider activity and quickly found that the pages that were spidered the most were the ones with the most incoming link juice.

Most SEOs know that more link equity means deeper crawling and deeper indexing, but what most people don’t mention is that more link equity also means more frequent crawling of already indexed content.

Keeping a watch on how spiders hit your pages is hugely valuable for link builders, as it lets you see exactly which links work and which don’t. Imagine you have a page about Green Widgets which Google has spidered at a rate of 4 times per week for a year. You can buy a text link for a month and analyse the spider activity to see if it increases. No increase in spider activity means the link is not a trusted link and you can cancel the subscription. A dramatic increase in spider activity means you need to buy more links from that site.
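The before-and-after comparison can be sketched in a few lines. This is only an illustration under assumed names: `visits` is a list of crawl timestamps for one page, `link_start` is the day the link went live, and the default windows (a year of baseline, a month of testing) mirror the Green Widgets example. What counts as a "dramatic" increase is left to your own judgement.

```python
from datetime import datetime, timedelta


def weekly_rate(visits, start, end):
    """Average crawls per week for the visits falling in [start, end)."""
    count = sum(1 for v in visits if start <= v < end)
    weeks = (end - start).total_seconds() / (7 * 24 * 3600)
    return count / weeks


def compare_crawl_rate(visits, link_start, baseline_days=365, test_days=30):
    """Return (rate before, rate after, ratio) around the date a link went live."""
    before = weekly_rate(
        visits, link_start - timedelta(days=baseline_days), link_start
    )
    after = weekly_rate(
        visits, link_start, link_start + timedelta(days=test_days)
    )
    # The ratio is undefined when the page was never crawled in the baseline.
    ratio = after / before if before else None
    return before, after, ratio
```

A ratio close to 1 would suggest the link made no difference to crawl frequency, while a ratio well above 1 is the kind of increase described above. Crawl rates also drift for other reasons (new content, site changes), so a long baseline helps.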

If you want to buy a bunch of deep links, you can point them to different test pages on your site and measure the spider activity for each page, giving you a picture of which links are working and which are not passing equity.

The log files for spider activity on BlogStorm during October are here and you can see how the most trusted pages (the ones with loads of quality incoming links) are being spidered about 10 times per day. Drill down into the actual spider logs by clicking on the number of crawls and you can see that Yahoo spiders my site a huge amount while Google doesn’t visit quite as often. MSN visits the least often, hitting the homepage about once per day.
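A per-spider breakdown like the one described can be computed from the same kind of log. The sketch below assumes each logged visit is a `(url, spider, timestamp)` tuple and that you know how many days the log covers; both the record layout and the function name are hypothetical.

```python
from collections import Counter


def crawls_per_day(visits, url, days_in_period):
    """Average daily crawls of one page, broken down by spider.

    visits is an iterable of (url, spider, timestamp) tuples.
    """
    counts = Counter(spider for page, spider, _ in visits if page == url)
    return {spider: n / days_in_period for spider, n in counts.items()}
```

Calling it with the homepage URL and 31 days for an October log would give one average per spider, which makes it easy to compare how often each search engine comes back to the same page.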

Interestingly, Google sometimes visits the BlogStorm homepage 13 times per day; I guess it likes my content.

Some of the pages with a lot of link equity are category pages which have a sitewide link, but the most valuable data comes from looking at the links pointing to the blog posts with the most spider activity.

Reader Comments

PR ranking fell dramatically over the last couple of days. What happened?

 

Now, releasing that tool would get you a few links. ;)

 

I think it’s time you released this tool so we can all benefit from it.

 

I agree with Natasha. I’d love to use this tool.

 

What a funny acronym… Definitely one that is not easily forgotten. I’d love to use it.

 

Hey dude, that’s some nice info. How can we get a copy of this tool?

 

Please can I have a copy of the tool for crawling my site? Seems really handy for SEO.

 
