Since releasing the WordPress Crawl Rate plugin, a few people have asked whether their crawl rate is “normal” or not. The answer is that crawl rates are determined by a number of factors, so there really isn’t a normal crawl rate.
Here are some of the main factors that affect crawl rate:
- Trust
- PageRank
- Number of pages / site structure
- Number & quality of incoming links
- How frequently content changes
- Type of site
- Whether you run AdSense
- How often you ping Google using RSS
If you have a very large site, a much higher number of your pages will be crawled per day, but each individual page will be crawled less often. If a search engine finds your content is changing a lot, it will increase your crawl rate to take this into account, and vice versa.
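The trade-off between total pages crawled and per-page frequency can be put into rough numbers. The sketch below is only an illustration: the function name is made up for this example, and it assumes the crawl is spread evenly across every page, which real crawlers do not do (homepages are visited far more often than deep pages).

```python
def average_revisit_days(total_pages: int, pages_crawled_per_day: float) -> float:
    """Average days between visits to any one page.

    Assumption: crawls are spread evenly over all pages. This is a
    simplification, not how a search engine actually schedules visits.
    """
    if pages_crawled_per_day <= 0:
        raise ValueError("pages_crawled_per_day must be positive")
    return total_pages / pages_crawled_per_day


# Figures quoted in this post for Coolest Gadgets and Googlebot.
large_site = average_revisit_days(73_000, 11_000)

# Hypothetical small site; these numbers are invented purely for contrast.
small_site = average_revisit_days(500, 250)

print(f"large site: each page visited roughly every {large_site:.1f} days")
print(f"small site: each page visited roughly every {small_site:.1f} days")
```

Even though the large site sees far more crawler hits per day, the average page on it waits longer between visits than a page on the small site.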
Google tends to crawl the web and load all the links it finds into a “To be crawled” database without duplicating entries. Yahoo appears to crawl links as and when it finds them, without worrying about crawling pages too often. Sometimes Yahoo crawls the same page a few times per hour, whereas Google usually just visits a page once every day or so, depending on your crawl cycle.
The end result of this behaviour is that the more inbound links your site has the more Yahoo will crawl your pages. Small sites with lots of links will see higher Yahoo activity while large sites will almost always see more Google activity.
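The difference between the two behaviours described above can be sketched as a toy model. This is only an illustration of the idea in this post, not how either search engine is actually implemented; the function names and the sample URLs are hypothetical.

```python
from collections import Counter


def fetches_with_dedup_queue(discovered_links):
    """Queue each URL at most once, however many links point to it."""
    seen = set()
    queue = []
    for url in discovered_links:
        if url not in seen:
            seen.add(url)
            queue.append(url)
    return Counter(queue)


def fetches_as_found(discovered_links):
    """Fetch a URL every time a link to it is discovered."""
    return Counter(discovered_links)


# Hypothetical stream of links found while crawling other sites.
# The homepage "/" has three inbound links, "/post-a" has two.
discovered = ["/", "/post-a", "/", "/post-b", "/", "/post-a"]

print("deduplicated queue:", dict(fetches_with_dedup_queue(discovered)))
print("crawl as found:    ", dict(fetches_as_found(discovered)))
```

In the deduplicated model every page is fetched once regardless of link count, while in the crawl-as-found model the number of fetches grows with the number of inbound links, which matches the pattern of heavily linked small sites seeing more Yahoo activity.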
Below are some screenshots of the Crawl Rate Tracker in action. Although the data is only for a couple of days I find that the actual crawl rates are pretty consistent so the data is reliable.
Coolest Gadgets
Coolest Gadgets has 73,000 pages and a massive 350,000 incoming links. Google spiders over 11,000 pages per day and sometimes hits the homepage every 3 minutes!

Blogstorm
Blogstorm has 500 pages and 100,000 incoming links. Because it has lots of links Yahoo spiders the most.

GadgetVenue.com
Gadget Venue has 15,000 pages and 40,000 links. Because it has lots of pages we see that Google spiders the most pages per day.

Joost de Valk
Joost’s SEO blog has 540 pages and 65,000 links. Again we see Yahoo spidering the most.

Self Made Minds
Self Made Minds has 330 pages and 70,000 incoming links – Yahoo spiders the most.
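One way to line up the five sites is by incoming links per page. The ratio is just a convenient summary for this comparison, not a metric the plugin reports, and the page and link counts are the rounded figures quoted above. The post does not say which crawler was most active on Coolest Gadgets, so that entry is left unstated.

```python
# (site, pages, incoming links, crawler reported above as most active)
sites = [
    ("Coolest Gadgets", 73_000, 350_000, None),  # most active crawler not stated
    ("Blogstorm", 500, 100_000, "Yahoo"),
    ("Gadget Venue", 15_000, 40_000, "Google"),
    ("Joost de Valk", 540, 65_000, "Yahoo"),
    ("Self Made Minds", 330, 70_000, "Yahoo"),
]


def links_per_page(pages: int, links: int) -> float:
    """Incoming links divided by page count (illustrative ratio only)."""
    return links / pages


# Sort from the most heavily linked pages to the least.
ranked = sorted(sites, key=lambda s: links_per_page(s[1], s[2]), reverse=True)

for name, pages, links, top_crawler in ranked:
    ratio = links_per_page(pages, links)
    print(f"{name:<16} {ratio:7.1f} links/page   most active: {top_crawler or 'not stated'}")
```

With these figures the three sites with the highest links-per-page ratio are the three where Yahoo spiders the most, which fits the explanation given earlier. Five sites is far too small a sample to treat this as more than a rough pattern.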

12 comments
That’s an interesting analysis. It would be interesting to know what % of traffic comes from Google and Yahoo on each site, and whether sites with more links and fewer pages get more traffic from Yahoo than those with many pages and fewer links. So if both pages and links are high (like Coolest Gadgets) I would expect the split to be slightly closer.
For me, Google accounted for 35% of my traffic over the last 30 days, with Yahoo being just 1.17%.
Although I would imagine Google comes out tops on all sites, I would think that for Blogstorm the % would be a lot closer to Google than mine.
Google sends 22% and Yahoo sends 3% for Blogstorm.
Interesting findings Patrick, I was amazed at how much the gbot was gobbling.
On CG Yahoo sends ~3% of search engine traffic whilst Google is ~95%
I rarely comment on SEO blogs but this one was useful enough that I thought I’d drop in and say thanks! Good info and bookmarked!
Great work
I have a question:
let’s say I have an ecommerce site at store.com (example), and I make a blog at store.com/blog and post very nice content every day.
Will my crawl rate go up even on the main page, store.com?
And will my main site get more trust?
In short, yes. Any incoming links to a page will help the entire domain (assuming the page links to the other pages).
Perfect, Patrick
Nice work will recommend this to all my mates
Patrick,
Great plugin! I just installed it and I’ll send you some screenshots once I get some data.
Where do I find the # for incoming links? I don’t think I have as many incoming links as the Self Made Minds site, but the crawl stats look almost as good as theirs. MSNbot maxed at 88, Googlebot maxed around 67, Yahoo Slurp maxed around 24.
I only have 151 posts.
Hey Patrick,
I’ve updated my screenshot (about a week’s worth of data) here:
http://moneymakingblogs.com/blog/2008/update-on-crawl-rate-tracker/
Patrick
I am presenting at SMX Advanced in Seattle on SEO analytics and one of my slides is about Crawl Rate Tracker. I am wondering if you would be willing to provide me with an updated screenshot of your data and possibly some analytics details for the pages being crawled, so I can show whether there is a correlation between crawl rate and traffic or rank.
I am also curious whether you have any other non-traditional metrics you would have me include.
Jonah
Is there any way that you can force Yahoo to crawl your site (e.g. resubmit a URL)? It seems that Yahoo crawls my site once a month or so.