Recently Google decided that the internet wasn’t quite big enough and started creating extra pages on a number of websites.
Googlebot does this by making up random words, entering them into web forms and indexing the results.
You can see that blogstorm.co.uk has 57 of these auto-generated pages indexed in Google already.
Blocking these pages from being indexed is quite easy, as other people have pointed out, but 99.9% of webmasters won’t know about the issue. The best way to “fix” the problem is to add a noindex,follow robots meta tag to the pages you don’t want indexed. Don’t use robots.txt: it stops the pages being crawled at all, which wastes any links you happen to gain to them.
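As a rough illustration of the noindex,follow approach, here is a minimal Python sketch that picks the robots meta tag for a given URL. It assumes the auto-generated pages are internal search results reached through an "s" query parameter (as in /?s=keyword); the parameter name, the SEARCH_PARAMS set and the robots_meta_tag helper are all made up for this example, so adapt them to however your own site builds its search URLs.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: internal search results carry an "s" query parameter
# (for example /?s=keyword). Change this set to match your own site.
SEARCH_PARAMS = {"s"}


def robots_meta_tag(url: str) -> str:
    """Return the robots meta tag to print in the <head> for this URL."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    if any(param in query for param in SEARCH_PARAMS):
        # Keep the page out of the index, but let crawlers follow its links.
        return '<meta name="robots" content="noindex,follow">'
    return '<meta name="robots" content="index,follow">'


if __name__ == "__main__":
    print(robots_meta_tag("https://example.com/?s=random+word"))
    print(robots_meta_tag("https://example.com/about/"))
```

The first URL gets the noindex,follow tag and the second gets the normal index,follow tag. Because the page can still be crawled, links on it and links pointing to it can still be followed, which is the difference from blocking it in robots.txt.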
As long as your site has lots of authority (PageRank and TrustRank), Google adding a hundred or so extra pages isn’t going to affect your rankings. It might even bring you some long tail traffic (as long as the pages are optimised), but you need to keep an eye on the situation to make sure Google isn’t creating thousands of near-duplicate pages.
My guess is that these auto-generated pages might cause a big issue for some people, so watch this space.
{ 7 comments… read them below or add one }
I may be a bit slow here so please bear with me.
If Google is indexing more pages, why is this a bad thing? Surely the more pages you have in Google, the more authority you can develop over time? A one-page site would, in theory, have less authority than a 100-page site if both had no inbound links and no domain history.
I understand that the more pages you have, the more that PageRank can be distributed amongst these pages. But I can’t see why that is bad.
As for duplicate content, can it not be safely assumed that if the Googlebot is creating these queries, the Google algorithm will reflect such an activity and hence not penalise you?
Hopefully you can clarify this for me, as from where I’m standing this looks like a good thing. I could be wrong, though.
Haha…Google’s too huge for the Internet
A site has a finite amount of PageRank that is split amongst all of its pages. If you have 100 pages and the site suddenly grows to 1000 pages without gaining any links, each page ends up with 1/10th of the PR it had before, assuming equal PR distribution.
In some cases having more pages is fine, but if you are struggling for PR and authority and Google suddenly adds a thousand or so pages, it can cause problems, especially if those pages are low-value, non-optimised pages.
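The arithmetic behind that 1/10th figure can be checked with a few lines of Python. This is only a sketch of the simplified model described above (a fixed amount of PageRank split equally between pages); the pagerank_per_page helper is made up for illustration and is not how Google actually computes anything.

```python
from fractions import Fraction


def pagerank_per_page(total_pagerank: Fraction, page_count: int) -> Fraction:
    """Share of PageRank per page under the equal-distribution assumption."""
    return total_pagerank / page_count


# Treat the site's total PageRank as one unit; only the ratio matters here.
total = Fraction(1)
before = pagerank_per_page(total, 100)    # 100 pages
after = pagerank_per_page(total, 1000)    # 1000 pages, no new links

print(after / before)  # prints 1/10
```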
More comments from Patrick Altoft
The interesting question to ask is why is it indexing particular search queries to begin with… unless you have Adsense on search results? Googlebot would then come to visit and voila, a new page would be born waiting to be indexed.
Interesting!
Barbara
I noticed this too, and thought that somebody at Google was having some fun with me… til I noticed that they’d picked up and indexed legitimate pages. Schweet, thinks I.
That spreading of the PR is worth thinking about, but that’s what robots.txt is good for as well, no?
I have noticed this as well but cannot figure out why it occurs. If it helps one of my sites, I’m all for it.
Regards, Gregory
Surely PR is only spread between pages if a direct link from one page to the other exists, i.e. a navigational link to the newly created page? Maybe I misunderstand exactly how the PR is distributed, but as I understand it, internal PR is distributed via the navigation of your site. In theory there should be no direct link to these pages on your site, as these are all query-based pages.
I might be wrong, but it’s an interesting subject and I will keep an eye out for this on our sites.
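One way to explore this question is with the textbook “random surfer” PageRank model. The sketch below is only that model, not Google’s production algorithm, and the three-page site, the search_result page and the 0.85 damping factor are all assumptions chosen for illustration. It adds one page that nothing links to and compares the scores before and after.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to. Every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline from the random-jump term.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank over all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page site linked through its navigation.
site = {"home": ["about", "blog"], "about": ["home"], "blog": ["home"]}
# The same site plus a query-based page that nothing links to.
with_orphan = dict(site, search_result=["home"])

ranks_before = pagerank(site)
ranks_after = pagerank(with_orphan)

for page in with_orphan:
    print(page, round(ranks_before.get(page, 0.0), 4), round(ranks_after[page], 4))
```

In this toy model the unlinked page receives only the baseline share, so most PR does flow along links as the comment suggests. The existing pages still lose a little, because the scores always sum to one and the baseline is now split between four pages instead of three.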