How to Measure & Analyse Long Tail Search

by Patrick Altoft on June 16, 2010

Now that Google Caffeine is 100% live, and following up on the recent May Day update, I thought it would be good to talk about some of the more advanced aspects of long tail SEO.

First of all, Caffeine is a new infrastructure rather than an algorithm update, so it’s not related to the May Day changes. What Caffeine does is increase the freshness of the Google index by increasing crawl capacity and decreasing the time it takes to get crawled pages live and searchable in the index. For bloggers this might not be a big change for new pages, because blog platforms normally ping Google and get indexed in a couple of minutes anyway. For the rest of the web this should make a big difference and open the door to much fresher long tail results.

Google Caffeine

The May Day update basically means that some sites with thin content and a lack of internal links are no longer getting the authority benefits they used to. Google is showing more relevant pages instead which is certainly a positive step.

What is the Long Tail?

Long tail SEO describes the thousands or millions of search terms that individually generate very little traffic but collectively generate a large percentage (perhaps 70%) of a site’s overall search traffic. Long tail doesn’t mean keyphrases with 4, 5 or 6 words in the phrase. These may fall into the long tail group, but that isn’t always the case. We have some very large 4 word phrases that send thousands of visitors per month and they are classed as short tail terms.

The best way to classify the terms is to look at the chart below from SEOmoz which breaks search traffic into 3 buckets:

  • Short tail – 18.5%
  • Mid tail – 11.5%
  • Long tail – 70%

These figures are approximate but as long as we are consistent it doesn’t matter too much what we choose.

Long Tail

The next step is to do some analysis to measure your current short, mid & long tail traffic numbers so that you can monitor each month how things improve. We set this up as an advanced segment in Google Analytics as well as an Excel chart and find that the following figures tend to give the percentages we want for most websites.

  • Short tail – keyphrase with 100 or more visits per month
  • Mid tail – keyphrase with 6 to 99 visits per month
  • Long tail – keyphrase with 5 or fewer visits per month

You need to adjust these thresholds for your site until the three buckets roughly match the percentages in the chart above. Don’t forget to remove brand searches before you calculate the split.
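As a rough illustration of the bucketing described above, here is a minimal Python sketch. It assumes you have already exported monthly visits per keyphrase into a dict. The function names `classify_keyphrase` and `tail_shares` and the `brand_terms` parameter are hypothetical names chosen for this example, not part of Google Analytics.

```python
from collections import defaultdict

# Thresholds from the list above (visits per month); tune them per site.
SHORT_TAIL_MIN = 100
MID_TAIL_MIN = 6


def classify_keyphrase(visits):
    """Return the bucket name for a keyphrase with the given monthly visits."""
    if visits >= SHORT_TAIL_MIN:
        return "short"
    if visits >= MID_TAIL_MIN:
        return "mid"
    return "long"


def tail_shares(keyword_visits, brand_terms=()):
    """Return each bucket's share of non-brand visits as a percentage."""
    totals = defaultdict(int)
    for phrase, visits in keyword_visits.items():
        lowered = phrase.lower()
        if any(term.lower() in lowered for term in brand_terms):
            continue  # brand searches are excluded from the split
        totals[classify_keyphrase(visits)] += visits
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {"short": 0.0, "mid": 0.0, "long": 0.0}
    return {b: 100.0 * totals[b] / grand_total for b in ("short", "mid", "long")}
```

If the long tail share comes out well below 70%, try raising or lowering the two thresholds and re-running until the split looks closer to the chart.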

Segmenting long tail traffic

Visualising millions of keywords that each send a handful of visitors every month is an impossible task, so we need to segment the data before we can try to improve the numbers. The best way to do this is to split the site into the same sections we create for our multiple sitemaps (see Measuring Indexation below) and for each section analyse & monitor the following:

  • Number of pages indexed
  • Number of landing pages receiving > 1 visit per month
  • Number of keywords sending visitors to the section each month

Long & short phrases

Having said that long tail doesn’t necessarily correspond to the number of words in a keyphrase, it is still very important to track and monitor the distribution of words in your keyphrases every month. You should do this in two ways: by setting up filters in Google Analytics, and by exporting all your keyword data to Excel and running a pivot table query to show figures such as conversion rate vs keyphrase length and visitor or revenue numbers vs keyphrase length.
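If you would rather script the pivot than build it in Excel, the same grouping can be sketched in a few lines of Python. This is only an illustration: the `pivot_by_length` function and the (keyphrase, visits, conversions) row layout are assumptions about how your export might look.

```python
from collections import defaultdict


def pivot_by_length(rows):
    """Group keyword rows by the number of words in the keyphrase.

    Each row is a (keyphrase, visits, conversions) tuple. Returns a dict
    mapping word count to total visits, total conversions and the
    conversion rate as a percentage.
    """
    grouped = defaultdict(lambda: {"visits": 0, "conversions": 0})
    for phrase, visits, conversions in rows:
        words = len(phrase.split())
        grouped[words]["visits"] += visits
        grouped[words]["conversions"] += conversions

    result = {}
    for words in sorted(grouped):
        visits = grouped[words]["visits"]
        conversions = grouped[words]["conversions"]
        rate = 100.0 * conversions / visits if visits else 0.0
        result[words] = {
            "visits": visits,
            "conversions": conversions,
            "conversion_rate": rate,
        }
    return result
```

Running this on each month's export gives one row per keyphrase length, which is the same table the pivot query would produce.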

Measuring Indexation

The best way to measure indexation on large sites is to split the site into sections and create a different XML sitemap for each section. Once these are submitted in Webmaster Tools you can quickly see which pages are getting indexed and which are not.
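One way to produce a sitemap per section is sketched below in Python. It assumes that a section is simply the first folder in the URL path, which will not suit every site. The helper names `section_of` and `build_section_sitemaps` are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def section_of(url):
    """Use the first folder in the path as the section name ('root' if none)."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # A single segment such as /page.htm is a top-level page, not a folder.
    return segments[0] if len(segments) > 1 else "root"


def build_section_sitemaps(urls):
    """Return {section: sitemap XML string}, one <urlset> per section."""
    by_section = defaultdict(list)
    for url in urls:
        by_section[section_of(url)].append(url)

    sitemaps = {}
    for section, section_urls in by_section.items():
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in sorted(section_urls):
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        sitemaps[section] = ET.tostring(urlset, encoding="unicode")
    return sitemaps
```

Each string can then be written to its own file, for example one file per section name, and submitted separately. Bear in mind that the sitemap protocol limits a single file to 50,000 URLs, so very large sections need splitting further.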

Multiple sitemaps indexation on large sites

If you find that a particular section has an indexation issue then you need to diagnose what’s going wrong. This gets a bit technical, but the best method we have found is to create a script to check the indexation status of each page as follows:

  • Check to see if the page is indexed using the info:http://www.site.com/page.htm command on Google
  • Check server logs to see how many times the URL has been spidered in the last 30 days
  • Use SEOmoz API to find total links to the page & mozRank

Once we have this data we can look into what might be going on & try to fix it.
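The server log step in the list above can be sketched as follows. This is not our production script, just a minimal Python illustration. It assumes Apache combined log format and treats any request whose user agent contains "Googlebot" as a spider visit, without verifying the IP address. The function name `googlebot_hits` is hypothetical, and the Google and SEOmoz lookups are left out because they depend on external services.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Matches the timestamp, request path and user agent of a combined log line.
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(log_lines, now=None, days=30):
    """Count Googlebot requests per path within the last `days` days."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if cutoff <= when <= now:
            hits[match.group("path")] += 1
    return hits
```

Pages in a problem section that show zero hits here are not being crawled at all, which usually points to a lack of internal or external links rather than a content issue.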

Measuring number of landing pages

Again, this needs to be done by splitting your site into different sections. You can do it in bulk for the whole site, but that doesn’t give the right data. The key is to use this method but to add a filter to only show the landing pages from the sub-folder or category you want to analyse.
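If you export landing page data rather than filtering inside Google Analytics, the same sub-folder filter can be sketched in Python. The `landing_pages_in_section` function and the dict of path to monthly visits are assumptions made for this example.

```python
def landing_pages_in_section(landing_visits, section_prefix, min_visits=2):
    """Return landing pages under section_prefix with at least min_visits.

    landing_visits maps a URL path to its monthly search visits. The
    default of 2 corresponds to "more than 1 visit per month".
    """
    return sorted(
        path
        for path, visits in landing_visits.items()
        if path.startswith(section_prefix) and visits >= min_visits
    )
```

The length of the returned list is the landing page count for that section, which you can then compare with the number of pages indexed for the same section.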

Indexation

The final result of your analysis should be a chart that looks something like the one below, taken from one of our ecommerce clients.

Patrick Altoft is Director of Search at Leeds based digital & SEO agency Branded3. Patrick also runs Blogstorm.


{ 11 comments… read them below or add one }

carl 17 Jun 2010 at 10:02 am

Really interesting, but I'm having a problem creating the advanced segment for the long tail. :(

Metrics –> Visits –> Less than or equal to 5
AND
Keyword –> Does not contain –> MyBrand

Alden DoRosario 17 Jun 2010 at 7:20 pm

Excellent insights. Will try and get our research guys to kick off some studies based on these insights.

Simon 20 Jun 2010 at 1:40 pm

Are you saying that 70% of searches are long tail?

mondex1 22 Jun 2010 at 12:37 am

Very interesting post. I didn't know that I can segment long tail searches. Thanks for a very detailed instruction how to measure our long tail searches. :)

dotcompals 22 Jun 2010 at 4:15 pm

very useful and fantastic information Pat. Thank you., Let me read in full..

Alessio 22 Jun 2010 at 5:18 pm

Interesting post but, in my experience, I think it is a bit trickier to create that advanced segment: the metric "visits" isn't correct for it.

Patrick, how did you create it? :)

holidays to magaluf 24 Jun 2010 at 10:51 pm

interesting post. searches are long tail?

Jade @ No Longer 25 09 Jul 2010 at 11:23 am

I just tried to subscribe to you via your feedburner link at the top right – it's not working. No worries though I've done it manually but thought you'd like to know.

Really interesting post, I'm still trying to work out everything I can do with Google Analytics so this is great.
Jade

Rahul 09 Jul 2010 at 5:59 pm

very useful

seo spidy 09 Jul 2010 at 6:00 pm

Excellent Post I m Lovin it bookmark it

Webmaster 01 Sep 2010 at 8:56 am

The best SEO technique in one sentence: Great content and great links! Google does the rest…
Very interesting post!
