Search engine optimisation from Blogstorm

Ultimate Guide to htaccess and mod_rewrite

by Patrick Altoft on July 14, 2007

Often described as “voodoo” by frustrated webmasters, the use of mod_rewrite and htaccess files is one of the more advanced tasks a web developer has to face.

The good news is that unless you are looking for really advanced solutions you don’t have to fully understand how they work to use them on your website. Most of the htaccess and mod_rewrite tips on this page can simply be cut and pasted into a text file and uploaded to your server.

Over the last few years I’ve given the same htaccess tips to hundreds of webmasters so I decided to create a page with all the common uses.

htaccess is a per-directory configuration file read by the Apache web server. mod_rewrite is an Apache module, a rewrite engine that modifies the requested url before the server decides which file or script to serve.

The htaccess file is a plain text file called .htaccess. The leading dot is part of the name and there is nothing in front of it, so it looks like a file extension with no filename. Normally it sits in the document root of your site, but you can also create individual htaccess files for different directories, and each one applies to its own directory and everything below it.

Canonicalization

The easiest htaccess trick is to make sure that your site doesn’t have any canonicalization issues on the homepage.

A lot of websites suffer from poor search engine rankings by having a number of different versions of the homepage, for example:

http://www.yoursite.com
http://yoursite.com
http://www.yoursite.com/index.html
http://yoursite.com/index.html

These pages are all seen as different urls, despite them having exactly the same content in most cases. Google has got better at deciding which version to use over the past 12 months but you can still run into problems.

To solve this issue simply add the following to your htaccess file:

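Options +FollowSymLinks
RewriteEngine on
# Send the bare domain to the www version, keeping the requested path
RewriteCond %{HTTP_HOST} ^yoursite\.com [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
# Send direct requests for /index.html to the root url.
# THE_REQUEST is checked so that Apache serving index.html for / does not cause a loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.yoursite.com/ [R=301,L]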

This will redirect all versions to http://www.yoursite.com
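
If the same site also answers on other hostnames (a second domain pointing at the same server, for example), the first rule can be generalised. The following is a minimal sketch, assuming www.yoursite.com is the only hostname you want indexed; it redirects every other hostname rather than just the bare domain.

```apache
RewriteEngine on
# Assumption: www.yoursite.com is the single canonical hostname
# Ignore requests that arrive without a Host header
RewriteCond %{HTTP_HOST} !^$
# Redirect whenever the Host header is anything other than the canonical name
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
```

Because the condition is negated, any new alias you add later is covered without touching the file again.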

Changing html files to php

Sometimes you might have a static html website and need to use php code on the html pages. Rather than redirecting all your html pages to the equivalent php versions you simply need to tell your server to parse html files as if they were php.


AddHandler application/x-httpd-php .html

This works with any files so if you want to create dynamic xml or asp files that behave like php files you simply edit the code as required:


AddHandler application/x-httpd-php .xml
AddHandler application/x-httpd-php .asp
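
If only one or two files need to run php, it may be safer to limit the handler to those files instead of a whole extension. This is a sketch only: feed.xml is a hypothetical file name, and the handler name application/x-httpd-php depends on how your host runs php, so check with your host if it has no effect.

```apache
# Assumption: feed.xml is the only xml file that contains php code
<Files "feed.xml">
    # Handler name varies between hosts; this is the one used elsewhere on this page
    SetHandler application/x-httpd-php
</Files>
```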

Error pages

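Custom error pages can be set up in cPanel fairly easily. If you want to create a custom error page in htaccess instead, use the line below. Use a local path rather than a full url: with a full url Apache redirects the visitor to the error page, so browsers and search engines never see the 404 status code.


ErrorDocument 404 /404.php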
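
Other status codes take one ErrorDocument line each. A short sketch, assuming hypothetical pages 403.php and 500.html exist in the root of your site:

```apache
# Local paths keep the original status code in the response
# 403.php and 500.html are placeholder file names, not files the server provides
ErrorDocument 403 /403.php
ErrorDocument 500 /500.html
```

A plain html page is a reasonable choice for the 500 error, since the problem that triggered it may also stop php from running.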

Directory Indexes

To stop Google indexing the automatically generated file listings for your directories, you might need to specify an index page for them. This is not required on some servers.


DirectoryIndex index.php3

My preference is to redirect the directory index page to either the homepage or another suitable page. For example www.yoursite.com/images/ can normally be redirected to www.yoursite.com and www.yoursite.com/forum/ can normally be redirected to www.yoursite.com/forum/index.php
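
The redirects described above could be written along these lines. Treat it as a sketch: it assumes /images/ has no page of its own, that the forum lives at /forum/index.php, and that your host allows the Options directive in htaccess.

```apache
# Stop Apache generating a file listing for directories without an index page
Options -Indexes
# Send the bare images directory to the homepage
RedirectMatch 301 ^/images/$ http://www.yoursite.com/
# Send the bare forum directory to its index script
RedirectMatch 301 ^/forum/$ http://www.yoursite.com/forum/index.php
```

The patterns end in $ so that only the directory url itself is redirected, not the files inside it.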

Redirecting pages

A nice simple use of htaccess is to redirect one page to another:


redirect 301 /old-page.php http://www.yoursite.com/new-page.php
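
The same directive can move more than one page at a time. A sketch with hypothetical directory names and extensions:

```apache
# Redirect matches on the start of the path, so everything under
# /old-directory is sent to the same location under /new-directory
Redirect 301 /old-directory http://www.yoursite.com/new-directory
# RedirectMatch takes a regular expression; here every .htm page
# is sent to a .php page with the same name
RedirectMatch 301 ^/(.*)\.htm$ http://www.yoursite.com/$1.php
```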

Sending your feed to Feedburner

If you want to switch your feed to the Feedburner service you will need to redirect your current feed to the new http://feeds.feedburner.com/yourfeed location.

The redirect needs to apply to all users except the Feedburner spider:


RewriteCond %{HTTP_USER_AGENT} !FeedBurner
RewriteRule ^your-feed\.xml$ http://feeds.feedburner.com/your-feed [R,L]

Advanced hotlink protection

If you want to block other websites from hotlinking your images, but allow indexing of your images in the Google, Yahoo and MSN image search engines, you should use the code below:


RewriteEngine on
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^http://([^.]+\.)?yoursite\. [NC]
RewriteCond %{HTTP_REFERER} !google\. [NC]
RewriteCond %{HTTP_REFERER} !search\?q=cache [NC]
RewriteCond %{HTTP_REFERER} !msn\. [NC]
RewriteCond %{HTTP_REFERER} !yahoo\. [NC]
RewriteCond %{REQUEST_URI} !^/hotlinker\.gif$
RewriteRule \.(gif|jpg|png)$ /hotlinker.gif [NC,L]

The hotlinker.gif image is a custom image that you have created. I suggest using something like “This image was hotlinked from www.yoursite.com” and your logo.
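
If you would rather not serve a replacement image at all, the rule can refuse the request instead. A minimal sketch that drops the search engine exceptions for brevity; add them back as extra RewriteCond lines if you need them:

```apache
RewriteEngine on
# Only act when a referrer is present
RewriteCond %{HTTP_REFERER} .
# Allow requests that come from your own site
RewriteCond %{HTTP_REFERER} !^http://([^.]+\.)?yoursite\. [NC]
# The - means no substitution; F returns 403 Forbidden
RewriteRule \.(gif|jpg|png)$ - [NC,F]
```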

My personal preference is to allow hotlinking but to implement a solution to make use of Google Images and hotlinkers to build links to your site.

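Create beautiful urls with mod_rewrite

The Apache rewrite engine is mainly used to turn dynamic urls such as www.yoursite.com/product.php?id=123 into static and user friendly urls such as www.yoursite.com/product/123. The rule maps the friendly url back to the real script internally, so the visitor only ever sees the friendly version.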


RewriteEngine on
RewriteRule ^product/([^/\.]+)/?$ product.php?id=$1 [L]

Another example, mapping the friendly url www.yoursite.com/cat/product/123/ to the real script at www.yoursite.com/script.php?product=123:


RewriteRule ^cat/([^/]+)/([^/]+)/$ /script.php?$1=$2 [L]
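
Returning to the product example: once the friendly urls work, the old dynamic urls still load the same content. One way to deal with that, shown here as a sketch that assumes product ids are numeric, is to redirect direct requests for the old url and then map the friendly url back to the script.

```apache
RewriteEngine on
# THE_REQUEST is the request line sent by the browser, so this only
# matches when the visitor asked for product.php directly
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /product\.php\?id=([0-9]+)\ HTTP/
# %1 is the id captured by the condition; the trailing ? drops the old query string
RewriteRule ^product\.php$ /product/%1? [R=301,L]
# Internal rewrite from the friendly url to the real script
RewriteRule ^product/([0-9]+)/?$ product.php?id=$1 [L]
```

Checking THE_REQUEST rather than the current url is what prevents a loop, because the internal rewrite in the last line does not change the original request line.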

Removing query strings

Some websites like to link to you by adding a query string. For example, I could link to www.yoursite.com/index.php?source=blogstorm just so you know where your traffic came from. This creates a duplicate content issue for your site, so you really need to redirect back to the same url without the query string. The condition and the rule must be on separate lines, and the ? at the end of the target is what strips the query string:


RewriteCond %{QUERY_STRING} ^source=
RewriteRule (.*) /$1? [R=301,L]
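
A bare ? with nothing after it (for example www.yoursite.com/?) is also treated as a separate url, but QUERY_STRING is empty in that case, so a condition on it cannot tell the two apart. One possible approach, offered as a sketch rather than a tested fix for every server, is to check the raw request line instead:

```apache
RewriteEngine on
# THE_REQUEST holds the raw request line, e.g. "GET /? HTTP/1.1"
# Match a path followed by a ? with nothing after it
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?\ ]*\?\ HTTP/
# The trailing ? in the target removes the empty query string
RewriteRule ^(.*)$ /$1? [R=301,L]
```

The redirected request no longer contains a ?, so the condition fails the second time and there is no loop.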

If you have any questions please post in the comments.

Published in: Best Posts, Design, Search Engine Optimisation

{ 2 trackbacks }

How to stop image hotlinking and bandwidth theft | David Airey :: Graphic and Logo Designer
01.19.08 at 1:14 am
SEO Mod Rewrite | Search Engine Optimisation (SEO) Feed from Position Gold Ltd
09.30.08 at 9:07 am

{ 17 comments… read them below or add one }

1 Richie 15/07/2007 at 5:06 am

You’ve some great tips there, thanks Patrick!

2 Free SMS Andy 16/07/2007 at 3:14 am

Fantastic guide! You’ve made everything seem so simple. As you mentioned, htaccess has always been ‘voodoo’ for me

3 othello 27/07/2007 at 6:05 am

simple, elegant but rocks!

4 dew 27/07/2007 at 1:28 pm

eh, it’s still pretty voodoo to me, even after reading lots of places, i still don’t seem to get it.

I’m trying to take something like
site.com/pages/page1.php
but make it really go to:
site.com/pages/index.php?id=1

and I have about 10 of those pages, each with different names like page2.php and page3.php. so I would need a rule for each one i think, but I just can’t seem to get it to work how I need… any suggestions or help?

5 TE 08/08/2007 at 8:21 pm

Excellent advice - thanks.

One thing I cant fathom tho:

I have http://www.example.eu pointing to the same name servers as my main site at http://www.example.com

How would I write into the htaccess for anything coming to the .eu site to be rewritten to the .com site so as to avoid dupe issues?

Thanks

6 youngus 09/08/2007 at 1:19 pm

thnx for all these. Smile have a little problem with my site. The situation is: I have LoadModule rewrite_module modules/mod_rewrite.so
in my httpd.conf,
the DocumentRoot is “/var/www/html”, and my site is under the html directory, that means /var/www/html/mysite/,
I put .htaccess file these directives:
php_value error_reporting 7
RewriteEngine On
RewriteBase /
Options +FollowSymlinks
RewriteRule ^([0-9]+)/(.*)?$ /index.php?module=item&action=show_item_decr&item_id=$1

but it seems that it doesn’t work ! it gives me a 404 page, and even if I put one of your directives it doesn’t work , can you help please ! is there anything I have to add ! thnx !

7 Benjamin 30/08/2007 at 3:48 am

When I use
RewriteCond %{QUERY_STRING} ^source= RewriteRule (.*) /$1? [R=301,L]

I get a 500 server error.

Am I not using it right?

8 Wouter 05/12/2007 at 2:55 am

Thanks for the useful tips!

9 David Airey 19/01/2008 at 1:11 am

Cheers Patrick. I’ll mention this article on my latest blog post.

10 Dave 26/01/2008 at 9:44 am

dew: eh, it’s still pretty voodoo to me, even after reading lots of places, i still don’t seem to get it.

Apache mod_rewrite IS voodoo. That’s why it seems like voodoo to you.

It even says so in the official documentation. (Although they removed that quote from the 2.0 docs, mod_rewrite is still the same, and still voodoo.)

11 Josh 21/04/2008 at 4:15 am

Very good tips. I spent one whole morning on .htaccess and got it done only when I found this article.

12 AskApache 25/04/2008 at 12:50 am

You might enjoy checking out the .htaccess guides here too.

Nice and simple tutorial, thanks!

13 Jon 22/05/2008 at 10:53 am

Thanks for the tutorial, Patrick

I used it to redirect an old static site to a Wordpress site. I discovered some extra bits along the way that might be useful -

- If you’re using WP permalinks, WP adds some lines to htaccess. make sure you put additional stuff outside the code block that WP adds. Otherwise it may get overwritten next time WP writes to htaccess.

- Put your redirects before the code that WP adds, so that they are processed before control is passed to WP

- If there are spaces in the old URLs, put quotes around them. Eg.
redirect 301 "/old page.htm" http://www.mysite.com/newpage

(NB. there is a WP plugin called “Redirection” but I wanted to learn about htaccess, and keep my list of plugins as short as possible!)

Thanks for a great site, good to read a UK perspective on SEO stuff!

Cheers, Jon

14 Ian Wortley 14/07/2008 at 10:20 pm

Wow… that’s a great article. You have no idea how many different versions of rewrites I have crawled through…. none of them worked…. until yours. I am more than grateful!

I have a question though. Windows live has indexed http://www.mysite.ie/?

How or why I don’t know but I’d like to get rid of it. I have not got my head around regex yet and can’t work it out… I ended up with a loop when I tried to modify your QUERY_STRING source= (which was absolutely ideal for getting rid of my index.php?url=home problem.. Thanks again).

While I owe you big time already I’d appreciate any help you could offer with this… be it a solution or where to look for one… I’m stumped!

You have my site address and email address should you need them.

Thanks again for your brilliant page.
Ian.

15 Patrick Altoft 15/07/2008 at 8:59 am

I suggest just redirecting it in your domain registrar control panel rather than using htaccess.

16 ganesan 28/07/2008 at 1:33 pm

hi mr Patrick Altoft,
Thank you i am new to htaccess. Very help ful for your wonderful tips

17 Craig Mullins 05/08/2008 at 12:50 am

Great info. One issue with the Canonicalization script. The url’s and directory’s should end with a trailing slash.

http://www.yoursite.com/
http://www.yoursite.com/directory/

It will save an extra query on your web server and make it just a tad faster.

Let me know if you modify that, so I can add it to my htaccess file.
