htaccess 301 redirect and Canonical Issues
You are upgrading your web site, and part of the upgrade involves moving and renaming particular files.
Search engines have indexed your entire site and pages you're going to move or rename rank well. By altering these files, you run the risk of losing a lot of traffic and leaving visitors to your site who follow a search engine link with the dreaded "Error 404 - File not found".
A 301 redirect is the best way to go and I go into some detail on how to implement one in this tutorial, but first let's take a look at a couple of other strategies I often see mentioned around the web to get around the problem and why you shouldn't use them.
Strategy 1 - Custom Error Page
You could create a custom error page. The problem with this solution:
a) You will lose ranking for the page, as the file will appear to be non-existent the next time it's requested by the search engine spider. It could be some time before the page reappears in its new location or under its new name, and because you'll lose the power of inbound links from other sites to the old page, it may not rank as well.
b) Your web site visitors may be frustrated by the fact that they then have to dig through your site to find the desired information.
Strategy 2 - Meta Refresh
A meta refresh can be implemented in the <head> section of a blank page that carries the old file name; the page then automatically redirects visitors to the new page. Example:
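A minimal sketch of the head section of such a page, assuming the new page lives at the placeholder address http://www.you.com/new.htm and that a zero-second delay is wanted (both are illustrative choices, not values taken from a real site):

```html
<head>
<!-- Send the visitor to the new page immediately (0-second delay) -->
<meta http-equiv="refresh" content="0; url=http://www.you.com/new.htm">
<!-- ROBOTS statement: ask search engines not to index this placeholder page -->
<meta name="ROBOTS" content="NOINDEX">
</head>
```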
Warning: This is a technique often used by spammers to trick search engines and it should be avoided, unless the page is in a section of your site that isn't spidered.
What the search engine spammers do is to create a page that is optimized for certain keywords and phrases - it usually has no real content. The page is then picked up by some search engines, but when a visitor clicks on the search engine entry, they are redirected to another site, often unrelated.
It's a despicable trick, but thankfully most search engines have filters to detect this. Using this form of SE deception will see a site eventually banned or penalized by major players such as Google.
The "ROBOTS" statement in the code example above tells search engines to ignore this page, a safeguard against copping a slap from the engine.
Aside from the perceived spam issue, this approach also has the disadvantages of using a custom error page.
The right way - an .htaccess 301 Redirect
A 301 redirect is the most efficient and spider/visitor friendly strategy around for web sites that are hosted on servers running Apache (check with your hosting service if you aren't sure).
It's not that hard to implement and it should preserve your search engine rankings for that particular page. If you *have* to change file names or move pages around, it's the safest option.
A 301 redirect is implemented in your .htaccess file.
What is a .htaccess file?
When a visitor/spider requests a web page via any means, your web server checks for a .htaccess file. The .htaccess file contains specific instructions for certain requests, including security, redirection issues and how to handle certain errors.
What is a 301 redirect?
The code "301" is interpreted as "moved permanently". In the redirect line, the code is followed by the path of the missing or renamed page, then a space, then the new location or file name.
Implementing a 301 redirect for static pages
First of all, you'll need to download the .htaccess file from the root directory where all your web pages are stored. If there is no .htaccess file there, you can create one with Notepad or a similar application. Make sure when you name the file that you remember to put the "." at the beginning of the file name. This file has no extension.
If there is a .htaccess file already in existence with lines of code present, be very careful not to change any existing line unless you are familiar with the functions of the file.
Scroll down past all the existing code, leave a line space, then create a new line that follows this example:
redirect 301 /old/old.htm http://www.you.com/new.htm
It's as easy as that. Save the file, upload it back to your web server and test it by typing in the old address of the page you've changed. You should be instantly and seamlessly transported to the new location.
Notes: Be sure not to add "http://www" to the first part of the statement - just put the path from the top level of your site to the page. Also ensure that you leave a single space between these elements:
redirect 301 (the instruction that the page has moved)
/old/old.htm (the original folder path and file name)
http://www.you.com/new.htm (new path and file name)
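The same directive can also cover a whole folder, because redirect matches on the start of the path and appends whatever follows it to the destination. A minimal sketch, assuming hypothetical folders /old/ and /new/ and that the file names inside have not changed:

```apache
# Hypothetical folder names: everything under /old/ now lives under /new/.
# A request for /old/page.htm is sent to http://www.you.com/new/page.htm
redirect 301 /old/ http://www.you.com/new/
```

If a single file in that folder needs a different destination, put its own redirect 301 line above the folder line so that it is matched first.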
Implementing a 301 redirect for dynamic pages
A dynamic page is one generated by a database-driven application, such as blog or forum software. The file name is followed by a query string, looking something like this:
http://www.example.com/page.php?id=13
Where a query string is used, the 301 redirect solution for static pages above will not work; you'll need to use a rewrite solution. Using the page.php?id=13 example, here's what you'll need to use in your htaccess file:
RewriteEngine on
RewriteCond %{QUERY_STRING} ^id=13$
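RewriteRule ^page\.php$ http://www.example.com/newname.htm? [L,R=301]
In the example above, replace id=13 with the query string of the page you wish to redirect and page.php with the name of your file prior to the query string. Because the rule sits in a .htaccess file, the pattern is matched without a leading "/". The "?" at the end of the destination stops the old query string from being appended to the new address.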
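Where many ids need redirecting, one rule can capture the number instead of listing each id separately. This is only a sketch: it assumes the new pages follow a hypothetical naming scheme of article-13.htm, article-14.htm and so on, which will differ on your own site.

```apache
RewriteEngine on
# Capture the numeric id from the query string, e.g. id=13
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
# %1 is the value captured by the RewriteCond above; the trailing "?"
# drops the old query string from the new address
RewriteRule ^page\.php$ http://www.example.com/article-%1.htm? [L,R=301]
```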
A more powerful set of directives for manipulating URLs is contained in the Apache mod_rewrite module, which is especially useful when changing domain names and/or folder names containing large numbers of files. Read our basic tutorial on the Apache mod_rewrite module.
301 redirect for file names with spaces
If you wish to redirect a file name that contains a space, for example "old page.htm", a standard redirect 301 line won't work. Putting quotes around the original file path makes it function correctly.
Example:
redirect 301 "/old page.htm" http://www.example.com/newpage.htm
Redirecting entire sites with 301
The 301 directive is quite powerful. You can redirect not just single files but entire sites, for example when changing domain names:
redirect 301 / http://www.you.com/
The first "/" indicates that everything from the top level of the site down should be redirected. As long as you are using the same paths and filenames, then this option is a very simple way to perform site redirection in the situation where you have only changed your domain name.
If the site redirection doesn't work for you, check to ensure you have the trailing "/" on the destination URL. You may also like to try some of the other suggestions in our basic tutorial on the Apache mod_rewrite module.
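One case where the plain line can fail is when the old and new domains are served from the same folder: the new domain reads the same .htaccess file and redirects to itself in a loop. A sketch of a rewrite-based alternative that only acts on requests for the old name, assuming a hypothetical old domain of olddomain.com:

```apache
RewriteEngine on
# Only act when the request came in on the old domain (with or without www)
RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com [NC]
# Keep the same path and file name on the new domain
RewriteRule ^(.*)$ http://www.you.com/$1 [L,R=301]
```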
Canonical issues: www vs. non-www
There's been much talk about canonical issues and search engines. A canonical issue arises when both the www and non-www versions of your pages are listed in a search engine, which is said to possibly trigger a duplicate content penalty and/or split page rank. If this is of concern to you, you may wish to use the following, but be aware that you may suffer a further loss of traffic while the engines sort out what's what. This example directs all non-www traffic to www. Add the following to your .htaccess file:
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^yoursite\.com [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [L,R=301]
Ensure that all your links to folders always end in a trailing / if there is no filename after that link.
FrontPage users: in addition to the above, you'll also need to change the .htaccess files in:
_vti_bin
_vti_bin/_vti_adm
_vti_bin/_vti_aut
Replace "Options None" with "Options +FollowSymLinks"
Those folders are part of your FrontPage extensions on the server, so you'll need to gain access via FTP.
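If you would rather standardize on the non-www version, the www example can be reversed. This is a sketch of the mirror-image rule, using the same placeholder domain yoursite.com:

```apache
Options +FollowSymLinks
RewriteEngine on
# Match requests that arrive on the www host name
RewriteCond %{HTTP_HOST} ^www\.yoursite\.com [NC]
# Send them to the same path on the bare domain
RewriteRule ^(.*)$ http://yoursite.com/$1 [L,R=301]
```

Use one direction only; having both sets of rules in the same file would bounce visitors back and forth.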