O'Reilly

Search Engine Friendly URLs with .htaccess and mod_rewrite

By on  

I was recently developing a PHP website that used mod_rewrite to make its URLs search engine friendly. Websites have been using mod_rewrite and .htaccess strategies to do this for years now and there are a 100 ways to accomplish the task. One issue that was occurring with this site was URLs without the a trailing slash would work, but URLs with a trailing slash would break (trigger a 404 error):

//works
http://mydomain.com/my-page
//breaks
http://mydomain.com/my-page/

The original .htaccess source was:

#adds ".php" to a URL that isn't a directory or a file
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
RewriteCond %{REQUEST_fileNAME} !-d
RewriteCond %{REQUEST_fileNAME} !-f
RewriteRule (.*) $1.php [L]

The solution was simple:  an extra statement to accommodate for the trailing slash:

#removes trailing slash if not a directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]

#adds ".php" to a URL that isn't a directory or a file
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
RewriteCond %{REQUEST_fileNAME} !-d
RewriteCond %{REQUEST_fileNAME} !-f
RewriteRule (.*) $1.php [L]

This method may be a bit inefficient as there are two redirects but it does the job.  Do you have a better solution?  If so, share it!

Track.js Error Reporting

Recent Features

Incredible Demos

  • MooTools Star Ratings with MooStarRating

    I've said it over and over but I'll say it again:  JavaScript's main role in web applications is to enhance otherwise boring, static functionality provided by the browser.  One perfect example of this is the Javascript/AJAX-powered star rating systems that have become popular over the...

  • Create a Simple Slideshow Using MooTools

    One excellent way to add dynamism to any website is to implement a slideshow featuring images or sliding content. Of course there are numerous slideshow plugins available but many of them can be overkill if you want to do simple slideshow without controls or events....

Discussion

  1. Not necessarily better but I use PHP to do the same thing after initialising and starting the search through pages in a DB:

    $uri = trim($uri, '/');
  2. PS my htaccess files route every request through a single php file rather than having them scattered – which makes it easier to maintain consistent flow through the app, which is possibly the main reason I don’t redirect via htaccess:


    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-l
    RewriteRule ^(.*)$ index.php?$1 [L,QSA]

  3. .htaccess redirection is quite powerful… why would you want to add a .php extension though? Isn’t just the page name ie: /about better? Adding .php would for instance break all SEO attained if your site would ever move to another platform (not that it would most likely, but still useful when developing client sites)

    • seelts

      yes, htaccess is powerfull enough, but php is much more powerfull in it’s logical statements.
      it is not necessarily to add .php extension to URLs. you can still use ie “example.com/about” or “example.com/about/” (slash in the end) or any other kind of URLs while processing them in PHP.
      I think andrew ment just the same.
      The only thing to be done – is to redirect ALL requests to some your php file, which will only parse the URI and “include”/”require” desired page.

  4. RewriteRule ^([-_a-zA-Z0-9]+)/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?$ /$1.php?data1=$2&data2=$3&data3=$4&data4=$5&data5=$6&data6=$7 [L,QSA]

  5. I believe the reason for the 404 is your first RewriteCond statement, actually.
    RewriteCond %{REQUEST_URI} !(\.[^./]+)$

    Tells it to match anything not ending with a trailing slash. Where you have 3 conditional statements, it must pass all three, i believe.

    I use quite a similar rewrite as the one you’ve given example of, and I’ve never had an issue with trailing slashes causing 404’s. Is there any particular reason you need to match the REQUEST_URI?

    I’ve found that the REQUEST_FILENAME is usually sufficient enough.

    ie;
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^content/?(.+)?$ /index.php?request=$1 [L]
    ...

  6. I use this code snippet on all my sites and it does the job. It doesnt matter if url ends or doesnt with /.It alows multiple subcategories like http://www.sitename.com/sports/football/national/. In index.php you just have to explode $_GET[‘url’] with / and than you have all parts. Also automaticly transfers visitors from non-www to www version of site.


    RewriteEngine on
    RewriteBase /
    RewriteCond %{HTTP_HOST} ^sitename\.com
    RewriteRule (.*) http://www.sitename.com/$1 [R=301,L]

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d

    RewriteRule ^(.+)(/?)$ index.php?url=$1 [NE,L]

  7. Binyamin

    Matt Cutts, head of the webspam team at Google, prefer trailing slash, http://www.mattcutts.com/blog/seo-advice-url-canonicalization/.

  8. Binyamin

    .htaccess redirecting to trailing slash urls

    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_URI} !(.*)/$
    RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [L,R=301]

  9. iams

    I was doing some work with mod_rewrite this week and found the No Case rule to be extremely helpful

    [NC,L]

    the issue I originally had was I had two different rules but they seemed to cancel each other out

    index.php?category=$1$&op=$2
    index.php?category=$1&task=$2

    so i ended up having to make 2 rulls where i manually set the category so they would both work, but then if the category started with a capital letter the url wouldn’t work, luckily NC fixed that problem

  10. Thanks again! it fixed my problem!

  11. Bruce Lim

    RewriteCond %{REQUEST_URI} ^/([^/]+)/([^/\.]+)(/$|$)
    RewriteCond %{DOCUMENT_ROOT}/%1 -d
    RewriteCond %{DOCUMENT_ROOT}/%1/%1.html -f
    RewriteRule ([^/]+)/(.+) index.php?/$1/$2

    I redirect all my traffic back to index.php and use dispatchers/controllers to handle the model and view.

  12. crivion

    What’s wrong with adding an extra rule which will include the trailing slash?
    RewriteRule ^(.*)/$ index.php?$1 [L,QSA]

  13. Here is a more simpler code to make URLs SEO-friendly in about 2 lines of code.

    This is mainly for single URLs that can be changed manually.

    In .htaccess, simply put:

    RewriteEngine On
    RewriteRule ^index.html$ index.php

    Of course, you usually never see the extension anyway when typing in mydomain.com. But typing in mydomain.com/index.php will lead you to the same place as mydomain.com/index.html.

    So another URL that is visible:
    RewriteRule ^register.html$ register.php

    mydomain.com/register.php will take me to the php file just as register.html takes me to the same place.

    Now if you are outputting a page using javascript or have a CMS that makes pages on the fly for you, they often show up as SEO unfriendly, so the idea is to make them friendly. For example, my privacy page is page.php?p=1– it is the first page I generated.

    RewriteRule ^privacy\.html$ /page.php?p=1 [L]

    Now I can type in mydomain.com/privacy.html and that page will come up.

    Just make sure you don’t already have an existing privacy.php or privacy.html file.

    And after doing all this, just change your links to be directed towards using the .html file instead of the .php file. Hope this simplifies some

  14. is it possible to change the folder name to another

  15. sagar

    Hi, i am using the following method.
    eg : http://www.{mysite}.com/?page=main&a=id&v=2
    where I use iframe tag and the frame will the load as src=main.php?id=2 where 2 is the content id.
    how can i edit the main url from http://www.{mysite}.com/?page=main&a=id&v=2 to http://www.{mysite}.com/main/id/2

  16. chairul anwar

    can you help me.
    i have .htaccess like

    RewriteRule ^index.html$ /index.php [QSA]
    RewriteRule ^pdf/.* /a-single.php [QSA]
    RewriteRule ^ebook/.* /a-single-e.php [QSA]
    

    output:

    /pdf/post-title-id.pdf
    /ebook/post-title-id.pdf
    and i want to change those to 
    post-title-id.pdf
    post-title-id.pdf
    

    help me please

Wrap your code in <pre class="{language}"></pre> tags, link to a GitHub gist, JSFiddle fiddle, or CodePen pen to embed!

Recently on David Walsh Blog

  • Making Data Management Easy with Transpose

    One problem with data collection is that we all want capture to be quick and we all want data to be organized -- but there's seldom a utility that allows both. Evernote allows note-taking but it can lose its structure...and no one wants...

  • Serve a Directory via Python

    Sometimes I'm working with a test HTML file and some JavaScript but need to work off of a served space.  In that case, I sometimes need to swap out folders within MAMP Stack which leads to a maintenance nightmare.  Bleh. I recently found out that you can...

  • OSCON Portland:  Conference  Discount!

    O'Reilly puts on the best web industry conferences in the world.  These conferences include Fluent Conference, Velocity Conference, and the upcoming OSCON in Portland, Oregon from July 20-24.  Open Source Convention (OSCON) is a conference that focuses specifically on open source developers and the tools and possibilities...

  • Follow Redirects with cURL

    I love playing around with cURL. There's something about loading websites via command line that makes me feel like some type of smug hacker, just like tweeting from command line does. I recently cURL'd the Google homepage and saw the following: I found it weird that Google...

  • Developers Have WordPress, Amateurs Have Squarespace, Professional Designers Have the NEW Webydo!

    Web design platforms have traditionally come in one of two varieties. There are the solutions like WordPress and Drupal that are incredibly powerful, but an understanding of web development and coding is required to be able to use those platforms effectively. On the other side of the...