Search Engine Friendly URLs with .htaccess and mod_rewrite


I was recently developing a PHP website that used mod_rewrite to make its URLs search engine friendly. Websites have been using mod_rewrite and .htaccess strategies to do this for years now, and there are a hundred ways to accomplish the task. One issue with this site was that URLs without a trailing slash would work, but URLs with a trailing slash would break (trigger a 404 error):

//works
http://mydomain.com/my-page
//breaks
http://mydomain.com/my-page/

The original .htaccess source was:

#adds ".php" to a URL that isn't a directory or a file
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) $1.php [L]
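
To see why the trailing-slash form fails, it can help to trace what the rule captures for each request. The annotated copy below is only a sketch of that reasoning. It assumes the .htaccess file sits in the document root and that my-page.php is the file behind /my-page; neither detail is stated above.

```apache
# Annotated copy of the rule above (the trace assumes .htaccess is in the document root)
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) $1.php [L]

# /my-page   -> (.*) captures "my-page"   -> rewritten to my-page.php   (assumed to exist)
# /my-page/  -> (.*) captures "my-page/"  -> rewritten to my-page/.php  (no such file: 404)
```

In both cases all three conditions pass, so the rule fires; the only difference is that the captured trailing slash ends up in front of the appended ".php".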

The solution was simple: an extra rule to accommodate the trailing slash:

#removes trailing slash if not a directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]

#adds ".php" to a URL that isn't a directory or a file
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) $1.php [L]

This method may be a bit inefficient, as a request with a trailing slash is first redirected (301) and then rewritten internally, but it does the job. Do you have a better solution? If so, share it!
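
One possible single-pass alternative, offered as a sketch rather than a tested drop-in: let the rewrite itself tolerate an optional trailing slash, so no external redirect is needed. It assumes the .htaccess file lives in the document root and that %{DOCUMENT_ROOT} points at the real filesystem path, which is not true on every host.

```apache
RewriteEngine On

# Map /my-page and /my-page/ to my-page.php in one internal rewrite.
# Skip real directories and real files.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# Only rewrite when the matching .php file actually exists;
# $1 is the capture from the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
# (.+?) is lazy, so an optional trailing slash stays out of the capture.
RewriteRule ^(.+?)/?$ $1.php [L]
```

The trade-off is that both URL forms now serve the same page without a redirect, so search engines may see two addresses for one document. The redirect-based version above avoids that by sending everything to the slashless URL.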


Discussion

  1. Not necessarily better but I use PHP to do the same thing after initialising and starting the search through pages in a DB:

    $uri = trim($uri, '/');
  2. PS my htaccess files route every request through a single php file rather than having them scattered – which makes it easier to maintain consistent flow through the app, which is possibly the main reason I don’t redirect via htaccess:

    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-l
    RewriteRule ^(.*)$ index.php?$1 [L,QSA]
    
  3. .htaccess redirection is quite powerful… why would you want to add a .php extension though? Isn’t just the page name ie: /about better? Adding .php would for instance break all SEO attained if your site would ever move to another platform (not that it would most likely, but still useful when developing client sites)

    • seelts

      yes, htaccess is powerful enough, but php is much more powerful in its logical statements.
      it is not necessary to add a .php extension to URLs. you can still use e.g. “example.com/about” or “example.com/about/” (slash at the end) or any other kind of URL while processing them in PHP.
      I think andrew meant just the same.
      The only thing to be done is to redirect ALL requests to a single php file of yours, which will only parse the URI and “include”/“require” the desired page.

  4. RewriteRule ^([-_a-zA-Z0-9]+)/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?([-_a-zA-Z0-9]+)?/?$ /$1.php?data1=$2&data2=$3&data3=$4&data4=$5&data5=$6&data6=$7 [L,QSA]

  5. I believe the reason for the 404 is your first RewriteCond statement, actually.
    RewriteCond %{REQUEST_URI} !(\.[^./]+)$

    Tells it to match anything not ending with a trailing slash. Where you have 3 conditional statements, it must pass all three, i believe.

    I use quite a similar rewrite as the one you’ve given example of, and I’ve never had an issue with trailing slashes causing 404’s. Is there any particular reason you need to match the REQUEST_URI?

    I’ve found that the REQUEST_FILENAME is usually sufficient enough.

    ie;
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^content/?(.+)?$ /index.php?request=$1 [L]
    ...

  6. I use this code snippet on all my sites and it does the job. It doesn’t matter whether the URL ends with / or not. It allows multiple subcategories like http://www.sitename.com/sports/football/national/. In index.php you just have to explode $_GET['url'] with / and then you have all the parts. It also automatically transfers visitors from the non-www to the www version of the site.

    RewriteEngine on
    RewriteBase /
    RewriteCond %{HTTP_HOST} ^sitename\.com
    RewriteRule (.*) http://www.sitename.com/$1 [R=301,L]
    
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    
    RewriteRule ^(.+)(/?)$ index.php?url=$1 [NE,L]
    
  7. Binyamin

    Matt Cutts, head of the webspam team at Google, prefers a trailing slash: http://www.mattcutts.com/blog/seo-advice-url-canonicalization/.

  8. Binyamin

    .htaccess redirecting to trailing slash urls

        RewriteBase /
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_URI} !(.*)/$
        RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [L,R=301]
    
  9. iams

    I was doing some work with mod_rewrite this week and found the No Case rule to be extremely helpful

    [NC,L]

    the issue I originally had was I had two different rules but they seemed to cancel each other out

    index.php?category=$1&op=$2
    index.php?category=$1&task=$2

    so I ended up having to make 2 rules where I manually set the category so they would both work, but then if the category started with a capital letter the URL wouldn’t work. Luckily NC fixed that problem

  10. Thanks again! it fixed my problem!

  11. Bruce Lim

    RewriteCond %{REQUEST_URI} ^/([^/]+)/([^/\.]+)(/$|$)
    RewriteCond %{DOCUMENT_ROOT}/%1 -d
    RewriteCond %{DOCUMENT_ROOT}/%1/%1.html -f
    RewriteRule ([^/]+)/(.+) index.php?/$1/$2

    I redirect all my traffic back to index.php and use dispatchers/controllers to handle the model and view.

  12. crivion

    What’s wrong with adding an extra rule which will include the trailing slash?
    RewriteRule ^(.*)/$ index.php?$1 [L,QSA]

  13. Here is a simpler way to make URLs SEO-friendly in about 2 lines of code.

    This is mainly for single URLs that can be changed manually.

    In .htaccess, simply put:

    RewriteEngine On
    RewriteRule ^index\.html$ index.php

    Of course, you usually never see the extension anyway when typing in mydomain.com. But typing in mydomain.com/index.php will lead you to the same place as mydomain.com/index.html.

    So another URL that is visible:
    RewriteRule ^register\.html$ register.php

    mydomain.com/register.php will take me to the php file just as register.html takes me to the same place.

    Now if you are outputting a page using javascript or have a CMS that makes pages on the fly for you, they often show up as SEO unfriendly, so the idea is to make them friendly. For example, my privacy page is page.php?p=1 (it is the first page I generated).

    RewriteRule ^privacy\.html$ /page.php?p=1 [L]

    Now I can type in mydomain.com/privacy.html and that page will come up.

    Just make sure you don’t already have an existing privacy.php or privacy.html file.

    And after doing all this, just change your links to be directed towards using the .html file instead of the .php file. Hope this simplifies some

  14. is it possible to change the folder name to another

  15. sagar

    Hi, i am using the following method.
    eg : http://www.{mysite}.com/?page=main&a=id&v=2
    where I use an iframe tag and the frame will then load as src=main.php?id=2 where 2 is the content id.
    how can i edit the main url from http://www.{mysite}.com/?page=main&a=id&v=2 to http://www.{mysite}.com/main/id/2

  16. chairul anwar

    can you help me.
    i have .htaccess like

    RewriteRule ^index.html$ /index.php [QSA]
    RewriteRule ^pdf/.* /a-single.php [QSA]
    RewriteRule ^ebook/.* /a-single-e.php [QSA]
    

    output:

    /pdf/post-title-id.pdf
    /ebook/post-title-id.pdf
    and i want to change those to 
    post-title-id.pdf
    post-title-id.pdf
    

    help me please

  17.  # This tag ensures the rewrite module is loaded
    
      # enable the rewrite engine
      RewriteEngine On
      # Set your root directory
      RewriteBase /v2
    
      # remove the .html extension
      RewriteCond %{THE_REQUEST} ^GET\ (.*)\.html\ HTTP
      RewriteRule (.*)\.html$ $1 [R=301]
    
      # remove index and reference the directory
      RewriteRule (.*)/index$ $1/ [R=301]
    
      # remove trailing slash if not a directory
      RewriteCond %{REQUEST_FILENAME} !-d
      RewriteCond %{REQUEST_URI} /$
      RewriteRule (.*)/ $1 [R=301]
    
      # forward request to html file, **but don't redirect (bot friendly)**
      RewriteCond %{REQUEST_FILENAME}.html -f
      RewriteCond %{REQUEST_URI} !/$
      RewriteRule (.*) $1\.html [L]
    

    How do I get trailing slash in my rewrite?

  18. Mangaldeep

    .htaccess is not working on the GoDaddy Linux hosting server. Please suggest something, I have already tried a lot.
