Generate Search Engine Friendly URLs with PHP Functions

Generating search engine friendly (SEF) URLs can dramatically improve your search engine results. There's a big difference between "/post.php?id=2382" and "/great-php-functions/". Having search engine friendly URLs also gives the user an idea of what will be on the page they are clicking on if the link text isn't adequate.

I've created a pair of PHP functions to generate search engine friendly URLs for the CMSs I build for my customers. The idea is fairly simple: I take the user-created page title and feed it to a scrubbing function to:

  • remove all punctuation
  • switch the URL to lowercase
  • replace spaces with a given delimiter (in this case, a dash)
  • remove duplicate words
  • remove words that aren't helpful to SEO

The Code

/* takes the input, scrubs bad characters */
function generate_seo_link($input, $replace = '-', $remove_words = true, $words_array = array()) {
	//make it lowercase, remove punctuation, remove multiple/leading/ending spaces
	//(preg_replace() is used for both steps; ereg_replace() was removed in PHP 7)
	$return = trim(preg_replace('/\s+/', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($input))));

	//remove words, if not helpful to seo
	//pass the caller's word list through to remove_words()
	if($remove_words) { $return = remove_words($return, $replace, $words_array); }

	//convert the spaces to whatever the user wants
	//usually a dash or underscore..
	//...then return the value.
	return str_replace(' ', $replace, $return);
}

/* takes an input, scrubs unnecessary words */
function remove_words($input,$replace,$words_array = array(),$unique_words = true)
{
	//separate all words based on spaces
	$input_array = explode(' ',$input);

	//create the return array
	$return = array();

	//loops through words, remove bad words, keep good ones
	foreach($input_array as $word)
	{
		//if it's a word we should add...
		if(!in_array($word,$words_array) && ($unique_words ? !in_array($word,$return) : true))
		{
			$return[] = $word;
		}
	}

	//return good words joined by the delimiter
	return implode($replace,$return);
}

The Explanation

generate_seo_link() accepts four arguments:

  1. $input - string - will be SEO'd, in my case, the page title
  2. $replace - string - the word separator, in most cases a dash or underscore
  3. $remove_words - boolean - remove specific, non-helpful SEO words
  4. $words_array - array - an array of words that should be removed from every URL because they aren't helpful to SEO

Example Results

$bad_words = array('a','and','the','an','it','is','with','can','of','why','not');
echo generate_seo_link('Another day and a half of PHP meetings', '-', true, $bad_words);
//displays :: another-day-half-php-meetings

echo generate_seo_link('CSS again?  Why not just PHP?', '-', true, $bad_words);
//displays :: css-again-just-php

echo generate_seo_link('A penny saved is a penny earned.', '-', true, $bad_words);
//displays :: penny-saved-earned

Do yourself a favor -- make your dynamic pages more search engine friendly with clean URLs!

Discussion

  1. Ben

    Hi,

    Many thanks for the post, very useful. I'm implementing the code but finding that it does not insert the delimiter (e.g. "-"). For example 2, I get “cssagainwhynotjustphp” returned.

    I can't spot why! Any thoughts?

    Many thanks, Ben.

  2. Thanks for posting Ben. I found the problem. WordPress stripped a slash out on me. Change:

    $return = trim(preg_replace('/ +/',' ',preg_replace('/[^a-zA-Z0-9s]/','',strtolower($input))));
    

    to:

    $return = trim(preg_replace('/ +/',' ',preg_replace('/[^a-zA-Z0-9\s]/','',strtolower($input))));
    
  3. Ben

    Superb, thanks for the quick response David, much appreciated.

    Now working perfectly!

    All the best, Ben.

  4. Hi David,
    And thanks for all your stuff.

    I have a little question: I tried this function and it works well, but how can I replace accented characters with unaccented ones?

    For example: Génépi -> genepi?

    Right now I get gnpi…

    Thanks in advance

  5. Hi David,

    I am a newbie to this and have been reading my tail off trying to get up to speed since I know that it’s hurting my SERPs. How would I change this url/mess in my custom CMS (functions)?

    http://www.domain.com/articles.php?art_id=885

    Please help! Thanks in advance…

  6. Maïs: You’ll want to create another function that does that. You’ll send it the string plus an array of keys you want to replace with values. For example:

    $result = replace_accents('Génépi', array('é' => 'e'));
    
    function replace_accents($input, $map) {
        // for every key => value in $map, replace the key with the value
        return strtr($input, $map);
    }
    
  7. Josh

    Or put this at the top of the generate_seo_link function:

    $return = htmlentities($input, ENT_COMPAT, 'utf-8');
    $return = preg_replace("`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i", "\\1", $return);
    

    And don’t forget to replace the original first line with:

    $return = trim(preg_replace('/\s+/', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($return))));
    

    Thanks for this function, it’s very very useful!

  8. Slawek

    Just as a tip… I’m using Full-Text Stopwords from MySQL as a $bad_words array…
    This helps a lot as it removes most of the words that google wouldn’t index anyway…
    http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html

  9. Thanks Everybody, everything is working now, Great !

  10. kram

    Hi David, quick question. I understand completely that "There's a big difference between '/post.php?id=2382' and '/great-php-functions/'". What function or method in PHP tells our webserver something like

    /great-php-functions/ => post.php?id=2382

    i.e. if I just requested
    http://www.testdummyurl.com/great-php-functions
    with nothing special, my webserver would give a 404,
    so what do we need to do to let our webserver know that this is actually
    http://www.testdummyurl.com/post.php?id=2382 ?

    thanks,
    Kram

  11. Great! That’s exactly what I was looking for! You’ve saved my day. :-) Thanks a lot Ben!

  12. Lol I meant David, not Ben… I shouldn’t do two things at once ;-) Sry

  13. Specs

    I’m also a little in the dark as to how you relate your SEO friendly URL to the actual URL (as Kram said)?

  14. Rich

    A little late, but I just noticed that when I have actual text that has a “-” that is needed, for example “stir-fry”, the “-” is removed when running the function. I got around this before by adding an extra step:

    $this->FixDashed = str_replace("-", " ", $_POST['SOMEPOST']);
    $this->SEOName = generate_seo_link($this->FixDashed, '-', false, $bad_words);
    

    Dave or anyone else, I am wondering if you knew of a way to get around this in the function itself instead of adding the extra step to basically trick it into doing this.
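
One way to handle hyphenated words such as “stir-fry” inside the scrubbing step itself is to turn hyphens into spaces before the punctuation is stripped. This is only a sketch; the function name scrub_keep_hyphen_words() is hypothetical and is not part of the original code.

```php
<?php
// Hypothetical scrubbing step that treats hyphens as word breaks.
function scrub_keep_hyphen_words($input)
{
    // turn hyphens into spaces first so "stir-fry" stays two words
    $spaced = str_replace('-', ' ', strtolower($input));

    // strip the remaining punctuation, then collapse runs of whitespace
    $clean = preg_replace('/[^a-z0-9\s]/', '', $spaced);
    return trim(preg_replace('/\s+/', ' ', $clean));
}

// the spaces are converted to the delimiter afterwards, as in the article
echo str_replace(' ', '-', scrub_keep_hyphen_words('Chicken stir-fry!'));
```

The same str_replace() call could presumably go at the top of generate_seo_link(), which would make the extra step in the calling code unnecessary.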

  15. For unique words you could use array_unique (a php function) :-)

  16. For filtering unique words you could use array_unique (a php function) :-)
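
The array_unique() suggestion could look roughly like the sketch below, with array_diff() taking over the stop-word check. The name remove_words_unique() is hypothetical, and the sketch assumes the input has already been lowercased and reduced to single spaces.

```php
<?php
// Hypothetical variant of remove_words() built on array_diff() and array_unique().
function remove_words_unique($input, $replace, $words_array = array())
{
    // split on single spaces, as the article's version does
    $words = explode(' ', $input);

    // drop every word that appears in the stop list
    $words = array_diff($words, $words_array);

    // keep only the first occurrence of each remaining word
    $words = array_unique($words);

    // join the surviving words with the delimiter
    return implode($replace, $words);
}

echo remove_words_unique('a penny saved is a penny earned', '-', array('a', 'is'));
```

Both array functions preserve the original order of the words, so the resulting link reads the same as with the loop-based version.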

  17. Marc

    How does the web server know that this is the url it should go to?

  18. Totally digging this article, but like the above comment by @Marc: “How does the web server know that this is the URL it should go to?”

    Also, how does this relate to the mod_rewrite and .htaccess? Do we also need to use that as well as this PHP function?

    Please let me know!

  19. Still, same question as above. Can I get a response, please? :)

  20. aditya

    I'm a beginner. Can you please tell me how to use this?

  21. Johnny Hauser

    For those of you wondering how to actually make this work, it’s necessary to have yoursitename/index.php or whatevername.php you use. As long as the php file is in the url, the following /stuff/here is available to you so that your php can delimit it and use it as variables.

    I’m new to this as well, so I’m not positive on removing the whatevername.php part from the url. I believe you do that through your .htaccess file. It’s a simple sort of search and replace that will apply to everything.

  22. Johnny Hauser

    I have this working on my site now. To have index.php automatically added to the url (not necessary in the address bar) put the following code in the .htaccess file in the same directory as the index.php file being accessed.

    RewriteEngine on
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php/$1 [L]
    

    This is the same code that WordPress uses. This will affect relative links in html.

    In case my advice was unclear, here’s an example.
    Your site is example.com. In the root folder of your site, you have a php file called index.php that contains the information that should be displayed when a visitor navigates to example.com. They are actually seeing example.com/index.php. We now want to add parameters to the url that our php code can use (as in the tutorial above), so we need example.com/index.php/parameters/here. A url like that will work (except relative links will get messed up, that’s a whole other issue). Now you have your working parameters in the url, separated by / that can be used as a delimiter. Now, in the same folder as index.php (the root folder in this example) open or create a file called .htaccess and put the above code in it. This will rewrite the url example.com/parameters/here to example.com/index.php/parameters/here. There ya go! Clean SEF URLs!
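
For the PHP side of that setup, index.php still has to turn the requested path back into a post. The sketch below is one possible way to do it, not something taken from the article: resolve_slug() and $slug_map are hypothetical, and the array stands in for whatever database lookup a real CMS would use.

```php
<?php
// Hypothetical helper: map a requested path to a post id via a slug table.
// $slug_map stands in for a database lookup in a real CMS.
function resolve_slug($request_uri, array $slug_map)
{
    // keep only the path part, e.g. "/great-php-functions/?ref=x" -> "/great-php-functions/"
    $path = (string) parse_url($request_uri, PHP_URL_PATH);

    // drop surrounding slashes and an optional leading "index.php/"
    $slug = preg_replace('#^index\.php/?#', '', trim($path, '/'));

    // return the matching post id, or null so the caller can send a 404
    return isset($slug_map[$slug]) ? $slug_map[$slug] : null;
}

$slug_map = array('great-php-functions' => 2382);
echo resolve_slug('/great-php-functions/', $slug_map);
```

In a live request the first argument would typically be $_SERVER['REQUEST_URI'], and the slug would be stored next to the post when generate_seo_link() is run on its title.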

    • Johnny Hauser

      hmm… those tags shouldn’t be there in my previous comment. Make sure to remove those.

    • Johnny Hauser

      lol I’m getting trolled by the code thing here. I repeat, the
      tags shouldn’t be in that .htaccess code

  23. Johnny Hauser

    Seriously!? It won’t let me put the paragraph tag in code in my posts lol, and I can’t edit my posts or delete them. Ahhhhh. Maybe it will work this time. If not, I give up. I’m sure everyone know/can figure out what paragraph tags I mean.

  24. Milos

    PHP Warning: in_array() expects parameter 2 to be array, null given in line 33

    here i have this error
    if(!in_array($word,$words_array) && ($unique_words ? !in_array($word,$return) : true))
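
This warning appears when something other than an array reaches in_array(), for instance when null is passed as $words_array (an assumption about the cause here, since the calling code isn't shown). A small guard like the following sketch avoids it; normalize_words_array() is a hypothetical helper name.

```php
<?php
// Hypothetical guard: normalise the stop list before handing it to in_array().
function normalize_words_array($words_array)
{
    // null (or any other non-array value) becomes an empty list
    return is_array($words_array) ? $words_array : array();
}

// usage: a null stop list no longer reaches in_array()
$words_array = normalize_words_array(null);
var_dump(in_array('penny', $words_array));
```

Calling it on $words_array at the top of remove_words() would keep the rest of the function unchanged.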

  25. Veronica

    This works, except when there is a single quote, double quote, etc. I tried adding the solution “Josh” posted back in 08, but it still only semi-works. Here’s an example of what is going on:

    Post Title: Here’s an example
    Displays: here039s-example

    It looks like it’s converting the symbol into html special characters and leaving out the ampersand and pound sign. Is there anything else I can add to the code to make the single quote just disappear from the mix?

    Any help is appreciated!!! :D
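
The “039” suggests the title already contains the entity &#039; by the time it is scrubbed, so the digits survive while “&”, “#” and “;” are stripped. Assuming that is the cause, decoding entities before scrubbing should make the quote disappear. The sketch below shows the idea; decode_before_scrub() is a hypothetical name.

```php
<?php
// Hypothetical pre-cleaning step: decode HTML entities before scrubbing,
// so "&#039;" becomes "'" (and is then stripped) instead of leaving "039".
function decode_before_scrub($input)
{
    // ENT_QUOTES decodes both single- and double-quote entities
    $decoded = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

    // same scrubbing expression as the article: keep letters, digits, whitespace
    return preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($decoded));
}

echo decode_before_scrub('Here&#039;s an example');
```

The remaining steps (collapsing spaces, removing words, inserting the delimiter) would then run on the cleaned string as before.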
