Generate Search Engine Friendly URLs with PHP Functions

Written by David Walsh on Tuesday, November 13, 2007


Generating search engine friendly (SEF) URLs can dramatically improve your search engine results. There’s a big difference between “/post.php?id=2382” and “/great-php-functions/”. A search engine friendly URL also tells users what to expect on the page when the link text alone isn’t adequate.

I’ve created a pair of PHP functions to generate search engine friendly URLs for the CMSs I build for my customers. The idea is fairly simple. I take the user-created page title and feed it to a scrubbing function to:

  • remove all punctuation
  • switch the URL to lowercase
  • replace spaces with a given delimiter (in this case, a dash)
  • remove duplicate words
  • remove words that aren’t helpful to SEO

The Code

/* takes the input, scrubs bad characters */
function generate_seo_link($input,$replace = '-',$remove_words = true,$words_array = array())
{
	//make it lowercase, remove punctuation, remove multiple/leading/ending spaces
	//(preg_replace is used throughout; ereg_replace was removed in PHP 7)
	$return = trim(preg_replace('/\s+/',' ',preg_replace('/[^a-zA-Z0-9\s]/','',strtolower($input))));

	//remove words, if not helpful to seo
	//with an empty $words_array, only duplicate words are dropped
	if($remove_words) { $return = remove_words($return,$replace,$words_array); }

	//convert the spaces to whatever the user wants
	//usually a dash or underscore..
	//...then return the value.
	return str_replace(' ',$replace,$return);
}

/* takes an input, scrubs unnecessary words */
function remove_words($input,$replace,$words_array = array(),$unique_words = true)
{
	//separate all words based on spaces
	$input_array = explode(' ',$input);

	//create the return array
	$return = array();

	//loops through words, remove bad words, keep good ones
	foreach($input_array as $word)
	{
		//if it's a word we should add...
		if(!in_array($word,$words_array) && ($unique_words ? !in_array($word,$return) : true))
		{
			$return[] = $word;
		}
	}

	//return good words separated by the chosen delimiter
	return implode($replace,$return);
}

The Explanation

generate_seo_link() accepts four values:

  1. $input – string – the text to convert into a URL segment; in my case, the page title
  2. $replace – string – the word separator, in most cases a dash or underscore
  3. $remove_words – boolean – remove specific, non-helpful SEO words
  4. $words_array – array – an array of words that should be removed from every URL because they aren’t helpful to SEO
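
To make the four parameters concrete, here is a minimal sketch that calls the function with different delimiters and with word removal switched off. It repeats both functions in compact form so it runs on its own. The page title and the stop-word list are made up for illustration.

```php
<?php
// Compact copies of the two functions from "The Code", using preg_replace
// so the sketch runs on PHP 7+.
function generate_seo_link($input, $replace = '-', $remove_words = true, $words_array = array())
{
    // lowercase, strip punctuation, collapse whitespace runs, trim the ends
    $return = trim(preg_replace('/\s+/', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($input))));

    if ($remove_words) { $return = remove_words($return, $replace, $words_array); }

    return str_replace(' ', $replace, $return);
}

function remove_words($input, $replace, $words_array = array(), $unique_words = true)
{
    $return = array();
    foreach (explode(' ', $input) as $word) {
        // keep the word unless it is on the list or (optionally) already kept
        if (!in_array($word, $words_array) && ($unique_words ? !in_array($word, $return) : true)) {
            $return[] = $word;
        }
    }
    return implode($replace, $return);
}

$title = 'The Best of the Best PHP Tips';   // made-up page title
$stop  = array('the', 'of');                // made-up stop-word list

$default    = generate_seo_link($title, '-', true, $stop);  // dash delimiter
$underscore = generate_seo_link($title, '_', true, $stop);  // underscore delimiter
$keep_all   = generate_seo_link($title, '-', false);        // no word removal

echo $default, "\n";     // best-php-tips
echo $underscore, "\n";  // best_php_tips
echo $keep_all, "\n";    // the-best-of-the-best-php-tips
```

Note that duplicate words are only dropped inside remove_words(), so passing false for $remove_words keeps repeated words as well.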

Example Results

$bad_words = array('a','and','the','an','it','is','with','can','of','why','not');
echo generate_seo_link('Another day and a half of PHP meetings','-',true,$bad_words);
//displays :: another-day-half-php-meetings

echo generate_seo_link('CSS again?  Why not just PHP?','-',true,$bad_words);
//displays :: css-again-just-php

echo generate_seo_link('A penny saved is a penny earned.','-',true,$bad_words);
//displays :: penny-saved-earned

Do yourself a favor — make your dynamic pages more search engine friendly with clean URLs!


Epic Discussion

January 05
Ben says:

Hi,

Many thanks for the post, very useful. I’m implementing the code but finding that it does not insert the delimiter (e.g. “-”) correctly. So for example 2 I get “cssagainwhynotjustphp” returned.

I can’t spot why! Any thoughts?

Many thanks, Ben.

David Walsh January 05
david says:

Thanks for posting Ben. I found the problem. WordPress stripped a backslash out on me. Change:

$return = trim(ereg_replace(' +',' ',preg_replace('/[^a-zA-Z0-9s]/','',strtolower($input))));

to:

$return = trim(ereg_replace(' +',' ',preg_replace('/[^a-zA-Z0-9\s]/','',strtolower($input))));

January 05
Ben says:

Superb, thanks for the quick response David, much appreciated.

Now working perfectly!

All the best, Ben.

June 17
Maïs says:

Hi David,
And thanks for all your stuff.

I have a little question: I tried this function and it works well, but how can I replace accented characters with their unaccented equivalents?

For example: Génépi -> genepi?

Right now I get gnpi…

Thanks in advance

July 25
Stan says:

Hi David,

I am a newbie to this and have been reading my tail off trying to get up to speed since I know that it’s hurting my SERPs. How would I change this url/mess in my custom CMS (functions)?

http://www.domain.com/articles.php?art_id=885

Please help! Thanks in advance…

David Walsh July 26
david says:

Maïs: You’ll want to create another function that does that. You’ll send it the string and an array of keys you want to replace with values. For example:

$result = replace_accents('Génépi', array('é'=>'e'));

function replace_accents($input, $replacements)
{
    // for every key=>value, replace the key with the value
    return strtr($input, $replacements);
}

August 29
Josh says:

Or put this at the top of the generate_seo_link function:

$return = htmlentities($input, ENT_COMPAT, 'utf-8');
$return = preg_replace("`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i", "\\1", $return);

And don’t forget to replace the original first line with:

$return = trim(preg_replace('/\s+/', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($return))));

Thanks for this function, it’s very very useful!
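
As a rough illustration of the entity approach described in this comment, the sketch below wraps the two lines in a helper and runs it on the examples from the thread. The helper name strip_accents_via_entities is hypothetical, and the sketch assumes the input string and the source file are both UTF-8.

```php
<?php
// Hypothetical wrapper around the two lines from the comment above.
function strip_accents_via_entities($input)
{
    // "é" becomes "&eacute;", "ï" becomes "&iuml;", and so on
    $encoded = htmlentities($input, ENT_COMPAT, 'UTF-8');

    // keep only the base letter of each matching entity
    return preg_replace(
        '`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i',
        '\\1',
        $encoded
    );
}

echo strip_accents_via_entities('Génépi'), "\n"; // Genepi
echo strip_accents_via_entities('Maïs'), "\n";   // Mais
```

The result still needs to go through the lowercase and punctuation scrub afterwards. Entities that do not fit the one-letter-plus-suffix pattern (for example &aelig;) are left untouched by this regex.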

October 01
Slawek says:

Just as a tip… I’m using the full-text stopwords from MySQL as my $bad_words array…
This helps a lot as it removes most of the words that Google wouldn’t index anyway…
http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html

October 29
Maïs says:

Thanks everybody, everything is working now. Great!

January 03
kram says:

Hi David, quick question. I understand completely that there’s a big difference between “/post.php?id=2382” and “/great-php-functions/”. But what function or method in PHP tells our web server something like

/great-php-functions/ => post.php?id=2382

I.e. if I just requested
http://www.testdummyurl.com/great-php-functions
with nothing special, my web server would give a 404.
So what do we need to do to let our web server know that this is actually
http://www.testdummyurl.com/post.php?id=2382

thanks,
Kram

January 27
Alex says:

Great! That’s exactly what I was looking for! You’ve saved my day. :-) Thanks a lot Ben!

January 27
Alex says:

Lol I meant David, not Ben… I shouldn’t do two things at once ;-) Sry

May 15
Specs says:

I’m also a little in the dark as to how you relate your SEO friendly URL to the actual URL (as Kram said)?
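
One common way to connect the two, sketched here under assumptions rather than taken from the article: store the generated slug next to the post id when the page is saved, have the web server hand every clean URL to a PHP script, and look the slug up there. The function name resolve_slug and the sample array are hypothetical. On Apache, the hand-off is typically done with a mod_rewrite rule such as the one in the comment.

```php
<?php
// Assumed Apache rule in .htaccess that forwards clean URLs to PHP:
//   RewriteEngine On
//   RewriteRule ^([a-z0-9-]+)/?$ post.php?slug=$1 [L,QSA]

// Hypothetical lookup: in a real CMS the slug => id pairs would come from
// the database, written when the page title is saved.
function resolve_slug($request_path, array $slug_to_id)
{
    // "/great-php-functions/" -> "great-php-functions"
    $slug = trim((string) parse_url($request_path, PHP_URL_PATH), '/');

    return isset($slug_to_id[$slug]) ? $slug_to_id[$slug] : null;
}

$slugs = array('great-php-functions' => 2382); // made-up sample data

$id = resolve_slug('/great-php-functions/', $slugs);
echo ($id === null) ? "404\n" : "post.php?id=$id\n"; // post.php?id=2382
```

Because word removal can make two different titles produce the same slug, the stored slug should be checked for uniqueness when it is saved.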

May 27
Rich says:

A little late, but I just noticed that when I have actual text with a “-” that is needed, for example “stir-fry”, the “-” is removed when running the function. I got around this before by adding an extra step:

$this->FixDashed = str_replace("-"," ",$_POST['SOMEPOST']);

$this->SEOName = generate_seo_link($this->FixDashed,'-',false,$bad_words);

Dave or anyone else, I am wondering if you know of a way to handle this in the function itself, instead of adding the extra step to basically trick it into doing this.
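
A minimal sketch of one way to fold that step into the scrubbing itself: treat hyphens as word breaks before punctuation is stripped, so “stir-fry” stays two words. The function name scrub_keep_hyphenated is hypothetical and the title is made up.

```php
<?php
// Hypothetical variant of the scrubbing step from generate_seo_link().
function scrub_keep_hyphenated($input)
{
    $lower = strtolower($input);

    // turn hyphens into spaces *before* punctuation is removed
    $lower = str_replace('-', ' ', $lower);

    // same clean-up as the article: drop punctuation, collapse whitespace
    return trim(preg_replace('/\s+/', ' ', preg_replace('/[^a-z0-9\s]/', '', $lower)));
}

$slug = str_replace(' ', '-', scrub_keep_hyphenated('Easy Stir-Fry Recipes!'));
echo $slug, "\n"; // easy-stir-fry-recipes
```

The same str_replace() line could be placed at the top of generate_seo_link() so callers no longer need the extra step.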

© David Walsh 2007-2010.