Generate Search Engine Friendly URLs with PHP Functions
Generating search engine friendly (SEF) URLs can dramatically improve your search engine results. There's a big difference between "/post.php?id=2382" and "/great-php-functions/". Having search engine friendly URLs also gives the user an idea of what will be on the page they are clicking on if the link text isn't adequate.
I've created a pair of sister PHP functions to generate search engine friendly URLs for the CMSs I build for my customers. The idea is fairly simple. I take the user-created page title and feed it to a scrubbing function to:
- remove all punctuation
- switch the URL to lowercase
- remove spaces, replace with a given delimiter (in this case, a dash)
- remove duplicate words
- remove words that aren't helpful to SEO
The Code
/* takes the input, scrubs bad characters */
function generate_seo_link($input, $replace = '-', $remove_words = true, $words_array = array())
{
    //make it lowercase, remove punctuation, remove multiple/leading/ending spaces
    //(preg_replace() handles both steps; ereg_replace() was removed in PHP 7)
    $return = trim(preg_replace('/ +/', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($input))));
    //remove words, if not helpful to seo
    //with the default empty $words_array, only duplicate words are dropped
    if ($remove_words) { $return = remove_words($return, $replace, $words_array); }
    //convert any remaining spaces to whatever the user wants
    //usually a dash or underscore..
    //...then return the value.
    return str_replace(' ', $replace, $return);
}
/* takes an input, scrubs unnecessary words */
function remove_words($input, $replace, $words_array = array(), $unique_words = true)
{
    //separate all words based on spaces
    $input_array = explode(' ', $input);
    //create the return array
    $return = array();
    //loop through words, remove bad words, keep good ones
    foreach ($input_array as $word)
    {
        //if it's a word we should add...
        if (!in_array($word, $words_array) && ($unique_words ? !in_array($word, $return) : true))
        {
            $return[] = $word;
        }
    }
    //return good words separated by the chosen delimiter
    return implode($replace, $return);
}
The Explanation
The function accepts four values:
- $input - string - will be SEO'd, in my case, the page title
- $replace - string - the word separator, in most cases a dash or underscore
- $remove_words - boolean - remove specific, non-helpful SEO words
- $words_array - array - an array of words that should be removed from every URL because they aren't helpful to SEO
Example Results
$bad_words = array('a','and','the','an','it','is','with','can','of','why','not');
echo generate_seo_link('Another day and a half of PHP meetings','-',true,$bad_words);
//displays :: another-day-half-php-meetings
echo generate_seo_link('CSS again? Why not just PHP?','-',true,$bad_words);
//displays :: css-again-just-php
echo generate_seo_link('A penny saved is a penny earned.','-',true,$bad_words);
//displays :: penny-saved-earned
Do yourself a favor -- make your dynamic pages more search engine friendly with clean URLs!
Discussion
Hi,
Many thanks for the post, very useful. I'm implementing the code but finding that it does not correctly insert the delimiter, e.g. “-”. For example 2, I get “cssagainwhynotjustphp” returned.
I can't spot why! Any thoughts?
Many thanks, Ben.
Thanks for posting, Ben. I found the problem. WordPress stripped a backslash out on me. Change:
$return = trim(ereg_replace(' +',' ',preg_replace('/[^a-zA-Z0-9s]/','',strtolower($input))));
to:
$return = trim(ereg_replace(' +',' ',preg_replace('/[^a-zA-Z0-9\s]/','',strtolower($input))));
Superb, thanks for the quick response David, much appreciated.
Now working perfectly!
All the best, Ben.
Hi David,
Thanks for all your work.
I have a quick question: I tried this function and it works well, but how can I replace accented characters with their unaccented equivalents?
For example: Génépi -> genepi?
Right now I get gnpi…
Thanks in advance
Hi David,
I am a newbie to this and have been reading my tail off trying to get up to speed since I know that it’s hurting my SERPs. How would I change this url/mess in my custom CMS (functions)?
http://www.domain.com/articles.php?art_id=885
Please help! Thanks in advance…
Maïs: You’ll want to create another function that does that. You’ll send it an array of keys you want to replace with values. For example:
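A minimal sketch of such a function, assuming a hypothetical name remove_accents() and a small hand-picked character map (neither comes from the original post). It also assumes the source file and the input are both UTF-8, and that it runs on the title before generate_seo_link() strips non-ASCII characters:

```php
<?php
// Hypothetical helper: the name and the default map are illustrative assumptions.
function remove_accents($input, $map = array())
{
    // Fall back to a short map of common accented Latin characters;
    // pass your own array of keys => replacements to extend or override it.
    if (empty($map)) {
        $map = array(
            'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a',
            'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
            'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
            'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'ö' => 'o',
            'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
            'ç' => 'c', 'ñ' => 'n',
            'É' => 'E', 'È' => 'E', 'À' => 'A', 'Ç' => 'C',
        );
    }
    // strtr() with an array swaps every key for its value in a single pass
    return strtr($input, $map);
}

echo remove_accents('Génépi'); // Genepi
```

The lowercase conversion is left to generate_seo_link(), so the helper only has to deal with the accents themselves.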
Or put this at the top of the generate_seo_link function:
$return = htmlentities($input, ENT_COMPAT, 'utf-8');
$return = preg_replace("`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i", "\\1", $return);
And don’t forget to replace the original first line with:
$return = trim(preg_replace('/ +/', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($return))));
Thanks for this function, it’s very very useful!
Just as a tip… I’m using the MySQL full-text stopwords list as my $bad_words array…
This helps a lot, as it removes most of the words that Google wouldn’t index anyway…
http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html
Thanks Everybody, everything is working now, Great !
Hi David, quick question. I understand completely that there’s a big difference between “/post.php?id=2382” and “/great-php-functions/”… but what function or method in PHP tells our web server something like
/great-php-functions/ => post.php?id=2382
I.e., if I just requested
http://www.testdummyurl.com/great-php-functions
with nothing special… my web server would give a 404.
So what do we need to do to let our web server know that this is actually
http://www.testdummyurl.com/post.php?id=2382
thanks,
Kram
Great! That’s exactly what I was looking for! You’ve saved my day. :-) Thanks a lot Ben!
Lol I meant David, not Ben… I shouldn’t do two things at once ;-) Sry
I’m also a little in the dark as to how you relate your SEO friendly URL to the actual URL (as Kram said)?
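One common way to handle this, sketched here under assumptions rather than taken from the original post: the web server is configured (for example with an Apache mod_rewrite rule) to send every request that does not match a real file to a single PHP script, and that script looks the slug up to find the matching ID. The function name resolve_slug() and the hard-coded slug array below are hypothetical; in a real CMS the slug would more likely be stored in the database when the page is saved.

```php
<?php
// Hypothetical lookup helper; the name and the array-based map are assumptions.
function resolve_slug($request_uri, array $slug_to_id)
{
    // keep only the path part, dropping any query string
    $path = parse_url($request_uri, PHP_URL_PATH);
    // '/great-php-functions/' becomes 'great-php-functions'
    $slug = trim((string) $path, '/');
    // return the matching ID, or null so the caller can send a 404
    return isset($slug_to_id[$slug]) ? $slug_to_id[$slug] : null;
}

// Illustrative data only: slug => post ID
$slugs = array('great-php-functions' => 2382);

$id = resolve_slug('/great-php-functions/', $slugs);
echo $id === null ? 'not found' : 'post id ' . $id; // post id 2382
```

In the front script, $_SERVER['REQUEST_URI'] would be passed in as the first argument, and the returned ID used exactly as $_GET['id'] is used in post.php today.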
A little late, but I just noticed that when I have text with a “-” that is needed, for example “stir-fry”, the “-” is removed when running the function. I got around this before by adding an extra step:
$this->FixDashed = str_replace('-', ' ', $_POST['SOMEPOST']);
$this->SEOName = generate_seo_link($this->FixDashed, '-', false, $bad_words);
Dave or anyone else, I am wondering if you know of a way to get around this in the function itself instead of adding the extra step to basically trick it into doing this.
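One possible way to fold that step into the function, as a sketch only: turn hyphens into spaces before the punctuation is stripped, so hyphenated words survive as separate words and get rejoined by the delimiter. The function name scrub_keep_hyphens() is hypothetical; in practice the str_replace() line could simply be added at the top of generate_seo_link().

```php
<?php
// Hypothetical variant of the first scrubbing step in generate_seo_link().
function scrub_keep_hyphens($input)
{
    $lower = strtolower($input);
    // treat existing hyphens as word separators instead of punctuation
    $spaced = str_replace('-', ' ', $lower);
    // now strip everything that is not a letter, digit or whitespace
    $clean = preg_replace('/[^a-z0-9\s]/', '', $spaced);
    // collapse runs of whitespace and trim the ends
    return trim(preg_replace('/\s+/', ' ', $clean));
}

echo str_replace(' ', '-', scrub_keep_hyphens('Stir-fry: quick & easy'));
// stir-fry-quick-easy
```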
For filtering unique words you could use array_unique (a php function) :-)
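A sketch of what that could look like, assuming a hypothetical replacement named remove_words_unique() that pairs array_unique() with array_diff() in place of the loop:

```php
<?php
// Hypothetical alternative to remove_words(); the name is an assumption.
function remove_words_unique($input, $replace, $words_array = array())
{
    // separate all words based on spaces
    $words = explode(' ', $input);
    // array_diff() drops every word that appears in $words_array
    $kept = array_diff($words, $words_array);
    // array_unique() keeps only the first occurrence of each remaining word
    $kept = array_unique($kept);
    // implode() ignores the (now gappy) keys and joins the words in order
    return implode($replace, $kept);
}

echo remove_words_unique('a penny saved is a penny earned', '-', array('a', 'is'));
// penny-saved-earned
```

Unlike remove_words(), this version always removes duplicates; there is no $unique_words switch.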