Generate Search Engine Friendly URLs with PHP Functions
Generating search engine friendly (SEF) URLs can dramatically improve your search engine results. There's a big difference between "/post.php?id=2382" and "/great-php-functions/". Having search engine friendly URLs also gives the user an idea of what will be on the page they are clicking on if the link text isn't adequate.
I've created a pair of sister PHP functions to generate search engine friendly URLs for the CMSs I create for my customers. The idea is fairly simple. I take the user-created page title and feed it to a scrubbing function to:
- remove all punctuation
- switch the URL to lowercase
- remove spaces, replace with a given delimiter (in this case, a dash)
- remove duplicate words
- remove words that aren't helpful to SEO
The Code
/* takes the input, scrubs bad characters */
function generate_seo_link($input, $replace = '-', $remove_words = true, $words_array = array())
{
    //make it lowercase, remove punctuation, collapse multiple/leading/ending whitespace
    //(preg_replace is used throughout because ereg_replace was removed in PHP 7)
    $return = trim(preg_replace('/\s+/', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($input))));

    //remove words, if not helpful to seo
    //the caller's list of unhelpful words is passed straight through
    if($remove_words) { $return = remove_words($return, $replace, $words_array); }

    //convert the spaces to whatever the user wants
    //usually a dash or underscore..
    //...then return the value.
    return str_replace(' ', $replace, $return);
}

/* takes an input, scrubs unnecessary words */
function remove_words($input, $replace, $words_array = array(), $unique_words = true)
{
    //separate all words based on spaces
    $input_array = explode(' ', $input);

    //create the return array
    $return = array();

    //loops through words, remove bad words, keep good ones
    foreach($input_array as $word)
    {
        //if it's a word we should add...
        if(!in_array($word, $words_array) && ($unique_words ? !in_array($word, $return) : true))
        {
            $return[] = $word;
        }
    }

    //return good words separated by the delimiter
    return implode($replace, $return);
}
The Explanation
The function accepts four values:
- $input - string - will be SEO'd, in my case, the page title
- $replace - string - the word separator, in most cases a dash or underscore
- $remove_words - boolean - remove specific, non-helpful SEO words
- $words_array - array - an array of words that should be removed from every URL because they aren't helpful to SEO
Example Results
$bad_words = array('a','and','the','an','it','is','with','can','of','why','not');

echo generate_seo_link('Another day and a half of PHP meetings', '-', true, $bad_words);
//displays :: another-day-half-php-meetings

echo generate_seo_link('CSS again? Why not just PHP?', '-', true, $bad_words);
//displays :: css-again-just-php

echo generate_seo_link('A penny saved is a penny earned.', '-', true, $bad_words);
//displays :: penny-saved-earned
Do yourself a favor -- make your dynamic pages more search engine friendly with clean URLs!
Hi,
Many thanks for the post, very useful. I'm implementing the code but finding that it does not correctly insert the delimiter (e.g. "-"). So for example 2 I get "cssagainwhynotjustphp" returned.
I can't spot why! Any thoughts?
Many thanks, Ben.
Thanks for posting Ben. I found the problem. WordPress stripped a slash out on me. Change:
to:
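The two snippets from this reply did not survive. Ben's output is consistent with the backslash in "\s" having been lost from the punctuation regex, which is an assumption based on his result rather than something the reply confirms. A small sketch of the difference:

```php
<?php
//the title from example 2, lowercased as generate_seo_link() does
$title = strtolower('CSS again? Why not just PHP?');

//without the backslash, "s" is just a literal letter,
//so spaces are treated as punctuation and stripped
$broken = preg_replace('/[^a-zA-Z0-9s]/', '', $title);

//with the backslash, \s keeps whitespace in place
//so it can be swapped for the delimiter later
$fixed = preg_replace('/[^a-zA-Z0-9\s]/', '', $title);

echo $broken, "\n"; //cssagainwhynotjustphp
echo $fixed, "\n";  //css again why not just php
```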
Superb, thanks for the quick response David, much appreciated.
Now working perfectly!
All the best, Ben.
Hi David,
And thanks for all your stuff.
I have a little question: I tried this function and it works well, but how can I replace accented characters with non-accented ones?
For example: Génépi -> genepi?
Right now I get gnpi…
Thanks in advance
Hi David,
I am a newbie to this and have been reading my tail off trying to get up to speed since I know that it’s hurting my SERPs. How would I change this url/mess in my custom CMS (functions)?
http://www.domain.com/articles.php?art_id=885
Please help! Thanks in advance…
Maïs: You’ll want to create another function that does that. You’ll send it an array of keys you want to replace with values. For example:
Or put this at the top of the generate_seo_link function:
And don’t forget to replace the original first line with:
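The code from this reply is missing, so here is a minimal sketch of the idea it describes: a map of accented characters to plain replacements, applied before the title is lowercased and scrubbed. The function name replace_accents() and the map contents are assumptions for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
/* hypothetical helper: swaps accented letters for plain ASCII ones */
function replace_accents($input)
{
    //partial map for illustration; extend it with the characters your titles use
    $map = array(
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a',
        'ç' => 'c',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
        'ñ' => 'n',
        'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'ö' => 'o',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        'À' => 'A', 'Ç' => 'C', 'È' => 'E', 'É' => 'E',
    );

    //strtr with an array replaces every key with its value in one pass
    return strtr($input, $map);
}

echo strtolower(replace_accents('Génépi')); //genepi
```

Running the replacement before strtolower() matters, because the punctuation regex would otherwise strip the accented bytes and leave "gnpi".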
Thanks for this function, it’s very very useful!
Just as a tip… I’m using Full-Text Stopwords from MySQL as a $bad_words array…
This helps a lot as it removes most of the words that google wouldn’t index anyway…
http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html
Thanks Everybody, everything is working now, Great !
Hi David, quick question. I understand completely that there's a big difference between "/post.php?id=2382" and "/great-php-functions/". What function or method in PHP tells our web server something like
/great-php-functions/ => post.php?id=2382
I.e. if I just requested
http://www.testdummyurl.com/great-php-functions
with nothing special, my web server would give a 404,
so what do we need to do to let our web server know that this is actually
http://www.testdummyurl.com/post.php?id=2382
thanks,
Kram
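One common answer, sketched here under assumptions: the web server is configured (for example with mod_rewrite) to hand unknown paths to a PHP script, and that script looks the slug up to find the post ID. The function name resolve_slug() is hypothetical, and the array stands in for what would normally be a database lookup on a stored slug column.

```php
<?php
/* hypothetical lookup: turns a requested path into a post id, or null */
function resolve_slug($request_path, array $slug_to_id)
{
    //keep only the path part and drop leading/trailing slashes
    $slug = trim((string) parse_url($request_path, PHP_URL_PATH), '/');

    //return the matching id, or null so the caller can send a 404
    return isset($slug_to_id[$slug]) ? $slug_to_id[$slug] : null;
}

//stand-in for a database table of slugs
$slugs = array('great-php-functions' => 2382);

var_dump(resolve_slug('/great-php-functions/', $slugs)); //int(2382)
var_dump(resolve_slug('/no-such-page/', $slugs));        //NULL
```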
Great! That’s exactly what I was looking for! You’ve saved my day. :-) Thanks a lot Ben!
Lol I meant David, not Ben… I shouldn’t do two things at once ;-) Sry
I’m also a little in the dark as to how you relate your SEO friendly URL to the actual URL (as Kram said)?
A little late, but I just noticed that when I have actual text that needs a "-", for example "stir-fry", the "-" is removed when running the function. I got around this before by adding an extra step:
Dave or anyone else, I am wondering if you knew of a way to get around this in the function itself instead of adding the extra step to basically trick it into doing this.
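One possible way to handle this inside the scrubbing step, as a sketch rather than the author's method: allow hyphens through the punctuation filter, then collapse any run of whitespace and hyphens into a single delimiter. The function name slug_keep_hyphens() is hypothetical, and the delimiter is assumed to be a plain character such as a dash or underscore.

```php
<?php
/* hypothetical variant: keeps words like "stir-fry" joined */
function slug_keep_hyphens($input, $replace = '-')
{
    //lowercase, then drop everything except letters, digits, whitespace and hyphens
    $clean = preg_replace('/[^a-z0-9\s\-]/', '', strtolower($input));

    //collapse each run of whitespace and/or hyphens into one delimiter
    $clean = preg_replace('/[\s\-]+/', $replace, $clean);

    //strip any delimiter left at the start or end
    return trim($clean, $replace);
}

echo slug_keep_hyphens('Easy Stir-Fry: 5 ideas!'); //easy-stir-fry-5-ideas
```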
For filtering unique words you could use array_unique (a php function) :-)
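A sketch of that suggestion, assuming the same inputs as remove_words(): array_diff() drops the unwanted words and array_unique() keeps only the first occurrence of each remaining word. The function name remove_words_unique() is hypothetical.

```php
<?php
/* hypothetical variant of remove_words() built on array functions */
function remove_words_unique($input, $replace, array $words_array = array())
{
    //separate all words based on spaces
    $words = explode(' ', $input);

    //drop every word that appears in the removal list
    $words = array_diff($words, $words_array);

    //keep only the first occurrence of each remaining word
    $words = array_unique($words);

    //join what is left with the delimiter
    return implode($replace, $words);
}

echo remove_words_unique('a penny saved is a penny earned', '-', array('a', 'is'));
//penny-saved-earned
```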
How does the web server know that this is the url it should go to?
Totally digging this article, but like the above comment by @Marc: “How does the web server know that this is the URL it should go to?”
Also, how does this relate to the mod_rewrite and .htaccess? Do we also need to use that as well as this PHP function?
Please let me know!
Still, same question as above. Can I get a response, please? :)
I'm a beginner. Can you please tell me how to use this?
For those of you wondering how to actually make this work, it’s necessary to have yoursitename/index.php or whatevername.php you use. As long as the php file is in the url, the following /stuff/here is available to you so that your php can delimit it and use it as variables.
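To illustrate the idea in this comment: when the script name is in the URL, PHP exposes the trailing part of the path (on Apache, typically in $_SERVER['PATH_INFO']), and it can be split on "/" into values. This is a minimal sketch; the function name path_segments() is hypothetical.

```php
<?php
/* hypothetical helper: splits "/stuff/here" into array('stuff', 'here') */
function path_segments($path_info)
{
    //drop leading/trailing slashes so explode() yields no empty pieces at the ends
    $trimmed = trim((string) $path_info, '/');

    //an empty path means no parameters were given
    return $trimmed === '' ? array() : explode('/', $trimmed);
}

//in a real request the input would come from the server, for example:
//$segments = path_segments(isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '');
print_r(path_segments('/stuff/here'));
```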
I’m new to this as well, so I’m not positive on removing the whatevername.php part from the url. I believe you do that through your .htaccess file. It’s a simple sort of search and replace that will apply to everything.
I have this working on my site now. To have index.php automatically added to the url (not necessary in the address bar) put the following code in the .htaccess file in the same directory as the index.php file being accessed.
This is the same code that WordPress uses. This will affect relative links in html.
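The .htaccess code from this comment was lost. For reference, the standard block that WordPress writes for a site installed at the domain root looks like the following; it assumes Apache with mod_rewrite enabled and an index.php in the same directory. Whether the commenter used exactly this version is an assumption.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Leave direct requests for index.php alone
RewriteRule ^index\.php$ - [L]
# Only rewrite when the request is not an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Send everything else to index.php
RewriteRule . /index.php [L]
</IfModule>
```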
In case my advice was unclear, here’s an example.
Your site is example.com. In the root folder of your site, you have a PHP file called index.php that contains the information that should be displayed when a visitor navigates to example.com. They are actually seeing example.com/index.php. We now want to add parameters to the URL that our PHP code can use (as in the tutorial above), so we need example.com/index.php/parameters/here. A URL like that will work (except relative links will get messed up, but that's a whole other issue). Now you have your working parameters in the URL, separated by /, which can be used as a delimiter. Now, in the same folder as index.php (the root folder in this example), open or create a file called .htaccess and put the above code in it. This will rewrite the URL example.com/parameters/here to example.com/index.php/parameters/here. There ya go! Clean SEF URLs!
hmm… those tags shouldn’t be there in my previous comment. Make sure to remove those.
lol I'm getting trolled by the code thing here. I repeat, the <p> tags shouldn't be in that .htaccess code
Seriously!? It won't let me put the paragraph tag in code in my posts lol, and I can't edit my posts or delete them. Ahhhhh. Maybe it will work this time. If not, I give up. I'm sure everyone knows/can figure out what paragraph tags I mean.
PHP Warning: in_array() expects parameter 2 to be array, null given in line 33
I get this error here:
if(!in_array($word,$words_array) && ($unique_words ? !in_array($word,$return) : true))
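That warning means $words_array was null when in_array() ran, presumably because null or an undefined variable was passed as the word list. A minimal defensive sketch follows; the function name remove_words_safe() is hypothetical, and falling back to an empty list is one choice among several.

```php
<?php
/* hypothetical variant of remove_words() that tolerates a missing word list */
function remove_words_safe($input, $replace, $words_array = array(), $unique_words = true)
{
    //fall back to an empty list if anything other than an array was passed
    if (!is_array($words_array)) {
        $words_array = array();
    }

    $return = array();
    foreach (explode(' ', $input) as $word) {
        //keep the word unless it is unwanted or (optionally) already kept
        if (!in_array($word, $words_array) && ($unique_words ? !in_array($word, $return) : true)) {
            $return[] = $word;
        }
    }

    return implode($replace, $return);
}

echo remove_words_safe('a penny a day', '-', null); //a-penny-day
```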
This works, except when there is a single quote, double quote, etc. I tried adding the solution "Josh" posted back in '08, but it still only semi-works. Here's an example of what is going on:
Post Title: Here’s an example
Displays: here039s-example
It looks like it’s converting the symbol into html special characters and leaving out the ampersand and pound sign. Is there anything else I can add to the code to make the single quote just disappear from the mix?
Any help is appreciated!!! :D
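If the title reaches the function already HTML-encoded (for example "Here&#039;s an example"), one possible fix is to decode the entities first so the quote is an ordinary character that the punctuation filter removes. This is a sketch under that assumption; the function name scrub_title() is hypothetical.

```php
<?php
/* hypothetical pre-step: decode entities, then scrub as before */
function scrub_title($input)
{
    //turn entities such as &#039; and &quot; back into real characters
    $decoded = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

    //lowercase, remove punctuation, collapse and trim whitespace
    return trim(preg_replace('/\s+/', ' ', preg_replace('/[^a-z0-9\s]/', '', strtolower($decoded))));
}

echo scrub_title('Here&#039;s an example'); //heres an example
```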