Match Accented Letters with Regular Expressions

By  on  

Regular expressions are used for a variety of tasks but the one I see most often is input validation. Names, dates, numbers...we tend to use regular expressions for everything, even when we probably shouldn't.

The most common syntax for checking alphabetic characters is A-z but what if the string contains accented characters? Characters like ğ and Ö will make the regex fail. That's where we need to use Unicode property escapes to check for a broader letter format!

Let's look at how we can use \p{Letter} and the Unicode flag (u) to match both standard and accented characters:

// Single word
"Özil".match(/[\p{Letter}]+/gu)

// Word with spaces
"Oğuzhan Özyakup".match(/[\p{Letter}\s]+/gu);

Using regular expressions to validate strings, especially names, is much more difficult than A-z+. Names and other strings can be very diverse -- let's not insult users by making them provide non-accented letters just to pass validation!

Recent Features

Incredible Demos

  • By
    Build a Slick and Simple MooTools Accordion

    Last week I covered a smooth, subtle MooTools effect called Kwicks. Another great MooTools creation is the Accordion, which acts like...wait for it...an accordion! Now I've never been a huge Weird Al fan so this is as close to playing an accordion as...

  • By
    MooTools Documentation Search Favelet

    I'm going to share something with you that will blow your mind: I don't have the MooTools documentation memorized. I just don't. I visit the MooTools docs frequently to figure out the order of parameters of More classes and how best to use...

Discussion

    Wrap your code in <pre class="{language}"></pre> tags, link to a GitHub gist, JSFiddle fiddle, or CodePen pen to embed!