
Welcome to the David Walsh Blog. I'm a MooTools, Dojo, jQuery, CSS, and PHP Web Developer located in Madison, Wisconsin, United States. Please contact me if I can make your experience on my website better.

Get the Intro: PHP Paragraph Regular Expression


Recently I was playing around with WordPress' wp_posts table. I wanted to grab the basic information about my posts (ID, title, content, slug) and build a quick summary list of them. One problem I ran into was creating a content "intro." Luckily a quick regular expression allowed me to create the intro.

The PHP

preg_match("/<p>(.*)<\/p>/",$post['post_content'],$matches);
$intro = strip_tags($matches[1]); //removes anchors and other tags from the intro

The above code extracts the first paragraph in a string, provided that paragraph opens with a bare, lowercase <p> tag and sits on a line of its own. Since all of my posts begin with an introductory paragraph, this approach works for me.
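If a post has no matching paragraph, $matches[1] will not be set, so it may help to wrap the snippet in a small helper. This is only a sketch: get_intro() and the sample $post array are made-up names for illustration, not part of WordPress.

```php
<?php
// Hypothetical helper: returns the text of the first <p>...</p> match, or '' if there is none.
function get_intro(string $content): string
{
    // Same pattern as above; preg_match() returns 1 when the pattern matches.
    if (preg_match("/<p>(.*)<\/p>/", $content, $matches) !== 1) {
        return '';
    }
    return strip_tags($matches[1]); // removes anchors and other tags from the intro
}

// Made-up row standing in for a wp_posts record.
$post = ['post_content' => "<p>Hello <a href=\"#\">world</a>.</p>\n<p>Second paragraph.</p>"];
$intro = get_intro($post['post_content']);
echo $intro, "\n"; // prints "Hello world."
```

Returning an empty string keeps the summary list simple to build, since every post yields a string whether or not it starts with a paragraph.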

Discussion

  1. July 31, 2009 @ 8:52 am

    A few things I’d recommend: allow for attributes (such as a CSS style) on the <p> tag, make the regex case-insensitive, and have the regex treat the subject as a single line.

    preg_match("/<p.*>(.*)<\/p>/is",$post['post_content'],$matches);

  2. July 31, 2009 @ 9:08 am

    Good suggestion Matt!

  3. July 31, 2009 @ 9:35 am

    @Matt, you probably meant:

    preg_match("/<p[^>]*>(.*)<\/p>/is",$post['post_content'],$matches);

    Otherwise, it would match everything between the first <p> tag and the last </p>.

  4. July 31, 2009 @ 10:17 am

    @Jeremy, that’s it! I knew I was forgetting something.

  5. grouchy smurf
    July 31, 2009 @ 2:37 pm

    Have any of you ever read the docs?

    preg_match('#<p[^>]*>(.*)</p>#isU', $post['post_content'], $matches);

    The “i” modifier allows the paragraph tag name to be lower case or upper case. The “s” modifier allows the dot metacharacter to match all characters, including newlines. The “U” modifier makes the pattern match only the first paragraph instead of everything between the first <p> and the last </p>. Using “#” as the delimiter makes the pattern more readable when dealing with HTML tags, since the slash in </p> no longer needs escaping, and declaring the pattern string in single quotes stops PHP from searching it for variables to replace.

  6. August 6, 2009 @ 3:12 am

    David, are you doing this from within WordPress via a plugin or something, or outside WordPress and just accessing the database?

    If it is the former, there are better ways of doing it by using the WP functions, like the_excerpt_rss()…
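To see how the patterns from the discussion differ, here is a rough side-by-side sketch. The sample string and the array keys are made up for illustration; the three patterns are the ones from the post and the comments.

```php
<?php
// Made-up sample: the first paragraph has an uppercase tag, an attribute, and a line break.
$content = "<P class=\"lead\">First\nparagraph.</P>\n<p>Second paragraph.</p>";

$patterns = [
    'original'   => '/<p>(.*)<\/p>/',          // from the post
    'attributes' => '/<p[^>]*>(.*)<\/p>/is',   // allows attributes, any case, dot matches newlines
    'ungreedy'   => '#<p[^>]*>(.*)</p>#isU',   // adds U so the match stops at the first </p>
];

$results = [];
foreach ($patterns as $name => $pattern) {
    // Store null when the pattern does not match at all.
    $results[$name] = preg_match($pattern, $content, $m) === 1 ? strip_tags($m[1]) : null;
}

var_dump($results);
// original   => "Second paragraph."                  (skips the first paragraph entirely)
// attributes => "First\nparagraph.\nSecond paragraph." (runs on to the last </p>)
// ungreedy   => "First\nparagraph."                  (just the first paragraph)
```

A lazy quantifier, (.*?), in place of the U modifier should give the same result for this sample.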
