Get the Intro: PHP Paragraph Regular Expression

Recently I was playing around with WordPress' wp_posts table. I wanted to grab the basic information about my posts (ID, title, content, slug) and build a quick summary list of them. One problem I ran into was creating a content "intro." Luckily a quick regular expression allowed me to create the intro.

The PHP

preg_match("/<p>(.*)<\/p>/",$post['post_content'],$matches);
$intro = strip_tags($matches[1]); //removes anchors and other tags from the intro

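The above code captures everything between the first <p> and the last </p> on the same line, and strip_tags() then removes any markup inside it. Since all of my posts begin with an introductory paragraph that sits on its own line, this extracts the first paragraph of each post.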
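A slightly more defensive variant can be sketched as follows. This is only an illustration, not the code running on this site: the helper name extract_intro() and the sample $post array are assumptions made for the example. It allows attributes on the opening tag, lets the paragraph span several lines, stops at the first closing tag, and returns an empty string when no paragraph is found.

```php
<?php
// Hypothetical helper (not part of the original post): returns the plain text
// of the first <p>...</p> block, or an empty string when there is none.
function extract_intro(string $html): string
{
    // \b keeps <p from also matching <pre> or <param>; [^>]* allows attributes;
    // (.*?) is lazy, so the match ends at the first </p>;
    // "i" ignores tag case and "s" lets the dot match newlines.
    if (preg_match('#<p\b[^>]*>(.*?)</p>#is', $html, $matches) !== 1) {
        return '';
    }

    // Remove anchors and other tags from the intro, then trim whitespace.
    return trim(strip_tags($matches[1]));
}

// Sample data standing in for a row from wp_posts (assumed for illustration).
$post = [
    'post_content' => "<p class=\"lead\">Hello <a href=\"#\">world</a>.</p>\n<p>Second paragraph.</p>",
];

$intro = extract_intro($post['post_content']);
echo $intro, PHP_EOL; // prints "Hello world."
```

Checking the return value of preg_match() before reading $matches[1] avoids an undefined-index notice on posts that contain no paragraph tags.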


Discussion

  1. A few things I’d recommend: allow for a CSS style in your regex, make the regex case-insensitive, and have the regex treat the subject as a single line.

    preg_match("/<p.*>(.*)<\/p>/is",$post['post_content'],$matches);
    
  2. Good suggestion Matt!

  3. @Matt, you probably meant:

    preg_match("/<p[^>]*>(.*)<\/p>/is",$post['post_content'],$matches);
    

    Otherwise, it would match everything between the first <p> tag and the last </p>

  4. @Jeremy, that’s it! I knew I was forgetting something.

  5. Grouchy Smurf

    Has any of you ever read the docs?

    preg_match('#<p[^>]*>(.*)</p>#isU', $post['post_content'], $matches);
    

    The “i” modifier allows the paragraph tag name to be lower case or upper case. The “s” modifier allows the dot metacharacter to match all characters, including newlines. The “U” modifier makes the pattern ungreedy, so it matches only the first paragraph instead of everything between the first <p> and the last </p>. Using “#” as the pattern delimiter makes the pattern more readable when dealing with HTML tags, and declaring the pattern string with single quotes keeps PHP from scanning it for variables to interpolate.

  6. David, are you doing this from within wordpress via a plugin or something, or outside wordpress and just accessing the database?

    If it’s the former, there are better ways of doing it using the WP functions, like the_excerpt_rss()…

  7. Cautious User

    Slight correction to Grouchy Smurf’s regex:

    $pattern = "#<p[^>]*>(.*?)</p>#is";

  8. What does this code actually do? Is there any way to style the first paragraph by separating it first in PHP, then using CSS?

  9. Nice idea, but how do I add the code? What if I just paste it after the_title? Will it work if I do that?

  10. Rich

    This worked like a champ, but let’s say I wanted to get the rest of the content now, minus the intro paragraph. How would I go about doing that? For example, I want to show the intro paragraph using your code, then create a div enclosing the rest of the actual content that I could toggle through a “read more” link.

  11. Hi,
    I’m having trouble with a PHP paragraph replace. I have the following paragraphs:

    {tab=Sed facilisis consequat libero}Fusce eu ligula purus, eu ultricies nisl.

    Fusce eu ligula purus, eu ultricies nisl. Curabitur sed leo felis. {/tab}

    {tab=Curabitur ultricies sapien}Sed facilisis consequat libero, at tincidunt neque tristique vitae. {/tab}

    My aim is to replace {tab=name}something{/tab} with something.

    The regular expression I’m using is: /({tab.+?})(.*\S)({\/tab})/s

    It works fine with a single line or paragraph, but not with multiple lines.

    Can anybody suggest how to fix it?
    Thanks in advance…

  12. Bips

    Could you please help me preg_match two or more paragraphs? I mean all the paragraphs from $post['post_content'].
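Several questions in the discussion above (getting the rest of the content without the intro, matching every paragraph, and unwrapping multi-line {tab=...}...{/tab} blocks) come down to the same two ideas: a lazy quantifier and the “s” modifier. The following is a minimal sketch under those assumptions; the helper names split_intro(), all_paragraphs(), and strip_tab_markers() are hypothetical and do not come from the post or from WordPress.

```php
<?php
// Hypothetical helper: splits HTML into the first paragraph and everything else.
function split_intro(string $html): array
{
    // PREG_OFFSET_CAPTURE also reports the byte offset where the match starts.
    if (preg_match('#<p\b[^>]*>.*?</p>#is', $html, $m, PREG_OFFSET_CAPTURE) !== 1) {
        return ['intro' => '', 'rest' => $html];
    }

    [$paragraph, $offset] = $m[0];

    // Glue together whatever came before and after the matched paragraph.
    $rest = substr($html, 0, $offset) . substr($html, $offset + strlen($paragraph));

    return ['intro' => $paragraph, 'rest' => trim($rest)];
}

// Hypothetical helper: returns the inner HTML of every paragraph, in order.
function all_paragraphs(string $html): array
{
    preg_match_all('#<p\b[^>]*>(.*?)</p>#is', $html, $matches);

    return $matches[1];
}

// Hypothetical helper: replaces {tab=name}something{/tab} with something.
// The lazy (.*?) stops at the nearest {/tab}, even across line breaks.
function strip_tab_markers(string $text): string
{
    return preg_replace('#\{tab=[^}]*\}(.*?)\{/tab\}#s', '$1', $text);
}

$html  = "<p>Intro</p>\n<p>Body one</p>\n<p>Body two</p>";
$parts = split_intro($html);

// The intro can be shown on its own and the rest wrapped in a toggleable div.
echo $parts['intro'], PHP_EOL;
echo '<div class="more">', $parts['rest'], '</div>', PHP_EOL;
```

Regular expressions remain a rough tool for HTML: nested or unclosed paragraph tags will confuse all three helpers, so a DOM parser is the safer choice for content you do not control.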
