Remove HTML Comments with PHP

When it comes to sending content to users, I'm of the belief that less is more.  There's no reason for HTML comments to be sent down to the user -- they simply bloat the payload.  I remove unwanted HTML comments within my WordPress theme, so I thought I'd share the regex that does it:

// Remove unwanted HTML comments
function remove_html_comments($content = '') {
	return preg_replace('/<!--(.|\s)*?-->/', '', $content);
}

That handy function, paired with output buffering, allows me to remove HTML comments from anywhere within the page.  Less load, less cruft for mobile users!
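
The post does not show the buffering step itself, so here is a minimal sketch of how the function could be paired with PHP's built-in `ob_start()`. Where the `ob_start()` call lives (a theme file, the top of a template, a bootstrap script) is an assumption; the only requirement is that it runs before any output is sent. The echoed markup is sample data for illustration.

```php
<?php
// The function from the post, reproduced so the sketch runs on its own
function remove_html_comments($content = '') {
	return preg_replace('/<!--(.|\s)*?-->/', '', $content);
}

// Start buffering before any output is sent; PHP hands the buffered
// markup to the callback when the buffer is flushed
ob_start('remove_html_comments');

// Sample page output (illustrative only)
echo "<p>Hello</p><!-- debug note --><p>World</p>";

// Flush the buffer through the callback
ob_end_flush(); // sends <p>Hello</p><p>World</p>
```

Because the callback runs on every request, the cost of the regex is paid each time the page is rendered; whether that is acceptable depends on page size and any caching in front of PHP.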

Discussion

  1. MaxArt

    That would also strip out any comment-like sequences inside JavaScript code.
    A very rare case indeed, and mixing HTML and JavaScript is usually discouraged, but still…
    A fully-fledged HTML/JavaScript parser just to prevent this is hardly worth the effort here.

    Just remember that, for backward compatibility with older browsers, the content of script tags is often enclosed in a comment. This would remove the entire script.

    • MaxArt

      I’d like to add that I usually use the sequence [\s\S] instead of the (capturing) group (.|\s). I think it’s faster.

    • You can also do (?:.|\s) to make a group non-capturing. [\s\S] (whitespace or no whitespace) is nonsensical, you could just as well do . (any character).

      David: Why do you do .|\s? As far as I know, . captures all characters, including whitespace.

    • I’ll check it out Fred!

  2. When does this code run?

    The best use I could see for this is a build step, e.g. you take the template files and run them through this on deploy. It feels like a waste of CPU cycles to run something like this per request?

  3. I like this concept but where/when would you call the function for normal php pages? thx

  4. This is great.

    For my use, I’d prefer this being done from an htaccess file – is this possible at all?

  5. (v)

    what about MSIE conditional comments? ;-)

    my code is like:

    ...
    return preg_replace('/<!--(?!\s*(?:\[if [^\]]+]|))(?:(?!-->).)*-->/s', '', $content);

    • Awesome point, love this — I’ll check it out and if it works I’ll update my post!

    • I tried this but it didn’t work :/ No comments were stripped at all.

    • Hi David, (V), the following mix of your snippets works for me

      $data = preg_replace(‘//’, ”, $data);

  6. It depends on your framework; it should have a pipeline to minify the HTML before sending it to the client :D
    But thanks for your useful snippet :)

  7. Wouldn’t this alter IE conditional comments?

  8. Hi David, (V), the following mix of your snippets works for me

    $data = preg_replace(‘//’, ”, $data);

    2nd try, I used pre but the code was removed …

  9. Hi David, (V), the following mix of your snippets works for me

    http://pastebin.com/bfzWVFUi

    3rd try, I used pre but the code was removed … please delete my two previous comments

  10. Mike Smith

    I added this code to my functions.php file, however, visitors can still post strong html tags and images on my blog :(

  11. good concept and thanks for that

  12. spongeBob

    Nice approach. But it would be more believable if you also removed the HTML comments on this page. :) But I liked the regex.

  13. Jack

    Why even bother putting in HTML comments at all? Commenting is supposed to be for the eyes of future developers who will be reading the actual code, so I just comment in PHP and then don’t have to worry about comments being passed into the HTML.

  14. Full strip function

    function html2txt($document){
        $search = array(
            '@<script[^>]*?>.*?</script>@si',  // Strip out javascript
            '@<[\/\!]*?[^<>]*?>@si',           // Strip out HTML tags
            '@<style[^>]*?>.*?</style>@siU',   // Strip style tags properly
            '@<![\s\S]*?--[ \t\n\r]*>@'        // Strip multi-line comments including CDATA
        );
        $text = preg_replace($search, '', $document);
        return $text;
    }
    
  15. JoeB

    This crashes horribly if the comment inside the tag is very large.
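
Several points raised in the discussion can be folded into one variant of the function. What follows is a sketch under stated assumptions, not the author's updated code:

- The function name `remove_html_comments_safe` is hypothetical.
- The `<!` and `>` alternatives in the lookahead are my assumption about how to preserve IE conditional comments. The pattern as it appears in comment 5 has an empty alternative, so its negative lookahead can never succeed, which would explain the report that nothing was stripped.
- Without the `s` modifier, `.` does not match a newline, which is why the post needs `(.|\s)` or `[\s\S]`.
- A repeated group such as `(.|\s)*?` is a commonly cited cause of PCRE limits being hit on long inputs, which may be what comment 15 ran into. A plain `.*?` under the `s` modifier avoids the group.

```php
<?php
// Hypothetical variant of the post's remove_html_comments()
function remove_html_comments_safe($content = '') {
    // s modifier: "." also matches newlines, so no (.|\s) group is needed.
    // The negative lookahead skips IE conditional comment markers such as
    // "<!--[if lt IE 9]>", "<!-->" and "<!--<![endif]-->" (assumed list).
    $pattern = '/<!--(?!\s*(?:\[if [^\]]+]|<!|>)).*?-->/s';
    $stripped = preg_replace($pattern, '', $content);

    // preg_replace() returns null on a PCRE error (for example when a
    // limit is hit); fall back to the untouched markup in that case
    return $stripped === null ? $content : $stripped;
}

// Sample inputs (illustrative only)
$multiLine   = "<p>a</p><!-- one\ntwo --><p>b</p>";
$conditional = '<!--[if lt IE 9]><script src="shim.js"></script><![endif]-->';

var_dump(remove_html_comments_safe($multiLine) === '<p>a</p><p>b</p>'); // bool(true)
var_dump(remove_html_comments_safe($conditional) === $conditional);     // bool(true)

// Without the s modifier, "." stops at the newline and nothing is removed
var_dump(preg_replace('/<!--.*?-->/', '', $multiLine) === $multiLine);  // bool(true)
```

This still treats markup as plain text, so the caveat from the first comment applies: comment-like sequences inside script blocks are removed as well.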
