Using DOMDocument to Modify HTML with PHP
One of the first things you learn when implementing a service worker on a website is that the site requires SSL (an https address). Ever since I saw the blinding speed service workers can provide a website, I've been obsessed with readying my site for SSL. Enforcing SSL with .htaccess was easy -- the hard part is updating asset links in blog content. You start out feeling as though regular expressions will be the quick cure, but anyone who has experience with regular expressions knows that working with URLs is a nightmare and regex is probably the wrong decision.
The right decision is DOMDocument, a native PHP class which allows you to work with HTML in a logical, pleasant fashion. You start by loading the HTML into a DOMDocument instance and then use its predictable methods to make things happen.
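    // Formats post content for SSL
    function format_post_content($content = '') {
        // loadHTML() rejects an empty string, so return it untouched
        if ($content === '') {
            return $content;
        }
        $document = new DOMDocument();
        // Ensure UTF-8 is respected by using 'mb_convert_encoding'
        // (note: the 'HTML-ENTITIES' target is deprecated as of PHP 8.2)
        $document->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'));
        // Rewrite the src of every img element that points at the http:// origin
        $tags = $document->getElementsByTagName('img');
        foreach ($tags as $tag) {
            $tag->setAttribute('src',
                str_replace('http://davidwalsh.name',
                    'https://davidwalsh.name',
                    $tag->getAttribute('src')
                )
            );
        }
        // Serialize the modified document back to an HTML string
        return $document->saveHTML();
    }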
In my example above, I find all img elements and replace the http:// protocol in their src with https://. I will end up doing the same with iframe src, a href, and a few other rarely used tags. When my modifications are done, I call saveHTML to get the new string.
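The same loop can be stretched to cover the other tags mentioned above. What follows is only a sketch of one way to do it, not the code running on this site: the function name `force_https_on_tags`, the tag-to-attribute map, and the `$host` parameter are all made up for illustration. It also swaps `mb_convert_encoding` for `mb_encode_numericentity`, on the assumption that you are on a PHP version where the `HTML-ENTITIES` target is deprecated, and it silences libxml parser warnings while loading.

```php
<?php
// Hypothetical helper: the name and the $tagAttributes map are illustrative,
// not part of the original post.
function force_https_on_tags(string $content, array $tagAttributes, string $host = 'davidwalsh.name'): string
{
    // loadHTML() rejects empty input, so hand it back unchanged.
    if (trim($content) === '') {
        return $content;
    }

    $document = new DOMDocument();

    // Collect parser warnings (for example on HTML5 elements) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Encode non-ASCII characters as numeric entities so UTF-8 text survives parsing.
    $document->loadHTML(mb_encode_numericentity($content, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8'));
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Each entry maps a tag name to the attribute that holds its URL.
    foreach ($tagAttributes as $tagName => $attribute) {
        foreach ($document->getElementsByTagName($tagName) as $element) {
            if (!$element->hasAttribute($attribute)) {
                continue;
            }
            $element->setAttribute(
                $attribute,
                str_replace("http://$host", "https://$host", $element->getAttribute($attribute))
            );
        }
    }

    return $document->saveHTML();
}

// Example usage: only URLs on the given host are rewritten.
$html = '<p><img src="http://davidwalsh.name/a.png"> <a href="http://example.com/">elsewhere</a></p>';
echo force_https_on_tags($html, ['img' => 'src', 'iframe' => 'src', 'a' => 'href']);
```

One thing to keep in mind: when you load a fragment this way, saveHTML hands back a complete document (doctype, html, and body wrappers included), so you may need to strip those before storing the result as post content.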
Don't fall into the trap of trying to use regular expressions with HTML -- you're in for a future of failure. DOMDocument is lightweight and will make your code infinitely more maintainable.
So do you know if there is a performance hit with creating an element using this vs creating a string of html?
The right decision is skipping domain entirely if it isn’t hosted on some subdomain (`/path/to/asset`), and skipping protocol if it is (`//example.com/path/to/asset`).
David, rather than `str_replace` all your (internal) `http://` strings with `https://` you should replace them with `//` -- that way your links become protocol-agnostic, a more future-proof solution.
Why don’t you use the `search-replace` function in WP-CLI?
Why not remove the protocol completely? `//davidwalsh.name/` would default to whatever protocol is used in the address bar.
I agree that `//` would be better but some RSS feed readers use `http`, others `https`. I’m asserting complete control.