Broken Link Checker

Broken Link Checker by Steven Vachon is an outstanding Node.js-powered utility for recursively checking a website for broken links.  Broken links lead to bad user experiences and mistrust, two things that can cost you money and other types of conversions.  Broken Link Checker can be used in two ways:  from the command line and through a Node.js API.

Using Broken Link Checker from the Command Line

Broken Link Checker can be used from the command line once you install it globally with npm, which ships with Node.js:

npm install -g broken-link-checker

With the utility globally available, we can execute a command like this one to check a site for broken links; the -r switch is short for --recursive and -o is short for --ordered:

blc https://davidwalsh.name -ro

...which triggers a streaming list of results within your command line:

[Screenshot: Broken Link Checker results streaming in the terminal]

This is the fastest and easiest way to check for broken links!

Programmatic Broken Link Checker Usage

Broken Link Checker also exposes an awesome, highly customizable API so you can automate your own broken link checking.  Here's a quick look at the API:

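// Load the module; `options`, `html`, `baseUrl`, etc. below are placeholders you supply.
var blc = require("broken-link-checker");

// Scans an HTML document to find broken links.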
var htmlChecker = new blc.HtmlChecker(options, {
    html: function(tree, robots){},
    junk: function(result){},
    link: function(result){},
    complete: function(){}
});
htmlChecker.scan(html, baseUrl);

// Scans the HTML content at each queued URL to find broken links.
var htmlUrlChecker = new blc.HtmlUrlChecker(options, {
    html: function(tree, robots, response, pageUrl, customData){},
    junk: function(result, customData){},
    link: function(result, customData){},
    page: function(error, pageUrl, customData){},
    end: function(){}
});
htmlUrlChecker.enqueue(pageUrl, customData);

// Recursively scans (crawls) the HTML content at each queued URL to find broken links.
var siteChecker = new blc.SiteChecker(options, {
    robots: function(robots, customData){},
    html: function(tree, robots, response, pageUrl, customData){},
    junk: function(result, customData){},
    link: function(result, customData){},
    page: function(error, pageUrl, customData){},
    site: function(error, siteUrl, customData){},
    end: function(){}
});
siteChecker.enqueue(siteUrl, customData);

// Requests each queued URL to determine if they are broken.
var urlChecker = new blc.UrlChecker(options, {
    link: function(result, customData){},
    end: function(){}
});
urlChecker.enqueue(url, baseUrl, customData);

// Inside any `link` handler, inspect `result` to handle broken or excluded links
if (result.broken) {
    console.log(result.brokenReason);
    //=> HTTP_404
} else if (result.excluded) {
    console.log(result.excludedReason);
    //=> BLC_ROBOTS
}
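
The handlers above only receive one result at a time, so a report has to be accumulated by hand. Below is a minimal sketch of one way to group broken links by the page that contains them. The helper name createBrokenLinkCollector is hypothetical, and the result fields url.original and base.original are assumptions about the shape of the result object, so check them against the version you have installed. Simulated results stand in for a real crawl so the sketch runs without the package or network access.

```javascript
// Hypothetical helper: builds a `link` handler that groups broken links by page.
// Assumption: each result exposes `broken`, `brokenReason`, `url.original`
// (the link as written) and `base.original` (the page the link was found on).
function createBrokenLinkCollector() {
    var brokenByPage = {};

    return {
        // Intended to be passed as the handlers argument of a checker constructor.
        handlers: {
            link: function (result) {
                if (!result.broken) {
                    return; // working links are ignored
                }
                var page = result.base.original;
                if (!brokenByPage[page]) {
                    brokenByPage[page] = [];
                }
                brokenByPage[page].push({
                    url: result.url.original,
                    reason: result.brokenReason
                });
            }
        },
        // Returns the accumulated { pageUrl: [{ url, reason }, ...] } map.
        report: function () {
            return brokenByPage;
        }
    };
}

// Simulated results (made-up URLs) in place of a real crawl.
var collector = createBrokenLinkCollector();
collector.handlers.link({
    broken: true,
    brokenReason: "HTTP_404",
    url: { original: "/missing-page" },
    base: { original: "https://example.com/about" }
});
collector.handlers.link({
    broken: false,
    brokenReason: null,
    url: { original: "/contact" },
    base: { original: "https://example.com/about" }
});

console.log(JSON.stringify(collector.report(), null, 2));
```

With the package installed, collector.handlers could be passed as the second argument to new blc.SiteChecker(options, ...), and an end handler that prints collector.report() would give a per-page listing once the crawl finishes.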

The Broken Link Checker API also accepts header and advanced options covering redirect management, keywords, cache behavior, and more.  Broken Link Checker has everything!
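
As a rough illustration of what an options object might look like, here is a small sketch that merges per-project overrides into a shared base. The option names (excludedKeywords, excludeExternalLinks, honorRobotExclusions, cacheResponses) are recalled from the project's README and should be treated as assumptions to verify against your installed version; the values and the buildCheckerOptions helper are purely illustrative and are not the library's defaults.

```javascript
// Illustrative base options; names are assumptions, values are not library defaults.
var baseCheckerOptions = {
    excludedKeywords: [],        // keywords or glob patterns for links to skip
    excludeExternalLinks: false, // check links to other sites as well
    honorRobotExclusions: true,  // respect robots.txt and related exclusions
    cacheResponses: true         // avoid requesting the same URL twice
};

// Hypothetical helper: returns a new options object without mutating the base.
function buildCheckerOptions(overrides) {
    return Object.assign({}, baseCheckerOptions, overrides || {});
}

// Example: only check internal links and skip anything that looks like a logout URL.
var options = buildCheckerOptions({
    excludedKeywords: ["*/logout*"],
    excludeExternalLinks: true
});

console.log(options);
```

The resulting object would then be passed as the first argument to any of the checker constructors shown earlier.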

Discussion

  1. Rob

    I still find Xenu link sleuth to be really useful.
    It's a GUI rather than an API, but you get a handy report and essential info like the actual pages that contain the broken links, so you can fix them rather than just redirect.

    • I use Xenu too all the time, but for a JS site it was not finding all of the links. I had to give it the landing pages and even then it only found a few issues. Website is built in SharePoint.

    • Yeah, that’s becoming more and more of a problem.
      I’m glad you posted as I totally forgot about this and may need to use it soon.
      I just hope it now shows you the pages where it found the broken links rather than just listing broken ones.

  2. Thank you for this! Broken links also negatively affect ranking on Google.

  3. Ian

    Any idea how to generate an HTML report when running from command line?

  4. Markus

    the cli examples above should use long option names instead of short switches.

    e.g.
    --ordered instead of -o
    --recursive instead of -r
