Scrape Images with wget

By  on  

The desire to download all images or video on the page has been around since the beginning of the internet.  Twenty years ago I would accomplish this task with a python script I downloaded.  I then moved on to browser extensions for this task, then started using a PhearJS Node.js JavaScript utility to scrape images.  All of these solutions are nice but I wanted to know how I could accomplish this task from command line.

To scrape images (or any specific file extensions) from command line, you can use wget:

wget -nd -H -p -A jpg,jpeg,png,gif -e robots=off http://boards.4chan.org/sp/

The script above downloads images across hosts (i.e. from a CDN or other subdomain) to the directory from which the command is run from.  You'll see downloaded media as they come down:

Reusing existing connection to s.4cdn.org:80.
HTTP request sent, awaiting response... 200 OK
Length: 1505 (1.5K) [image/jpeg]
Saving to: '1490571194319s.jpg'

1490571194319s.jpg 100%[=====================>] 1.47K --.-KB/s in 0s

2017-03-26 18:33:26 (205 MB/s) - '1490571194319s.jpg' saved [1505/1505]

FINISHED --2017-03-26 18:33:26--
Total wall clock time: 2.7s
Downloaded: 66 files, 412K in 0.2s (2.10 MB/s)

Everyone loves cURL, which is another awesome resource, but don't foget about wget, which is arguably easier to use!

Recent Features

  • By
    9 More Mind-Blowing WebGL Demos

    With Firefox OS, asm.js, and the push for browser performance improvements, canvas and WebGL technologies are opening a world of possibilities.  I featured 9 Mind-Blowing Canvas Demos and then took it up a level with 9 Mind-Blowing WebGL Demos, but I want to outdo...

  • By
    JavaScript Promise API

    While synchronous code is easier to follow and debug, async is generally better for performance and flexibility. Why "hold up the show" when you can trigger numerous requests at once and then handle them when each is ready?  Promises are becoming a big part of the JavaScript world...

Incredible Demos

  • By
    prefers-color-scheme: CSS Media Query

    One device and app feature I've come to appreciate is the ability to change between light and dark modes. If you've ever done late night coding or reading, you know how amazing a dark theme can be for preventing eye strain and the headaches that result.

  • By
    Image Manipulation with PHP and the GD Library

    Yeah, I'm a Photoshop wizard. I rock the selection tool. I crop like a farmer. I dominate the bucket tool. Hell, I even went as far as wielding the wizard wand selection tool once. ...OK I'm rubbish when it comes to Photoshop.

Discussion

    Wrap your code in <pre class="{language}"></pre> tags, link to a GitHub gist, JSFiddle fiddle, or CodePen pen to embed!