Scrape Images with wget

The desire to download all of the images or video on a page has been around since the beginning of the internet.  Twenty years ago I would accomplish this task with a Python script I downloaded.  I then moved on to browser extensions, then started using the PhearJS Node.js utility to scrape images.  All of these solutions are nice, but I wanted to know how I could accomplish this task from the command line.

To scrape images (or files with any specific extensions) from the command line, you can use wget:

wget -nd -H -p -A jpg,jpeg,png,gif -e robots=off http://boards.4chan.org/sp/

The command above downloads images across hosts (i.e. from a CDN or other subdomain) into the directory from which the command is run.  You'll see downloaded media as they come down:

Reusing existing connection to s.4cdn.org:80.
HTTP request sent, awaiting response... 200 OK
Length: 1505 (1.5K) [image/jpeg]
Saving to: '1490571194319s.jpg'

1490571194319s.jpg 100%[=====================>] 1.47K --.-KB/s in 0s

2017-03-26 18:33:26 (205 MB/s) - '1490571194319s.jpg' saved [1505/1505]

FINISHED --2017-03-26 18:33:26--
Total wall clock time: 2.7s
Downloaded: 66 files, 412K in 0.2s (2.10 MB/s)
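
If you run this often, the command can be wrapped in a small shell function.  The sketch below is only one way to do it: the function name scrape_images, the optional target directory argument, and the WGET override variable are all made up for illustration.  The only flag it adds to the original command is -P (wget's directory prefix option), so the files can land somewhere other than the current directory.

```shell
# Hypothetical helper, not part of wget: scrape_images URL [DIR]
#
# Flags passed to wget:
#   -nd             don't recreate the remote directory hierarchy locally
#   -H              span hosts, so images on a CDN are fetched too
#   -p              download the page requisites (images, etc.)
#   -A ...          accept only files with these extensions
#   -e robots=off   ignore robots.txt
#   -P DIR          save files under DIR (defaults to the current directory)
scrape_images() {
    url=$1
    dir=${2:-.}

    # WGET is an assumed override hook for dry runs; it defaults to wget.
    ${WGET:-wget} -nd -H -p -A jpg,jpeg,png,gif -e robots=off -P "$dir" "$url"
}
```

With that defined, scrape_images http://boards.4chan.org/sp/ images would run the same download as above but save into an images directory.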

Everyone loves cURL, which is another awesome resource, but don't forget about wget, which is arguably easier to use!
