Recursively Download Files with Wget
Quick note for future reference. GNU's Wget enables you to download files and resources directly via Terminal/shell. The big tip here is the -r option, which tells Wget to download the target files recursively. For example, if you want to download an entire site:
wget -r https://example.com
If you omit the -r option, that command will download only the homepage, located at example.com. So add the -r to make it a recursive download.
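Tip: by default a recursive download goes up to five levels deep and will follow links all over the site. If you want a tighter crawl, the -l option caps the recursion depth and -np (no-parent) keeps Wget from wandering up into parent directories. For example, to grab only two levels of a hypothetical /docs/ section:
wget -r -l 2 -np https://example.com/docs/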
Note about robots.txt
By default, Wget checks and obeys any rules specified in the site's robots.txt file. For example, if example.com has a robots file with the following rules:
User-agent: *
Disallow: /private.html
Disallow: /secret.html
Disallow: /treasure.pdf
…then by default Wget will obey those rules and not download the “forbidden” files. The thing is, robots rules are not mandatory; ultimately it is up to the bot or agent (in this case Wget) to follow them or simply ignore them.
So to tell Wget to ignore the site’s robots rules, you can set the -e option, like so:
wget -e robots=off -r https://example.com
By setting the -e option to robots=off, Wget will ignore any rules contained in the site’s robots.txt file and just proceed to download everything (thanks to the -r option).
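And since -e simply runs a .wgetrc-style command, you can make this behavior permanent by putting the same setting in your ~/.wgetrc file instead (a minimal sketch; adjust as needed):
# ~/.wgetrc -- ignore robots.txt rules for all downloads
robots = off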
Bonus: Change the download destination
By default, Wget downloads files into the current working directory. To change that, use the cd command to go to the desired destination directory, for example:
cd /home/path/Downloads
That puts you in the folder located at /home/path/Downloads on your machine. So now when you run your Wget command, any files that you download will go to that location.
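Alternatively, you can skip the cd step entirely and set the destination right in the command with Wget’s -P (aka --directory-prefix) option; the path here is just a placeholder, as before:
wget -r -P /home/path/Downloads https://example.com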