Last updated: February 10, 2025
Downloading files from the web is a common task, and the wget command is one of the most powerful command-line tools for the job.
Whether we want to download a single file, an entire website, or several URLs in one command, wget offers flexible options to automate the work. Simple yet effective, wget is used extensively in scripting, automation, and system administration.
In this tutorial, we’ll explore different techniques for downloading URLs using wget. We’ll look at basic usage, efficient handling of multiple downloads, and various options that optimize the process for different needs.
The wget command is a simple yet powerful tool for downloading files from the internet. It’s highly versatile, supporting multiple protocols such as HTTP, HTTPS, and FTP. Whether fetching one file or automating large downloads, wget gets the job done efficiently. To begin, let’s make sure wget is installed on our Linux system.
Most Linux distributions come with wget preinstalled. However, if it’s missing, on Debian-based systems we can install it with apt:
$ sudo apt update && sudo apt install wget -y
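The command above assumes a Debian-based distribution; on other systems the package manager differs. For example, on Fedora or RHEL-based systems we’d typically use dnf instead:
$ sudo dnf install wget -y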
Afterwards, we can start downloading a file using wget:
$ wget https://file-examples.com/wp-content/storage/2017/10/file-example_PDF_1MB.pdf
--2025-01-28 22:20:35-- https://file-examples.com/wp-content/storage/2017/10/file-example_PDF_1MB.pdf
Resolving file-examples.com (file-examples.com)... 185.135.88.81
Connecting to file-examples.com (file-examples.com)|185.135.88.81|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘file-example_PDF_1MB.pdf’
file-example_PDF_1MB.pdf [ <=> ] 1.19K --.-KB/s in 0s
2025-01-28 22:20:36 (41.1 MB/s) - ‘file-example_PDF_1MB.pdf’ saved [1223]
Here, wget downloads file-example_PDF_1MB.pdf from https://file-examples.com/wp-content/storage/2017/10/ and saves it in our current directory. By default, the file retains its original name. If the download is interrupted, we can resume it later using the -c option, which makes wget a reliable tool for handling large files and unstable connections.
For a large number of files, running wget separately for each URL would be inefficient. Thankfully, wget can download multiple files with a single command, which saves time and avoids repeated invocations. We can specify multiple URLs directly, store the links in a file, or pass them inline.
The easiest way to fetch multiple files is by passing all URLs together:
$ wget https://file-examples.com/wp-content/storage/2017/10/file-example_PDF_1MB.pdf https://file-examples.com/wp-content/storage/2017/10/file-sample_150kB.pdf
--2025-01-28 22:22:44-- https://file-examples.com/wp-content/storage/2017/10/file-example_PDF_1MB.pdf
Resolving file-examples.com (file-examples.com)... 185.135.88.81
Connecting to file-examples.com (file-examples.com)|185.135.88.81|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘file-example_PDF_1MB.pdf’
file-example_PDF_1MB.pdf [ <=> ] 1.19K --.-KB/s in 0s
2025-01-28 22:22:45 (27.6 MB/s) - ‘file-example_PDF_1MB.pdf’ saved [1223]
--2025-01-28 22:22:45-- https://file-examples.com/wp-content/storage/2017/10/file-sample_150kB.pdf
Reusing existing connection to file-examples.com:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘file-sample_150kB.pdf’
file-sample_150kB.pdf [ <=> ] 1.19K --.-KB/s in 0s
2025-01-28 22:22:45 (16.1 MB/s) - ‘file-sample_150kB.pdf’ saved [1223]
FINISHED --2025-01-28 22:22:45--
Total wall clock time: 0.6s
Downloaded: 2 files, 2.4K in 0s (20.3 MB/s)
Here, wget downloads both file-example_PDF_1MB.pdf and file-sample_150kB.pdf to our current directory. Although this method is quick for a few files, it can become cumbersome if we have a long list of URLs. Therefore, let’s explore other ways of downloading multiple URLs together.
A more efficient way to handle multiple URLs is by storing them in a text file and passing it to wget:
$ echo -e "https://file-examples.com/wp-content/storage/2017/10/file-example_PDF_1MB.pdf\nhttps://file-examples.com/wp-content/storage/2017/10/file-sample_150kB.pdf" > download.txt
Here, we first create a new file, download.txt, containing all the URLs that we want to download. Now, instead of typing each URL manually, we can instruct wget to read from this file and download all the listed files:
$ wget -i download.txt
This approach is ideal for handling a large number of files, as it minimizes manual input and keeps the downloads well-structured and easy to manage.
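For instance, if the file names follow a known pattern, we can build download.txt in a small loop instead of typing the list by hand, and then hand it to wget in one go:
$ for name in file-example_PDF_1MB.pdf file-sample_150kB.pdf; do echo "https://file-examples.com/wp-content/storage/2017/10/$name"; done > download.txt
$ wget -i download.txt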
Alternatively, if we don’t want to create a separate text file to store the URLs, we can directly pass multiple URLs via an inline approach:
$ wget -i - <<< "https://file-examples.com/wp-content/storage/2017/10/file-example_PDF_1MB.pdf
https://file-examples.com/wp-content/storage/2017/10/file-sample_150kB.pdf"
Here, the -i - option tells wget to read URLs from standard input, and we use a here-string (<<<) to provide them directly. This method is useful for quick downloads without cluttering our system with extra files.
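Equivalently, we can pipe the URLs into wget’s standard input, which is handy when another command generates the list dynamically:
$ printf '%s\n' "https://file-examples.com/wp-content/storage/2017/10/file-example_PDF_1MB.pdf" "https://file-examples.com/wp-content/storage/2017/10/file-sample_150kB.pdf" | wget -i -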
When downloading multiple files, we need greater control to achieve better speed, avoid duplicates, and maintain directory structures. To that end, wget offers several advanced options to help us manage bulk downloads efficiently. Let’s explore these approaches one by one.
Instead of specifying multiple URLs manually, we can use a loop to process them dynamically. This is particularly useful when URLs follow a predictable pattern or are generated programmatically.
For example, suppose we need to download 10 sequentially numbered images from a website:
$ for i in {1..10}; do wget https://example.com/image$i.jpg; done
This loop iterates from 1 to 10, replacing $i with each number in the sequence. As a result, wget downloads image1.jpg, image2.jpg, and so on, up to image10.jpg.
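Since the shell performs brace expansion before running the command, we can achieve the same result with a single wget invocation, which also lets wget reuse the HTTP connection instead of starting a new process per file:
$ wget https://example.com/image{1..10}.jpg
If the file names are zero-padded (image01.jpg, image02.jpg, and so on), {01..10} generates the matching sequence.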
Sometimes, re-downloading files that already exist locally creates duplicates (wget appends .1, .2, and so on to avoid overwriting) or wastes bandwidth on files that haven’t changed. To avoid this, we can use the -N option to skip unchanged files:
$ wget -N -i download.txt
The -N flag ensures that wget only downloads a file if it’s newer than the existing one. This is helpful for updating files without unnecessary re-downloads.
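As a minimal sketch, assuming we keep the fetched files in a dedicated downloads directory, we can combine -N with -P so that repeated runs only refresh files that have changed on the server:
$ wget -N -P downloads -i download.txt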
By default, wget saves all files in the current directory. If we want to keep the original directory structure, we can use:
$ wget -x -nH -i download.txt
......
Downloaded: 2 files, 2.4K in 0s (27.8 MB/s)
$ ls
download.txt wp-content
The -x option enables directory recreation, while -nH removes the hostname from the path. This keeps files organized and prevents name conflicts. We can see that the new files are downloaded to the wp-content directory.
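If the recreated hierarchy is deeper than we need, the --cut-dirs option strips a given number of leading path components. For example, cutting the first two components (wp-content/storage in the URLs above) keeps only the 2017/10/... part:
$ wget -x -nH --cut-dirs=2 -i download.txt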
wget has several options to tune its behavior to whatever a particular case demands: we can limit download speeds, retry failed downloads, or change the user-agent string.
To prevent wget from consuming too much bandwidth, we can set a download speed limit:
$ wget --limit-rate=500k http://example.com/largefile.zip
The --limit-rate option restricts the download speed to 500 KB/s, ensuring that other network activities aren’t disrupted.
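We can apply the same cap to a whole batch by combining it with a URL list, so a long-running job doesn’t saturate the link (a sketch reusing the download.txt list from earlier):
$ wget --limit-rate=500k -i download.txt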
If a download fails due to network issues, we can enable automatic retries:
$ wget --tries=5 http://example.com/file.zip
The --tries=5 option makes wget retry up to five times before giving up. This is useful for unstable connections.
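For flaky networks, we can also add a pause between attempts and treat refused connections as transient errors; both are standard wget flags:
$ wget --tries=5 --waitretry=10 --retry-connrefused http://example.com/file.zip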
If we’re downloading large files and the connection gets interrupted, it’s irritating to start from scratch. Fortunately, we can tell wget to resume incomplete downloads with the -c option:
$ wget -c https://example.com/largefile.zip
This is particularly useful for slow connections or unreliable networks, ensuring that we don’t waste time re-downloading already completed portions of a file.
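Combining -c with a retry limit gives a download that automatically picks up where it left off after each interruption (the URL here is a placeholder):
$ wget -c --tries=10 https://example.com/largefile.zip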
Some servers block automated downloads based on the user-agent. We can bypass this by setting a custom user-agent:
$ wget --user-agent="Mozilla/5.0" http://example.com/protectedfile.zip
Here, wget mimics a web browser by using a Mozilla user-agent, allowing access to servers that restrict automated tools.
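If the server also inspects other request headers, we can send extra ones with --header alongside the custom user-agent; the Referer value below is only an illustrative assumption:
$ wget --user-agent="Mozilla/5.0" --header="Referer: http://example.com/" http://example.com/protectedfile.zip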
In this article, we explored wget, a powerful and flexible utility for downloading files from the web. Whether we need to fetch a single file or many, it offers a simple yet effective solution.
We can pass multiple URLs directly, store the links in a file, or provide them inline. We can also fine-tune downloads for various use cases by adjusting options such as -N to skip unchanged files and -x -nH to preserve directory structures.
By mastering wget, we can automate repetitive download tasks and fetch large amounts of data with ease. Its versatility makes it a go-to tool for system administrators, developers, and anyone who works with online resources.
Exploring additional options and integrating wget into scripts can further enhance our workflow, making downloads more efficient and manageable.
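As a closing sketch, here’s how several of these options might come together in a small, reusable script; the retry count, rate limit, and file names are assumptions to adapt to a real workload:
$ cat download-all.sh
#!/bin/bash
# Fetch every URL listed in download.txt with resume, retries, a bandwidth cap,
# and a log file, so the script can be re-run safely after an interruption
wget -c --tries=5 --waitretry=10 --limit-rate=500k -i download.txt -o wget.log
$ bash download-all.sh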