1. Introduction

When working with Linux, we often use the wget command to download web files. It’s scriptable, fast, and supports the HTTP, HTTPS, and FTP protocols. However, some websites block requests from tools like wget by detecting them through the user-agent string, which tells the server what kind of client is making the request. To handle that, we can modify the default user-agent to resemble a browser or some other client. This allows us to work around straightforward blocks or retrieve content intended for browsers.

In this tutorial, we’ll discuss how to use custom user-agent strings with wget, why it’s needed, and where it comes in handy in real-world applications.

2. What Is a User-Agent and Why It Matters

When we download web content using tools such as wget, the request contains a header named User-Agent. The header informs the web server about the kind of client making the request: a browser, for instance, or a command-line tool such as wget. Sites use the user-agent to decide how to respond, and some deny requests from non-browser tools. By default, wget sends a user-agent string such as Wget/1.21.5. If wget isn’t already present on our system, let’s install it:

$ sudo apt install wget

To see what user-agent wget sends by default, we can make a request to a service that echoes it back:

$ wget -qO- https://httpbin.org/user-agent
{
  "user-agent": "Wget/1.21.5"
}

This confirms that wget is sending its default identifier, which some websites block to prevent automated access. To work around this, we can make wget look like a regular browser by setting a custom user-agent, which we’ll discuss next.
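The same check can be scripted. Below is a minimal Python sketch, assuming wget is on the PATH and that the echo service returns JSON shaped like the output above. The function names extract_user_agent and reported_user_agent are our own and aren't part of wget or httpbin:

```python
import json
import subprocess


def extract_user_agent(body: str) -> str:
    """Pull the "user-agent" field out of the JSON body returned by the echo service."""
    return json.loads(body)["user-agent"]


def reported_user_agent(url: str = "https://httpbin.org/user-agent") -> str:
    """Run wget quietly, capture the response body, and return the echoed user-agent."""
    result = subprocess.run(
        ["wget", "-qO-", url],  # same options as the command shown above
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if wget exits with an error
    )
    return extract_user_agent(result.stdout)
```

Calling reported_user_agent() without arguments repeats the request from the command above and returns the string that wget sent, which is handy for comparing the value before and after a configuration change.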

3. Set a Custom User-Agent in wget

Changing the default user-agent in wget is simple and can help when retrieving content from sites that restrict automated tools. We can specify a custom user-agent for a single request or as a default for all subsequent operations, depending on our requirements. Let’s set a custom user-agent for one-time use:

$ wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64)" https://example.com

This sends the request with a browser-like user-agent string, which makes the server more likely to treat it like a normal browser request. If we regularly need the same user-agent, we can make the change permanent by adding it to the wget config file:

$ echo 'user_agent = Mozilla/5.0 (Windows NT 10.0; Win64; x64)' >> ~/.wgetrc

Adding this line to the .wgetrc file in the home directory sets the default user-agent for all subsequent wget requests. Consequently, we no longer need to specify the user-agent on each call. A --user-agent option given on the command line still takes precedence over this setting.
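One caveat is that echo with >> appends a new line on every run, so repeating the command leaves several user_agent entries in the file. As a minimal sketch, assuming a plain-text wgetrc with one setting per line, the hypothetical helper set_wgetrc_user_agent below rewrites the file so that it contains exactly one user_agent line:

```python
from pathlib import Path


def _is_user_agent_setting(line: str) -> bool:
    """Check whether a wgetrc line sets the user agent.

    wgetrc command names ignore case, underscores, and dashes,
    so "User-Agent" and "useragent" name the same setting.
    """
    name = line.split("=", 1)[0].strip().lower()
    return name.replace("_", "").replace("-", "") == "useragent"


def set_wgetrc_user_agent(wgetrc: Path, agent: str) -> None:
    """Keep all other settings and write a single user_agent line at the end."""
    lines = wgetrc.read_text().splitlines() if wgetrc.exists() else []
    # Drop earlier user_agent settings so repeated runs don't pile up duplicates
    kept = [line for line in lines if not _is_user_agent_setting(line)]
    kept.append(f"user_agent = {agent}")
    wgetrc.write_text("\n".join(kept) + "\n")
```

For the per-user configuration, we could call it as set_wgetrc_user_agent(Path.home() / ".wgetrc", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"). Running it twice leaves the file unchanged after the first call.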

4. Advanced Techniques: Rotating User-Agents

When we continuously use the same user-agent, we can expect blocks, especially during automation or scraping. To reduce detection and mimic real browser activity, we can rotate through a list of user-agent strings. This adds randomness to our requests, which makes them appear more human-like. For this setup, we’ll be using a simple Python script, so let’s install Python if we haven’t already:

$ sudo apt install python3

Now, let’s see a basic Python script that randomly picks a user-agent from a list and uses it with wget:

#!/usr/bin/env python3

import random
import subprocess

# List of user-agent strings
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)"
]

# Target URL
url = "https://example.com"

# Choose a random user-agent
chosen_agent = random.choice(user_agents)

# Use wget with the chosen user-agent
subprocess.run(["wget", "--user-agent", chosen_agent, url])

In this script, we first define a list of user-agent strings. We then pick one of them at random using Python’s random.choice function. Finally, we invoke wget with the chosen agent using subprocess.run. Since the choice is random, the user-agent can change from one run to the next, which makes our requests harder to fingerprint by this header alone and may reduce the chance of being blocked. This approach is simple, reusable, and well suited to rotating agents in small-scale automation.
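The script above picks one agent per run. When a single run downloads several URLs, we may want a fresh choice for each request. The following is a rough sketch of that idea, not a definitive implementation. The helper names plan_downloads and run_downloads and the rng parameter are our own assumptions for illustration:

```python
import random
import subprocess

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]


def plan_downloads(urls, agents=USER_AGENTS, rng=random):
    """Return one wget argument list per URL, each with a randomly chosen user-agent."""
    return [["wget", f"--user-agent={rng.choice(agents)}", url] for url in urls]


def run_downloads(urls):
    """Run the planned wget commands one after another."""
    for command in plan_downloads(urls):
        subprocess.run(command)
```

Separating the planning step from the execution step lets us inspect or log the exact commands before anything is downloaded. Passing a seeded random.Random instance as rng makes the selection repeatable while testing.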

5. Troubleshooting and Best Practices

Some servers may still block our requests even when we specify a custom user-agent. This is because they can rely on other signals, such as the Accept and Referer headers or even IP reputation, to detect automation. To handle these scenarios, we can include additional headers in our wget command to emulate real browser behavior more closely. Here is an example where we include extra headers along with a custom user-agent:

$ wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64)" \
       --header="Accept: text/html" \
       --header="Referer: https://www.google.com" \
       https://example.com

In this case, we used the --header option to send both the Accept and Referer headers. These let us mimic the way a browser typically requests data from a server. The Accept header tells the server what types of content we’d prefer, and the Referer header makes the request look like we followed a link from some other site, such as Google. Small details like these can make the difference between getting blocked and not.
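If we drive wget from Python, the same command can be assembled from a dictionary of headers. This is only a sketch under our own naming: build_wget_command is a hypothetical helper, and it uses nothing beyond the --user-agent and --header options shown above:

```python
import subprocess


def build_wget_command(url, user_agent, headers=None):
    """Build a wget argument list with a custom user-agent and optional extra headers."""
    command = ["wget", f"--user-agent={user_agent}"]
    # Each extra header becomes its own --header="Name: value" option
    for name, value in (headers or {}).items():
        command.append(f"--header={name}: {value}")
    command.append(url)
    return command


def fetch(url, user_agent, headers=None):
    """Run wget with the assembled options and return its exit code."""
    return subprocess.run(build_wget_command(url, user_agent, headers)).returncode
```

Because the arguments are passed as a list, no shell quoting is needed for the spaces and parentheses in the header values.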

Lastly, we should always respect the terms of service of the website. Just because we can modify headers doesn’t mean we should work around a site’s rules. Using wget responsibly helps us avoid IP bans, legal issues, and undue stress on remote servers.

6. Conclusion

In this article, we discussed various ways to use custom user-agent strings with wget. The wget tool becomes even more useful when we understand how to set and customize the user-agent string. It lets us access content that would otherwise be blocked and makes it possible to present our requests as coming from a real browser.

Using a custom user-agent helps ensure web servers respond properly, whether for one-time requests or ongoing configurations. We also saw how to rotate user-agent values in Python and how to send additional headers to reduce detection. When properly applied, wget with a custom user-agent is a good tool to add to any Linux user’s toolkit.