oxylabs/wget-proxy

How to Use Wget With Proxy

Wget is a popular command-line utility that can download files from the web. It’s part of the GNU Project and, as a result, is commonly bundled with many Linux distributions.

This article covers installing Wget, downloading files with it, and routing its requests through a proxy.

For a detailed explanation, see our blog post.

How to install Wget

Wget can be downloaded from the official GNU channel and installed manually.

To install Wget on Ubuntu/Debian, open the terminal and run the following command:

sudo apt-get install wget

To install Wget on CentOS/RHEL, open the terminal and run the following command:

sudo yum install wget

If you’re using macOS, we highly recommend using the Homebrew package manager. Open the terminal and run the following command:

brew install wget

If you’re using Windows, the Chocolatey package manager is a good choice. When using Chocolatey, run the following command from the command line or PowerShell:

choco install wget

Lastly, to verify the installation of Wget, run the following command:

wget --version
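
If you want to run this check from a script, the following is a minimal sketch that tests whether a wget binary is on the PATH. The variable name wget_status is only illustrative:

```shell
# Check whether a wget binary is reachable on PATH.
if command -v wget >/dev/null 2>&1; then
  wget_status="installed"
else
  wget_status="missing"
  echo "wget not found on PATH" >&2
fi
echo "wget status: ${wget_status}"
```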

Running Wget

Open the terminal and enter the following:

wget -h

This lists all the options Wget accepts, grouped into categories such as Startup, Logging, and Download.

Downloading a single file

To download a single file, run Wget and type in the complete URL of the file:

wget https://ftp.gnu.org/gnu/wget/wget2-2.0.0.tar.lz
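
For a plain URL like this one, Wget saves the file under the last path component of the URL. The sketch below only derives that name so you know what to look for after the download. The alternative file name in the comment is a hypothetical example:

```shell
url="https://ftp.gnu.org/gnu/wget/wget2-2.0.0.tar.lz"
# Strip everything up to the last slash to get the default local file name.
filename="${url##*/}"
echo "$filename"
# To save under a different name instead (custom.tar.lz is a hypothetical name):
# wget --output-document=custom.tar.lz "$url"
```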

Changing the User-Agent

To identify the User-Agent used by Wget, request this URL:

wget https://httpbin.org/user-agent

This command will download a file named user-agent without any extension. To view the contents of this file, use the cat command:

~$ cat user-agent
{
  "user-agent": "Wget/1.21.2"
}

The default User-Agent can be modified using the --header option:

wget --header "user-agent: DESIRED USER AGENT" URL-OF-FILE

The following example should clarify it further:

~$ wget --header "user-agent: Mozilla/5.0 (Macintosh)" https://httpbin.org/user-agent
~$ cat user-agent
{
  "user-agent": "Mozilla/5.0 (Macintosh)"
}
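
Wget also has a dedicated --user-agent option (short form -U) that sets the same header. The following is a minimal sketch of a wrapper around it. The function name fetch_with_ua and the WGET override variable are assumptions made for illustration, not part of Wget:

```shell
# Fetch a URL to standard output with a custom User-Agent.
# $1 = User-Agent string, $2 = URL.
# WGET may be set to another command (for example echo) for a dry run.
fetch_with_ua() {
  "${WGET:-wget}" --quiet --output-document=- --user-agent="$1" "$2"
}

# Example call (needs network access):
# fetch_with_ua "Mozilla/5.0 (Macintosh)" https://httpbin.org/user-agent
```

Setting WGET=echo before the call prints the arguments instead of sending a request, which is handy for checking the quoting of a User-Agent string that contains spaces.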

Downloading multiple files

The first method is to list all the URLs on the command line, separated by spaces. The following command will download files from all three URLs:

~$ wget http://example.com/file1.zip http://example.com/file2.zip http://example.com/file3.zip

The second method is to write all the URLs in a file and use the -i or --input-file option:

~$ wget --input-file=urls.txt
~$ wget -i urls.txt
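
The input file is a plain text file with one URL per line. As a minimal sketch, it can be created with a here-document. The URLs are the same placeholder addresses used above:

```shell
# Write one URL per line into urls.txt in the current directory.
cat > urls.txt <<'EOF'
http://example.com/file1.zip
http://example.com/file2.zip
http://example.com/file3.zip
EOF

# Count the non-empty lines to confirm how many URLs Wget will read.
url_count=$(grep -c . urls.txt)
echo "urls.txt contains ${url_count} URLs"
```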

Extracting links from a webpage

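The --input-file option also accepts a URL. Wget fetches that page and, if it is served as HTML, reads the links it contains as the list of files to download: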

~$ wget --input-file=https://ftp.gnu.org/gnu/wget

To download all files with a .sig extension, use the following command:

~$ wget --recursive --no-parent --no-directories --no-clobber --accept=sig --input-file=https://ftp.gnu.org/gnu/wget

Using proxies with Wget

The first method uses command line switches to specify the proxy server and authentication details.

First, check your current IP address. Run Wget in quiet mode and write the response to standard output instead of saving it to a file:

~$ wget --quiet --output-document=- https://ip.oxylabs.io/location
# OR
~$ wget -q -O - https://ip.oxylabs.io/location

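To use a proxy that doesn’t require authentication, pass two -e (or --execute) switches: one to enable proxy use and one to set the proxy address. Wget selects the proxy by the scheme of the target URL, so an https:// URL needs https_proxy, while http_proxy applies only to http:// URLs:

~$ wget -q -O- -e use_proxy=yes -e https_proxy=12.13.14.15:1234 https://ip.oxylabs.io/location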

Output:

{"ip":"104.200.141.20","providers":{"dbip":{"country":"US","asn":"AS46562","org_name":"Performive LLC","city":"New York","zip_code":"","time_zone":"","meta":"\u003ca href='https://db-ip.com'\u003eIP Geolocation by DB-IP\u003c/a\u003e"},"ip2location":{"country":"US","asn":"","org_name":"","city":"New York City","zip_code":"10011","time_zone":"-05:00","meta":"This site or product includes IP2Location LITE data available from \u003ca href=\"https://lite.ip2location.com\"\u003ehttps://lite.ip2location.com\u003c/a\u003e."},"ipinfo":{"country":"US","asn":"AS46562","org_name":"Performive LLC","city":"","zip_code":"","time_zone":"","meta":"\u003cp\u003eIP address data powered by \u003ca href=\"https://ipinfo.io\" \u003eIPinfo\u003c/a\u003e\u003c/p\u003e"},"maxmind":{"country":"US","asn":"AS46562","org_name":"PERFORMIVE","city":"","zip_code":"","time_zone":"-06:00","meta":"This product includes GeoLite2 Data created by MaxMind, available from https://www.maxmind.com."}}}
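
To compare the address before and after enabling the proxy, it helps to pull just the ip field out of the response. The following is a minimal sketch using sed, assuming the field appears as "ip":"..." exactly as in the output above. The response variable holds a shortened sample, and a JSON-aware tool would be more robust:

```shell
# Shortened sample of the JSON response shown above.
response='{"ip":"104.200.141.20","providers":{}}'

# Capture the value of the top-level "ip" field.
ip=$(printf '%s' "$response" | sed -n 's/.*"ip":"\([^"]*\)".*/\1/p')
echo "$ip"
```

With a live request, the same sed expression can be placed at the end of a pipe after wget -q -O-.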

If the proxy server requires authentication, pass the username with the --proxy-user switch and the password with the --proxy-password switch. The target URL here uses HTTPS, so the proxy address is set through https_proxy:

~$ wget -q -O- -e use_proxy=yes -e https_proxy=12.13.14.15:1234 --proxy-user=your_username --proxy-password=your_password https://ip.oxylabs.io/location
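
Wget consults http_proxy for http:// URLs and https_proxy for https:// URLs, so the setting passed with -e has to match the scheme of the target. The following sketch picks the matching setting. The function name proxy_setting_for is hypothetical:

```shell
# Print the -e setting that matches the scheme of the target URL.
# $1 = target URL, $2 = proxy host:port.
proxy_setting_for() {
  case "$1" in
    https://*) echo "https_proxy=$2" ;;
    http://*)  echo "http_proxy=$2" ;;
    *)         echo "unsupported URL scheme: $1" >&2; return 1 ;;
  esac
}

proxy_setting_for https://ip.oxylabs.io/location 12.13.14.15:1234
```

The printed value can be passed directly after -e, next to -e use_proxy=yes.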

The second method is to use the .wgetrc configuration file.

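In the ~/.wgetrc file, enter the following lines. Set https_proxy as well if HTTPS URLs should also go through the proxy:

use_proxy = on
http_proxy = http://12.13.14.15:1234
https_proxy = http://12.13.14.15:1234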

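If you also need to set user authentication for the proxy, modify the file as follows (keep the https_proxy line only if HTTPS URLs should also use the proxy):

use_proxy = on
http_proxy = http://your_username:your_password@12.13.14.15:1234
https_proxy = http://your_username:your_password@12.13.14.15:1234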

From now on, every time Wget runs, it’ll use the specified proxy.

~$ wget -q -O- https://ip.oxylabs.io
# Prints IP of the proxy server
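
To try proxy settings without editing ~/.wgetrc, you can point Wget at a different startup file through the WGETRC environment variable. The following is a minimal sketch that writes the settings to a temporary file. The https_proxy line is included on the assumption that HTTPS URLs should use the proxy too:

```shell
# Create a throwaway startup file instead of touching ~/.wgetrc.
rc_file="$(mktemp)"
cat > "$rc_file" <<'EOF'
use_proxy = on
http_proxy = http://12.13.14.15:1234
https_proxy = http://12.13.14.15:1234
EOF

# Wget reads this file at startup while WGETRC is set in the environment.
export WGETRC="$rc_file"
echo "Wget will read settings from ${WGETRC}"
```

Running unset WGETRC restores the default behavior of reading ~/.wgetrc.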

If you wish to learn more about Wget, see our blog post.