
03/06/2019

wget Command: Download Compressed File By Sending gzip Headers



I have turned on gzip compression on my web server, since modern web browsers support and accept compressed data transfer. However, I am unable to request compressed data with the wget command. How do I force wget to download a file using gzip encoding?

GNU wget is a free utility, installed by default on most Linux distributions, for non-interactive download of files from the Web. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.

With the --save-headers option, wget can also save the headers sent by the HTTP server to the file, preceding the actual contents, with an empty line as the separator.

The --header option

The syntax is as follows:

wget --header='HEADER-LINE' http://server1.sxi.io/file.tar.gz
wget -option1 --header='HEADER-LINE' http://server1.sxi.io/images.bmp 
 
### compressed speed test ###
wget -O /dev/null --header='HEADER-LINE' http://server1.sxi.io/lib1html5v2.js
 
### debug on screen ##
wget -O- --header='HEADER-LINE' http://server1.sxi.io/file.tar.gz

wget sends HEADER-LINE along with the rest of the headers in each HTTP request. The supplied header is sent as-is, which means it must contain a name and a value separated by a colon, and must not contain newlines. You may define more than one additional header by specifying --header more than once, as follows:

wget --header='Accept-Charset: iso-8859-2' --header='Accept-Language: hr'  http://server1.sxi.io/file.css
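
The two rules above (a name and a value separated by a colon, and no newlines) can be checked before a header is handed to wget. The following is a minimal sketch; valid_header is a hypothetical helper name used only for illustration and is not part of wget:

```shell
#!/bin/sh
# valid_header (hypothetical helper): succeeds only if the argument
# has a non-empty name, a colon, and no embedded newline.
valid_header() {
    nl='
'
    case "$1" in
        *"$nl"*) return 1 ;;   # embedded newline: reject
        ?*:*)    return 0 ;;   # non-empty name followed by a colon: accept
        *)       return 1 ;;   # no colon at all: reject
    esac
}

# Offline demonstration: nothing is downloaded here.
for h in 'Accept-Encoding: gzip' 'Accept-Language hr'; do
    if valid_header "$h"; then
        echo "ok: $h"
    else
        echo "rejected: $h"
    fi
done
```

A checked header could then be passed on with something like valid_header "$h" && wget --header="$h" URL. The check is deliberately loose: it does not validate the header name against the HTTP grammar.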

Example: Testing gzip encoding with wget command

To request a gzip-encoded response, enter:
$ wget --header='Accept-Encoding: gzip' http://sxi.io/hardware/linux-find-and-recover-wasted-disk-space/
Sample outputs:

--2012-10-28 17:48:06--  http://sxi.io/hardware/linux-find-and-recover-wasted-disk-space/
Resolving sxi.io... 75.126.153.206
Connecting to sxi.io|75.126.153.206|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html.54'
 
    [ <=>                                  ] 12,657      --.-K/s   in 0.02s   
 
2012-10-28 17:48:07 (583 KB/s) - `index.html.54' saved [12657]
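
Depending on the wget version and options, the body may be written to disk exactly as received, so the saved file can hold gzip-compressed bytes rather than readable HTML. A quick way to check is to look for the gzip magic bytes (1f 8b) at the start of the file. This is a sketch only; is_gzip is a hypothetical helper name, and the temporary file stands in for a downloaded page:

```shell
#!/bin/sh
# is_gzip (hypothetical helper): succeeds if the file starts with
# the two gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(od -An -tx1 -N 2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Offline demonstration with a locally created gzip file.
tmp=$(mktemp)
printf 'hello\n' | gzip -c > "$tmp"
if is_gzip "$tmp"; then
    echo "compressed: view it with gzip -dc < FILE"
else
    echo "not compressed"
fi
rm -f "$tmp"
```

For a page saved by the command above, the same check would be is_gzip index.html, followed by gzip -dc < index.html to read the contents if it succeeds. Some newer wget releases can decompress the body themselves; check the man page of your version.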

Download the sample page without gzip:
$ wget http://sxi.io/hardware/linux-find-and-recover-wasted-disk-space/
Sample outputs:

--2012-10-28 17:48:37--  http://sxi.io/hardware/linux-find-and-recover-wasted-disk-space/
Resolving sxi.io... 75.126.153.206
Connecting to sxi.io|75.126.153.206|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html.55'
 
    [   <=>                                ] 45,729      73.7K/s   in 0.6s    
 
2012-10-28 17:48:38 (73.7 KB/s) - `index.html.55' saved [45729]

From the above two outputs:

  1. The gzip-enabled page (12,657 bytes) was downloaded in 0.02 seconds using the wget command.
  2. Without gzip, the page (45,729 bytes) was downloaded in 0.6 seconds using the wget command.
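
The saving in transferred bytes can be worked out from the two sizes that wget reported. A small sketch using shell integer arithmetic (the variable names are illustrative; the byte counts are the ones from the sample outputs):

```shell
#!/bin/sh
plain=45729     # bytes saved without the Accept-Encoding header
gzipped=12657   # bytes saved with 'Accept-Encoding: gzip'

# Integer percentage of bytes that gzip avoided sending.
saved=$(( (plain - gzipped) * 100 / plain ))
echo "gzip transferred ${saved}% fewer bytes"
```

For these two runs the result is 72%. Timings from single downloads vary with network conditions, so the byte counts are the more reliable comparison.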

Use this option to test and troubleshoot:

  1. HTTP server problems.
  2. CDN edge node speed.
  3. Your origin server speed.
  4. Web server gzip compatibility.
  5. Load balancers / reverse proxy servers.

As of wget v1.10, this option can be used to override headers otherwise generated automatically. In this example, wget connects to sxi.io but sends 'beta.sxi.io' in the Host header (i.e. it asks the server at sxi.io for the page belonging to beta.sxi.io):

wget --header="Host: beta.sxi.io" http://sxi.io/

Finally, to save the headers sent by the HTTP server to the file, run:
$ wget --save-headers http://sxi.io
$ vi index.html

Sample outputs:

Fig.01: wget saving the http headers
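
Because the headers come first and an empty line separates them from the contents, the two parts of such a file can be pulled apart with awk. The sketch below is illustrative only: print_headers and print_body are hypothetical helper names, the sample file is built locally instead of being downloaded, and it assumes a text body and header lines that may end in carriage returns:

```shell
#!/bin/sh
# print_headers (hypothetical helper): print lines up to the first empty line.
print_headers() {
    awk '{ sub(/\r$/, "") } $0 == "" { exit } { print }' "$1"
}

# print_body (hypothetical helper): print everything after the first empty line.
print_body() {
    awk 'found { print; next }
         { line = $0; sub(/\r$/, "", line) }
         line == "" { found = 1 }' "$1"
}

# Offline demonstration with a hand-made file in the same layout.
tmp=$(mktemp)
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>\n' > "$tmp"
echo "--- headers ---"
print_headers "$tmp"
echo "--- body ---"
print_body "$tmp"
rm -f "$tmp"
```

With a real download, print_headers index.html would show only the response headers. These helpers are meant for text bodies; a compressed body should be handled with binary-safe tools.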

Posted by: SXI ADMIN

The author is the creator of nixCraft and a seasoned sysadmin, DevOps engineer, and a trainer for the Linux operating system/Unix shell scripting.
