Linux gzip: How to work with compressed files

If you work much with Unix and Linux systems, you'll eventually run into the terrific file compression utilities, gzip and gunzip. As their names imply, the first command creates compressed files (by gzip'ing them), and the second command unzips those files.

In this tutorial I take a quick look at the gzip and gunzip file compression utilities, along with their companion tools you may not have known about: zcat, zgrep, and zmore.

The Unix/Linux gzip command

You can compress a file with the Unix/Linux gzip command. For instance, if I run an ls -l command on an uncompressed Apache access log file named access.log, I get this output:

-rw-r--r--   1 al  al  22733255 Aug 12  2008 access.log

Note that the size of this file is 22,733,255 bytes. Now, if we compress the file using gzip, like this:

gzip access.log

we end up creating a new, compressed file named access.log.gz. Here's what that file looks like:

-rw-r--r--   1 al  al  2009249 Aug 12  2008 access.log.gz

Notice that the file has been compressed from 22,733,255 bytes down to just 2,009,249 bytes. That's a huge savings in file size, roughly 11 to 1(!).
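
If you want to check the ratio yourself, here's a quick sketch. The first part just divides the two file sizes shown above. The second part uses gzip's -l option, which lists the compressed size, uncompressed size, and ratio for a .gz file. The scratch directory and the sample log lines are made up for illustration, so the numbers it reports won't match my real log file:

```shell
# Ratio of the two sizes from the ls output above.
ratio=$(LC_ALL=C awk 'BEGIN { printf "%.1f", 22733255 / 2009249 }')
echo "compression ratio: roughly $ratio to 1"   # prints: compression ratio: roughly 11.3 to 1

# Build a small, repetitive sample log in a scratch directory (hypothetical data).
workdir=$(mktemp -d)
i=0
while [ "$i" -lt 2000 ]; do
  echo "127.0.0.1 - - \"GET /java/index.html HTTP/1.1\" 200 $i"
  i=$((i + 1))
done > "$workdir/access.log"

# Compress it; this replaces access.log with access.log.gz.
gzip "$workdir/access.log"

# List the compressed size, uncompressed size, and ratio.
listing=$(gzip -l "$workdir/access.log.gz")
echo "$listing"
```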

There's one important thing to note about gzip: The old file, access.log, has been replaced by this new compressed file, access.log.gz. This might freak you out a little the first time you use this command, but very quickly you get used to it. (If for some reason you don't trust gzip when you first try it, feel free to make a backup copy of your original file.)
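
If you'd rather keep the original file around without making a backup copy first, one approach is to have gzip write to standard output with its -c option and redirect that into a new file. Here's a minimal sketch of that idea. The scratch directory and the file contents are placeholders I made up for the example:

```shell
# Work in a scratch directory so nothing real is touched (hypothetical sample data).
workdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$workdir/access.log"

# -c writes the compressed data to standard output and leaves access.log alone.
gzip -c "$workdir/access.log" > "$workdir/access.log.gz"

# Both the original and the compressed copy should be listed.
ls -l "$workdir"
```

Recent versions of GNU gzip also accept a -k (keep) option that does the same thing in one step, but I'm assuming here that your system might have an older gzip that doesn't.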

The Linux gunzip command

The gunzip ("g unzip") command does just the opposite of gzip, converting a gzip'd file back to its original, uncompressed form (and, like gzip, replacing the file it was given). In the following example I'll uncompress the gzip'd file we just created:

gunzip access.log.gz

Running that command restores our original file, as you can see in this output:

-rw-r--r--   1 al  al  22733255 Aug 12  2008 access.log
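
If you want some reassurance that a file really survives the round trip, here's a small sketch that compresses a file, tests the archive with gzip's -t option, uncompresses it, and compares the result with a copy made beforehand. The file names and contents are just made up for the example:

```shell
# Scratch directory with a sample file and a reference copy (hypothetical data).
workdir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$workdir/access.log"
cp "$workdir/access.log" "$workdir/original.copy"

gzip "$workdir/access.log"

# -t tests the integrity of the compressed file without extracting it.
gzip -t "$workdir/access.log.gz" && echo "archive looks okay"

# Restore the original; this removes access.log.gz.
gunzip "$workdir/access.log.gz"

# cmp -s prints nothing and exits with 0 when the two files are identical.
if cmp -s "$workdir/access.log" "$workdir/original.copy"; then
  roundtrip=identical
else
  roundtrip=different
fi
echo "round trip: $roundtrip"
```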

The Linux file compression utilities (zcat, zmore, zgrep)

I used to think I had to uncompress a gzip'd file to work on it with commands like cat, grep, and more, but at some point I learned there were equivalent gzip versions of these same commands, appropriately named zcat, zgrep, and zmore. So, anything you would normally do on a text file with the first three commands you can do on a gzip'd file with the last three commands.

For instance, instead of using cat to display the entire contents of a file, you use zcat to display the contents of the gzip'd file, like this:

zcat access.log.gz

(Of course that output will go on for a long time, since the file holds roughly 22MB of text once it's uncompressed.)

You can also scroll through the file one page at a time with zmore:

zmore access.log.gz

And finally, you can grep through the compressed file with zgrep:

zgrep '/java/index.html' access.log.gz
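
To make that a little more concrete, here's a small sketch you can run without a real log file. It builds a three-line sample log (the log lines are invented for the example), compresses it, and then uses zcat and zgrep on the compressed file. I'm assuming the GNU versions of these tools here, as found on most Linux systems:

```shell
# Build and compress a tiny sample log in a scratch directory (hypothetical data).
workdir=$(mktemp -d)
{
  echo '10.0.0.1 - - "GET /java/index.html HTTP/1.1" 200'
  echo '10.0.0.2 - - "GET /perl/index.html HTTP/1.1" 200'
  echo '10.0.0.3 - - "GET /java/index.html HTTP/1.1" 304'
} > "$workdir/access.log"
gzip "$workdir/access.log"

# Count the lines in the compressed file without uncompressing it on disk.
total=$(zcat "$workdir/access.log.gz" | wc -l)

# zgrep hands its options to grep, so -c counts the matching lines.
hits=$(zgrep -c '/java/index.html' "$workdir/access.log.gz")

echo "total lines: $total, matching lines: $hits"
```

Because zcat writes to standard output, you can pipe it into any other command the same way you would with cat.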

There are also two other commands, zcmp and zdiff, that let you compare compressed files, but I personally haven't had the need for them. However, as you can imagine, they work like this:

zcmp file1.gz file2.gz

or

zdiff file1.gz file2.gz
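
Here's a short sketch of both commands in action. The three small files are made up for the example: file1 and file3 have the same contents, while file2 differs on its second line. Both commands compare the uncompressed contents, so I'd expect zcmp to report that file1.gz and file3.gz match even though they are separate .gz files:

```shell
# Create three small files in a scratch directory and compress them (hypothetical data).
workdir=$(mktemp -d)
printf 'alpha\nbeta\n'  > "$workdir/file1"
printf 'alpha\ngamma\n' > "$workdir/file2"
cp "$workdir/file1" "$workdir/file3"
gzip "$workdir/file1" "$workdir/file2" "$workdir/file3"

# zcmp exits with 0 when the uncompressed contents are identical.
if zcmp "$workdir/file1.gz" "$workdir/file3.gz"; then
  same_status=identical
else
  same_status=different
fi
echo "file1 vs file3: $same_status"

# zdiff prints diff-style output for the uncompressed contents.
# The "|| true" is there because a difference gives a non-zero exit status.
diff_output=$(zdiff "$workdir/file1.gz" "$workdir/file2.gz" || true)
echo "$diff_output"
```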

Linux gzip / compress summary

As a quick summary, just remember that you don't have to uncompress files to work on them. You can use the following z-utilities to work on the compressed files instead:

  • zcat
  • zmore
  • zgrep
  • zcmp
  • zdiff