Linux backups: Using find, xargs, and tar to create a huge archive

I made a mistake in a previous blog entry that led me to use the "pax" command to create a large backup/archive. There's nothing wrong with the pax command, other than the fact that I can't find a version of it for Cygwin, and I really needed to create a huge archive. (pax is available on our Linux and Unix systems.)

In my earlier blog post I stated that something like this did not work for me when trying to create a large backup using find, xargs, and tar:

find . -type f -name "*.java" | xargs tar cvf myfile.tar

What was happening was that xargs splits a long list of files into several batches to stay under the system's command-line length limit, and it runs tar once per batch. Each time xargs passed a new batch of files to tar, tar treated it as a new command and re-created the file named myfile.tar from scratch. So, instead of the huge myfile.tar that I expected, I ended up with only the files from the last batch in the archive.

This problem is easily remedied if you use the 'r' switch/command with tar instead of the 'c' switch/command. The 'r' switch tells tar to append to the archive, while 'c' says "create".
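To see the difference between 'c' and 'r' on a small scale, here is a minimal sketch that forces xargs to run tar several times. The temporary directory, the five file names, and the "-n 2" batch size are all made up for the demo; on a real tree xargs picks the batch size itself.

```shell
# Demo in a throwaway directory (hypothetical file names).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch A.java B.java C.java D.java E.java

# -n 2 makes xargs call tar with at most 2 files at a time,
# imitating what happens on its own with a very long file list.

# 'c' re-creates the archive on every call, so only the last batch survives.
find . -type f -name "*.java" | xargs -n 2 tar cf create.tar
created_count=$(tar tf create.tar | wc -l)

# 'r' appends on every call, so all five files end up in the archive.
find . -type f -name "*.java" | xargs -n 2 tar rf append.tar
appended_count=$(tar tf append.tar | wc -l)

echo "create mode kept $created_count file(s); append mode kept $appended_count file(s)"
```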

All that being said, this command worked just fine for me to create a very large tar archive:

find . -type f -name "*.java" | xargs tar rvf myfile.tar

This combination of find, tar, and xargs worked like a champ for me. I guess this is one of those things where Unix is "intuitively obvious once you know how to do it", because in retrospect this seems like the obvious solution.
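Two caveats, sketched below under some assumptions. First, because 'r' appends, running the command a second time adds every file to the existing myfile.tar again, so it may be worth removing the old archive first. Second, the plain pipe splits file names on whitespace, so a name containing a space gets broken apart. The -print0 and -0 options used here are supported by GNU and BSD find/xargs, but may be missing on older Unix systems. The directory and file names are invented for the example.

```shell
# Sketch: same pipeline, hardened for odd file names and repeat runs.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
mkdir -p "src/my package"
touch "src/my package/Hello World.java" src/Plain.java

# 'r' appends, so remove any archive left over from an earlier run.
rm -f myfile.tar

# -print0 / -0 pass the names NUL-separated, so a space in a name
# is not treated as a separator between two arguments.
find . -type f -name "*.java" -print0 | xargs -0 tar rvf myfile.tar

# List the archive contents to check the result.
tar tf myfile.tar
```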



Ah I was just having this exact problem using xargs tar -cvf, and the "r" flag saved me!


Totally found this one the hard way too. If you want two file types, group the patterns with -o:
>find . -type f \( -name "*." -o -name "*." \) | xargs tar rvf myfile.tar
or if you want every file except one type:
>find . -type f ! -name "*." | xargs tar rvf myfile.tar
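For anyone trying the two patterns in the comment above, here is a small sketch with concrete extensions filled in. The .java, .xml, and .txt extensions and all file names are only examples and are not taken from the comment. Note that two -name tests need -o between them: find combines tests with AND by default, and a file name cannot end in two different extensions at once.

```shell
# Sketch with hypothetical files in a throwaway directory.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
touch Main.java build.xml notes.txt

# Two file types: group the alternatives with \( ... -o ... \).
find . -type f \( -name "*.java" -o -name "*.xml" \) -print0 | xargs -0 tar rvf two_types.tar

# Every file except one type: negate the test with '!'.
# The .tar files are excluded as well, so the archives are not fed back into tar.
find . -type f ! -name "*.txt" ! -name "*.tar" -print0 | xargs -0 tar rvf no_txt.tar
```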