Remove non-printable ASCII characters from a file with this simple Unix command

For a variety of reasons you can end up with text files on your Unix filesystem that have binary characters in them. In fact, I showed you how to do this to yourself in my blog post about the Unix script command. (There's nothing wrong with this approach; it's just a by-product of using the script command.)

Remove the garbage characters with the Unix tr command

To get the binary characters out of your files, there are several approaches you can take. Probably the easiest solution involves using the Unix tr command. Here's all you have to do to remove non-printable binary characters (garbage) from a Unix text file:

tr -cd '\11\12\15\40-\176' < file-with-binary-chars > clean-file

This command uses the -c (complement) and -d (delete) options of the tr command to remove every character from the input stream except the ones whose ASCII octal values are listed between the single quotes. Specifically, it allows the following characters to pass through this Unix filter:

octal 11: tab
octal 12: linefeed
octal 15: carriage return
octal 40 through octal 176: all the "good" keyboard characters 

All the other binary characters -- the "garbage" characters in your file -- are stripped out during this translation process.
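
As a minimal sketch of the filter described above, the snippet below wraps the same tr command in a shell function and runs it on a small sample. The function name strip_nonprintable and the sample string are made up for illustration. The LC_ALL=C prefix is also an addition here, on the assumption that you want tr to treat the input as raw bytes regardless of your locale; it is not part of the original command.

```shell
# Hypothetical helper (the name is illustrative, not from the article).
strip_nonprintable() {
  # Keep only tab (\11), linefeed (\12), carriage return (\15),
  # and printable ASCII from space (\40) through tilde (\176).
  # LC_ALL=C is an assumption: it asks tr to work byte by byte.
  LC_ALL=C tr -cd '\11\12\15\40-\176'
}

# Sample input containing an ESC byte (\033) and a BEL byte (\007).
printf 'ab\033[1mcd\007\n' | strip_nonprintable
# prints: ab[1mcd
```

Note that only the ESC and BEL bytes are deleted. The printable remainder of the escape sequence, "[1m", passes through, so a file captured with the script command may still need some cleanup by hand.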

For more information on ASCII characters

For more information on ASCII characters check out the ASCII character tables at either of these sites:

 

Error! Missing the backslash before 40

There's a backslash missing before the 40. I had to figure out why when it didn't work right. The command is:

tr -cd '\11\12\15\40-\176' < file-with-binary-chars > clean-file

Sorry about that, and thanks

Sorry about that, and thanks for letting me know. I've updated the command above.

Typo in explanation

Hello

This oneliner saved my life!

Found a typo:
octal 140 through octal 176: all the "good" keyboard characters
should be
octal 40 through octal 176: all the "good" keyboard characters

Thanks again

Thanks for catching this

Thanks for catching this typo. I just made the correction to the article.
