Posts in the “linux-unix” category

Linux: How to find multiple filenames with the ‘find’ command

Unix/Linux find command FAQ: How can I write one Unix find command to find multiple filenames (or filename patterns)? For example, I want to find all the files beneath the current directory that end with the file extensions ".class" and ".sh".

You can use the Linux find command to find multiple filename patterns at one time, but for most of us the syntax isn't very common. In short, the solution is to use the find command's "or" option, with a little shell escape magic. Let's take a look at several examples.
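
As a minimal sketch of the "or" approach described above, the following commands build a scratch directory (the file names are made up for this demonstration) and then find every file ending in .class or .sh. The -o operator is find's "or", and the parentheses are backslash-escaped so the shell passes them through to find untouched:

```shell
# Scratch directory with hypothetical file names, for demonstration only
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src"
touch "$demo_dir/Foo.class" "$demo_dir/src/build.sh" "$demo_dir/notes.txt"

# -o means "or"; the escaped parentheses group the two -name tests
# so that -type f applies to both of them
found=$(cd "$demo_dir" && find . -type f \( -name "*.class" -o -name "*.sh" \) | sort)
echo "$found"
```

Without the parentheses, -type f would bind only to the first -name test, so the grouping matters as soon as you combine "or" with other conditions.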

How to use the Linux sed command to edit many files in place (and make a backup copy)


Warning: The following Unix sed commands are very powerful, so you can modify a lot of files successfully — or really screw things up — all in one command. :)

Yesterday I ran into a situation where I had to edit over 250,000 files, and with that I also thought, “I need to remember how to use the Unix/Linux sed command.” I knew what editing commands I wanted to run — a series of simple find/replace commands — but my bigger problem was how to edit that many files in place.

A quick look at the sed man page showed that I needed to use the -i argument to edit the files in place:
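
A minimal sketch of that kind of in-place edit, assuming a sed that accepts a backup suffix attached directly to -i (GNU sed and BSD sed both do). The scratch directory, the file names, and the foo/bar replacement are hypothetical, not the commands from the original job:

```shell
# Scratch directory with hypothetical files, for demonstration only
work_dir=$(mktemp -d)
printf 'foo one\nfoo two\n' > "$work_dir/a.txt"
printf 'nothing to change\n' > "$work_dir/b.txt"

# -i.bak edits each file in place and keeps the original as FILE.bak
find "$work_dir" -type f -name "*.txt" -exec sed -i.bak 's/foo/bar/g' {} \;

cat "$work_dir/a.txt"      # the edited file
cat "$work_dir/a.txt.bak"  # the untouched backup
```

Keeping the .bak copies until you have checked the results is a cheap safety net when a single command touches this many files.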

Linux: Recursive file searching with `grep -r` (like grep + find)


Unix/Linux grep FAQ: How can I perform a recursive search with the grep command in Linux?

Two solutions are shown next, followed by some additional details which may be useful.

Solution 1: Combine 'find' and 'grep'

For years I always used variations of the following Linux find and grep commands to recursively search subdirectories for files that match a grep pattern:

find . -type f -exec grep -l 'alvin' {} \;

This command can be read as, “Search all files in all subdirectories of the current directory for the string ‘alvin’, and print the filenames that contain this pattern.” It’s an extremely powerful approach for recursively searching files in all subdirectories that match the pattern I specify.

Solution 2: 'grep -r'

However, I was just reminded that a much easier way to perform the same recursive search is with the -r flag of the grep command:

grep -rl alvin .

As you can see, this is a much shorter command, and it performs the same recursive search as the longer command, specifically:

  • The -r option says “do a recursive search”
  • The -l option (lowercase letter L) says “list only filenames”
  • As you’ll see below, you can also add -i for case-insensitive searches

If you haven’t used commands like these before, here’s a demonstration: in a PHP project directory I’m working in right now, this command returns a list of files like this:


More: Search multiple subdirectories

Your recursive grep searches don’t have to be limited to just the current directory. This next example shows how to recursively search two unrelated directories for the case-insensitive string "alvin":

grep -ril alvin /home/cato /htdocs/zenf

In this example, the search is made case-insensitive by adding the -i argument to the grep command.

Using egrep recursively

You can also perform recursive searches with the egrep command, which lets you search for multiple patterns at one time. Since I tend to mark comments in my code with my initials ("aja") or my name ("alvin"), this recursive egrep command shows how to search for those two patterns, again in a case-insensitive manner:

egrep -ril 'aja|alvin' .

Note that in this case, quotes are required around my search pattern.

Summary: `grep -r` notes

A few notes about the grep -r command:

  • This particular use of the grep command doesn’t make much sense unless you use it with the -l (lowercase "L") argument as well. This flag tells grep to print the matching filenames.
  • Don’t forget to list one or more directories at the end of your grep command. If you forget to add any directories, older versions of grep will attempt to read from standard input (as usual), while recent versions of GNU grep search the current directory instead.
  • As shown, you can use other normal grep flags as well, including -i to ignore case, -v to reverse the meaning of the search, etc.

Here’s the section of the Linux grep man page that discusses the -r flag:

-R, -r, --recursive
Read all files under each directory, recursively; this is
equivalent to the -d recurse option.

--include=PATTERN
Recurse in directories only searching file matching PATTERN.

--exclude=PATTERN
Recurse in directories skip file matching PATTERN.

As you’ve seen, the grep -r command makes it easy to recursively search directories for all files that match the search pattern you specify, and the syntax is much shorter than the equivalent find/grep command.
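
To make the notes above concrete, here is a minimal sketch that runs the recursive searches against a scratch project. It assumes a grep that supports -r and --include, such as GNU grep; the directory layout and file contents are hypothetical. It also uses grep -E, which is the equivalent of egrep:

```shell
# Scratch project with hypothetical files, for demonstration only
proj_dir=$(mktemp -d)
mkdir -p "$proj_dir/lib"
echo '// written by Alvin' > "$proj_dir/index.php"
echo '# aja: fix this later' > "$proj_dir/lib/util.php"
echo 'alvin was here' > "$proj_dir/README.txt"

# Recursive, case-insensitive, filenames only
all_hits=$(cd "$proj_dir" && grep -ril 'alvin' . | sort)

# Same idea, limited to *.php files and searching for two patterns
php_hits=$(cd "$proj_dir" && grep -E -ril --include='*.php' 'aja|alvin' . | sort)

echo "$all_hits"
echo "$php_hits"
```

The first search lists README.txt and index.php; the second skips README.txt because of --include and picks up lib/util.php through the "aja" alternative.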

For more information on the find command, see my Linux find command examples, and for more information on the grep command, see my Linux grep command examples.

Linux: How to get the basename from the full filename

As a quick note today, if you’re ever writing a Unix/Linux shell script and need to get the filename from a complete (canonical) directory/file path, you can use the Linux basename command like this:

$ basename /foo/bar/baz/foo.txt
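
For reference, a small sketch of what basename returns for that path inside a script, along with two closely related forms: basename with a suffix argument, and dirname. The variable names are just for illustration:

```shell
full_path=/foo/bar/baz/foo.txt

file_name=$(basename "$full_path")        # last path component
stem=$(basename "$full_path" .txt)        # same, with the .txt suffix removed
parent_dir=$(dirname "$full_path")        # everything before the last component

echo "$file_name"
echo "$stem"
echo "$parent_dir"
```

This prints foo.txt, foo, and /foo/bar/baz, in that order.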

How to make an offline mirror copy of a website with wget

As a short note today, if you want to make an offline copy/mirror of a website using the GNU/Linux wget command, a command like this will do the trick for you:

wget --mirror            \
     --convert-links     \
     --html-extension    \
     --wait=2            \
     -o log              \

Update: One thing I learned about this command is that it doesn’t make a copy of “rollover” images, i.e., images that are changed by JavaScript when the user rolls over them. I haven’t investigated how to fix this yet, but the easiest thing to do is to copy the /images directory from the server, assuming that you’re making a static copy of your own website, as I am doing. Another thing you can do is manually download the rollover images.

Why I did this

In my case I used this command because I don’t want to use Drupal to serve that website any more, so I used wget to convert the original Drupal website into a series of static HTML files that can be served by Nginx or Apache. (There’s no need to use Drupal here, as I no longer update that website, and I don’t accept comments there.) I just did the same thing with my website, which is basically an online version of a children’s book that I haven’t modified in many years.

Why use the --html-extension option?

Note that you won’t always need to use the --html-extension option with wget, but because the original version of my How I Sold My Business website did not use any extensions at the end of the URLs, it was necessary in this case.

What I mean by that is that the original version of my website had URLs like this:

Notice that there is no .html extension at the end of that URL. Therefore, what happens if you use wget without the --html-extension option is that you end up with a file on your local computer with this name:


Even if you use MAMP or WAMP to serve this file from your local filesystem, they aren’t going to know that this is an HTML file, so essentially what you end up with is a worthless file.

Conversely, when you do use the --html-extension option, you end up with this file on your local filesystem:


On a Mac, that file is easily opened in a browser, and you don’t even need MAMP. wget is also smart enough to change all the links within the offline version of the website to refer to the new filenames, so everything works.

Explanation of the wget options used

Here’s a short explanation of the options I used in that wget command:

--mirror
    Turn on options suitable for mirroring. This option turns on
    recursion and time-stamping, sets infinite recursion depth,
    and keeps FTP directory listings. It is currently equivalent to
    ‘-r -N -l inf --no-remove-listing’.

--convert-links
    After the download is complete, convert the links in the document
    to make them suitable for local viewing.


-o foo
    write "log" output to a file named "foo"

--wait=seconds
    Wait the specified number of seconds between the retrievals.
    Use of this option is recommended, as it lightens the server load
    by making the requests less frequent.

Depending on the web server settings of the website you’re copying, you may also need to use the -U option, which works something like this:

-U Mozilla
   masquerade as a Mozilla browser

That option lets you set the wget user agent. (I suspect that the string you use may need to be a little more complicated than that, but I didn’t need it, and didn’t investigate it further.)
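
Putting the options together, here is a minimal sketch of the full command wrapped in a shell function. The function name, the user agent string, and the example.com URL are all placeholders of mine, not values from the original command. Note also that newer wget releases document --html-extension under the name --adjust-extension:

```shell
# Hypothetical wrapper; the function name and user agent string are
# illustrative and not taken from the original command
mirror_site() {
  site_url=$1
  wget --mirror \
       --convert-links \
       --html-extension \
       --wait=2 \
       -o log \
       -U "Mozilla/5.0 (X11; Linux x86_64)" \
       "$site_url"
}

# Usage (placeholder URL):
#   mirror_site https://example.com/
```

Defining the function does not download anything; the mirror only starts when you call it with a URL.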

I got most of these settings from the GNU wget manual.


An alternative approach is to use httrack, like this:

httrack --footer "" http://mywebsite:8888/

I’m currently experimenting to see which works better.


I’ll write more about wget and its options in a future blog post, but for now, if you want to make an offline mirror copy of a website, the wget command I showed should work.

Unix/Linux shell script reference page (shell cheat sheet)

Linux shell script test syntax

All of the shell script tests that follow should be performed between the bracket characters [ and ], like this:

if [ true ]
then
  echo "do something here"
fi

Very important: Make sure you leave spaces around the bracket characters.

I'll show more detailed tests as we go along.

Linux shell file-related tests

To perform tests on files use the following comparison operators:
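
As a minimal sketch, here are a few of the commonly used file tests in action: -d (is a directory), -f (is a regular file), -s (exists and is not empty), and -e (exists at all). The describe function and the scratch file names are hypothetical helpers for this demonstration:

```shell
# Scratch file and directory, for demonstration only
tmp_dir=$(mktemp -d)
tmp_file="$tmp_dir/data.txt"
echo "hello" > "$tmp_file"

# Hypothetical helper that reports what kind of thing a path is
describe() {
  if [ -d "$1" ]; then
    echo "directory"
  elif [ -f "$1" ] && [ -s "$1" ]; then
    echo "non-empty regular file"
  elif [ -f "$1" ]; then
    echo "empty regular file"
  elif [ ! -e "$1" ]; then
    echo "does not exist"
  else
    echo "something else"
  fi
}

describe "$tmp_dir"
describe "$tmp_file"
describe "$tmp_dir/missing"
```

Note the spaces around every [ and ], and the quotes around "$1" so that paths containing spaces are still treated as a single argument.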


Linux sed command: Use sed and wc to count leading blanks in a file

Way back in the day (pre-2007) I used JSPs and servlets to generate a lot of the pages around here, and today I looked at how many blank spaces and blank lines are generated by the JSPs. I don't think I can do much about the blank lines (actually, I just haven't looked into it yet), but about those blank spaces ...

Out of curiosity I decided to look at this -- how many blank spaces are there at the beginning of lines that I could delete just through formatting? Would deleting those characters help reduce my bandwidth costs (at the expense of slightly uglier JSPs)?

I thought about writing a Ruby script to get it right, but I've been working with sed so much lately I thought I'd just give it a try. So, without any further introduction, I think this sed script is very close to giving me what I want -- a count of the number of blank spaces at the beginning of all lines in a sample HTML file:

# 1. delete blank lines

# 2. delete lines beginning with a tab
/^  /d

# 3. delete lines beginning w/ any alpha characters, <, or %

# 4. find lines beginning w/ one or more blanks, then print only
#    the blanks
/^  */ {
        s/^\(  *\).*/\1/
}

# 5. delete all lines that just have ^M (need to do ^V ^M trick here)

As you can see from the five comments, these commands will (1) delete all completely blank lines from the output stream; (2) delete all lines beginning with a [Tab] character; (3) delete all lines beginning with alpha characters, the '<' character, or '%'; (4) find all lines beginning with one or more blanks and print only the blanks from each of those lines; and (5) remove the '^M' character that may be at the end of lines.

Naming this file "leadingblanks.sed" I run it like this, piping the output into the wc command:

sed -f leadingblanks.sed < mySampleFile.html | wc

which leads to output like this:

358       0    2967

This output means that wc found 358 lines in the stream from sed, and in that stream there were 2,967 characters, in my case, all blanks. (I may be wrong here, there may actually be [2,967 minus 358] blank spaces, but I really don't care, this is close enough for today.)
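
On that parenthetical: wc counts the newline that ends each line as a character, so the number of blanks is the character count minus the line count (for the figures above, 2,967 - 358 = 2,609). The following is a minimal sketch that checks this on a tiny sample. It is a condensed one-line variant of the idea, not the script above, and the sample HTML is made up:

```shell
# Hypothetical sample; the original post used a real HTML file
sample=$(mktemp)
printf '<html>\n  <body>\n    <p>hi</p>\n\n  </body>\n</html>\n' > "$sample"

# Print only the leading blanks of lines that start with one or more blanks
blanks_only=$(mktemp)
sed -n 's/^\(  *\).*/\1/p' "$sample" > "$blanks_only"

line_count=$(wc -l < "$blanks_only")
char_count=$(wc -c < "$blanks_only")

# wc -c also counts the newline that ends each line, so subtract one per line
blank_count=$((char_count - line_count))
echo "lines=$((line_count)) chars=$((char_count)) blanks=$blank_count"
```

The sample has three indented lines with 2, 4, and 2 leading blanks, so the stream holds 11 characters across 3 lines, of which 8 are blanks.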

So, as a quick summary, the JSP file that I looked at is printing as many as 2,967 extra blank spaces at the beginning of the lines it outputs -- meaning my HTML files are that much larger than they have to be -- and I can easily delete most of these characters if I want to use this as a means of reducing my bandwidth bill (and arguably making your page load that much faster).

Assuming an average page has 10,000 characters (which is close), I can reduce the bandwidth of this particular page by as much as 29.7%.

vi quit and exit tutorial

vim quit/save/exit FAQ: How do I quit/exit vim?

Answer: This depends on what you mean by the word exit. Here's a short list of the different ways I normally quit or exit a vi/vim editor session.

vi exit - no changes made to your file (vim quit command)

If you haven't made any changes to your file you can just quit your vi (or vim) editing session like this:


vim tip: How to configure vim autoindent

vim autoindent FAQ: How do I configure vim to automatically indent newlines? That is, if my current line is indented three spaces, and I hit [Enter], I want the next line to automatically be indented three spaces as well.

To configure vim autoindent, just use this vim command:

vim editor: How do I enable and disable vim syntax highlighting?

vim syntax FAQ: How do I turn on (enable) or turn off (disable) vim syntax highlighting?

Turning on syntax highlighting in your vim editor is usually pretty simple; you just need to issue a syntax on command, either in your current editor session, or in your vimrc configuration file. Here are a couple of quick examples.

Turn vim syntax highlighting on

To enable syntax highlighting in your current vim editor session just issue this command:


The vim “delete line” command

vim delete FAQ: How do I delete a line in vim? (Also, how do I delete multiple lines in vim?)

To delete the current line in your vim editor, use this command:


You can use this same command to delete multiple lines in vim. Just precede the command by the number of lines you want to delete. For instance, to delete five lines in vim, use this command:

vi/vim video tutorials

Woo-hoo, I've always wanted to create a vim video tutorial series, and now that I have the software to do it, I'm finally embarking on this adventure.

My vi/vim editor video tutorial - Lesson 1, Introduction

Dozens of Unix/Linux 'grep' command examples


Linux grep FAQ: Can you share some Linux/Unix grep command examples?

Sure. The name grep comes from the ed editor command g/re/p ("globally search for a regular expression and print"), and you can think of the grep command as a “search” command for Unix and Linux systems: It’s used to search for text strings and regular expressions within one or more files.

I think it’s easiest to learn how to use the grep command by showing examples, so let’s dive right in.

Installing Wiki.js on Ubuntu 20.04, with Postgresql

This probably won’t make sense to anyone else, but these are my notes related to installing Wiki.js and Postgresql on an Ubuntu 20.04 system. Everything here is related to setting up a new Ubuntu system and then running Wiki.js:

A Linux crontab mail command example

Linux crontab mail FAQ: Can you share an example of a Linux crontab entry you use to send email on a regular basis?

Solution: Here’s the source code for a really simple Linux mail script that I used to send an email message to one of my co-workers every month. This script used the Unix or Linux mail command to email a file to her that showed a list of all the websites on our server that she needed to bill our customers for.
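
As a rough sketch of what such a script and its crontab entry might look like (this is not the original script; the report path, recipient address, subject line, and script location are all placeholders), assuming a mail command that takes a subject with -s and reads the message body from standard input:

```shell
# Hypothetical names: the report path, recipient, and subject are
# placeholders, not the values from the original script
REPORT_FILE=/tmp/websites-to-bill.txt
RECIPIENT=coworker@example.com

send_billing_report() {
  # mail reads the message body from standard input
  mail -s "Monthly website billing list" "$RECIPIENT" < "$REPORT_FILE"
}

# Example crontab entry (minute hour day-of-month month day-of-week command)
# that would run a script containing the call at 8:00 on the 1st of each month:
#   0 8 1 * * /usr/local/bin/send-billing-report.sh
```

The function is only defined here, so nothing is sent until a script actually calls send_billing_report.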

An example Linux crontab file

Linux crontab format FAQ: Do you have an example of a Unix/Linux crontab file format?

I have a hard time remembering the crontab file format, so I thought I’d share an example crontab file here today. The following file is the root crontab file from a CentOS Linux server I use in a test environment.