Tracing Solaris process ancestors with Perl


When you work with Unix systems, you often need information about the processes running on them.  How many processes are running?  Are there any runaway processes?  Any zombies?  Who is using most of the CPU?  Unless you purchase system management tools to help you analyze your system, the answers to these questions aren't always clear.

In a previous article we introduced the getPsefData() subroutine, which can be used to obtain a database of information about processes running on a Sun Solaris (i.e., Unix) computer.  In this article we're going to show you how to use that subroutine in a real-world application.

Specifically, we'll show you how to trace the ancestors of a process on a Unix system.  If you supply a process ID (PID), the program tells you which process spawned that PID (the parent process), which process spawned that one (the grandparent process), and so on.  This can be very useful when you're trying to debug a problem or make sure you kill the correct process on your system.

How the ancestors program works

The ancestors program is very simple.  Suppose you're working on your system and something doesn't seem quite right.  After a little research, you become concerned with process number 743.  You're thinking that maybe that process should be terminated, but before you kill it you want more information about it.  This is a great time to use the ancestors program.

Assuming your path is set up correctly, just type:

You'll be prompted to enter the PID you're interested in, like this:

   Specify the PID whose lineage should be traced:

At this point, you just enter the PID - in this case 743 - and you'll get output like this:

You can see from this output that PID 743 (intruder-alert, the 'child') was spawned by PID 626 (the 'parent'), which in turn was spawned by 515 (the 'grandparent'), which was started by PID 249, which was originally started by PID 0.  With this knowledge at hand you can make a better judgement as to whether PID 743 should be killed or not.

The ancestors program

We won't spend much time here discussing the getPsefData() subroutine.  Please read our previous article for information on that routine.  Instead, we'll jump right into our main program that processes the data compiled by getPsefData().

The complete main portion of our program is shown in Listing 1.
#-------------------------  Main  ---------------------------#

#  Step 1:  Get the ps -ef data, and put it in the desired   #
#           arrays and hashes.                               #

getPsefData();

#  Step 2:  Prompt the user for the PID to trace.  #
#           (It's assumed that the user knows      #
#           what PID they want to trace, so they   #
#           are not prompted with a list of PIDs.) #

print "\n\nSpecify the PID whose lineage should be traced: ";
chomp($pid = <STDIN>);

printf("\n\n\t\t%5d \t%s\n", $pid, $ucmd{$pid});

#  Step 3:  Print the lineage of the desired PID.   #
#           Reset $pid each time through the loop.  #

$i = 0;
while ($pid != 0) {
   $parent[$i] = $ppid{$pid};
   # Print each ancestor's PID next to that ancestor's own command name.
   printf("\t\t%5d \t%s\n", $parent[$i], $ucmd{$parent[$i]});
   $pid = $parent[$i];
   $i++;
}
print "\n";


Listing 1:  The main portion of our program (1) calls getPsefData, (2) prompts the user for the PID to search for, and (3) determines the ancestors of the given PID. 

As you can see from the listing, the routine consists of three steps.  The first step is to call the getPsefData() subroutine.  This generates all of the arrays and hashes we need for the rest of the program.

The next step is to prompt the user for the PID whose ancestry they want to trace.  All the user has to do is enter a PID (something like '743'), and the program takes care of the rest of the work.
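
The listing trusts whatever the user types at the prompt.  As a minimal sketch of a more defensive version of step 2, the code below checks the input before the trace begins.  It assumes, as in the listing, a %ppid hash keyed by PID; the helper name clean_pid and the sample data are hypothetical and are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): return the PID
# as a number if the input is a plain non-negative integer that appears
# in the process table; otherwise return nothing (undef in scalar context).
sub clean_pid {
    my ($input, $ppid_ref) = @_;
    return unless defined $input;
    $input =~ s/^\s+|\s+$//g;            # strip surrounding whitespace and the newline
    return unless $input =~ /^\d+$/;     # accept digits only
    my $pid = $input + 0;                # normalize "0743" to 743
    return exists $ppid_ref->{$pid} ? $pid : undef;
}

# Made-up sample data standing in for what getPsefData() would build.
my %ppid = (743 => 626, 626 => 515, 515 => 249, 249 => 0, 0 => 0);

for my $input ("743\n", " 626 ", "abc", "9999") {
    my $pid = clean_pid($input, \%ppid);
    (my $shown = $input) =~ s/\n/\\n/g;  # make the newline visible when printing
    print defined $pid ? "'$shown' -> $pid\n" : "'$shown' -> rejected\n";
}
```

In the real program, the value read from STDIN would be passed to a check like this, and the user would be prompted again whenever it returns undef.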

In step 3, we walk back through the chain of parent processes until we reach PID 0 (which is where every process eventually traces back to).  Using hashes, it's very easy to get each parent process ID just by writing:

   $parent[$i] = $ppid{$pid};

Once we have the parent process ID (PPID) of the current process, we just set $pid equal to this PPID and run the loop again to get the parent of the PPID.  This gives us the grandparent.  After we have the grandparent, we get its parent, and so on, until $pid finally equals 0.

The print statement within the loop gives us the desired output.  If you don't like this format, it's very easy to change this statement to a format you prefer.
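
The loop described above can also be written as a small self-contained function, which makes it easy to try without running ps.  This is only a sketch: the function name ancestors, the guards against unknown PIDs and cycles, and the sample data are assumptions added for illustration.  Only the PIDs and the name intruder-alert come from the example earlier in this article; the other command names are placeholders.

```perl
use strict;
use warnings;

# Return the chain of ancestor PIDs for $pid, nearest parent first.
# %$ppid_ref maps each PID to its parent PID, as %ppid does in the listing.
sub ancestors {
    my ($pid, $ppid_ref) = @_;
    my (@chain, %seen);
    while (defined $pid && $pid != 0) {
        last if $seen{$pid}++;           # guard against a cycle in bad data
        my $parent = $ppid_ref->{$pid};
        last unless defined $parent;     # PID not in the table: stop
        push @chain, $parent;
        $pid = $parent;                  # move one generation up
    }
    return @chain;
}

# Made-up sample data; the command names other than intruder-alert
# are placeholders, not real output.
my %ppid = (743 => 626, 626 => 515, 515 => 249, 249 => 0, 0 => 0);
my %ucmd = (743 => 'intruder-alert', 626 => 'cmd-626',
            515 => 'cmd-515', 249 => 'cmd-249', 0 => 'cmd-0');

printf("%5d \t%s\n", 743, $ucmd{743});
printf("%5d \t%s\n", $_, $ucmd{$_}) for ancestors(743, \%ppid);
```

Returning the chain as a list, instead of printing inside the loop, keeps the output format separate from the lookup, so the printf calls can be changed without touching the loop.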


As you can see from this real-world example, once you've put together a subroutine like getPsefData(), it's very easy to build on that routine to perform very useful work.  In a future example, we'll demonstrate how you can use this same subroutine to determine the most active processes running on your computer system.

Download the code

Click here if you'd like to download the source code for the complete program.

Important notes

The getPsefData() subroutine was written to process the output of the "ps -ef" command on Sun's Solaris operating system.  This is one of those cases where there are differences between Unix operating systems, so the same function may not work properly on other versions of Unix, such as AIX, HP-UX, UnixWare, or FreeBSD.  The output fields on those systems may differ from the fields generated by Solaris, and we haven't tested those systems.  More than likely, small changes will be required for other Unix systems.
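
One way to depend less on the exact column layout of "ps -ef" is to ask ps for only the fields the program needs.  The POSIX specification for ps defines the -o option with the field names pid, ppid, and comm, and an empty header (as in pid=) suppresses the heading line.  The sketch below is not the getPsefData() routine from the previous article: parse_ps_lines is a hypothetical helper, the input is canned sample data, and the ps command shown in the comment has not been tested on the systems listed above.

```perl
use strict;
use warnings;

# Hypothetical helper: build PID => PPID and PID => command hashes from
# lines of the form "  PID  PPID COMMAND".  Lines that don't match are skipped.
sub parse_ps_lines {
    my @lines = @_;
    my (%ppid, %ucmd);
    for my $line (@lines) {
        next unless $line =~ /^\s*(\d+)\s+(\d+)\s+(.*\S)\s*$/;
        $ppid{$1} = $2;
        $ucmd{$1} = $3;
    }
    return (\%ppid, \%ucmd);
}

# Canned input so the sketch runs anywhere.  On a live system the lines
# could come from:  my @lines = `ps -e -o pid= -o ppid= -o comm=`;
my @lines = (
    "    0     0 cmd-0\n",
    "  249     0 cmd-249\n",
    "  743   626 intruder-alert\n",
);

my ($ppid, $ucmd) = parse_ps_lines(@lines);
print "$_ <- $ppid->{$_} ($ucmd->{$_})\n" for sort { $a <=> $b } keys %$ppid;
```

Because only three fields are requested, in a fixed order, the parsing no longer has to account for columns such as STIME or TTY, which vary in width and content between systems.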