The Drupal Diaries, Day 12

If you're interested in the really short story, this Drupal-based website is now hosted on a new, dedicated Linux server with much more RAM than the old server. For the longer story -- and the details of Apache tweaks and a script to automatically restart Apache when it got hung up -- read on.

Surprised by the Drupal plus LAMP memory use

I was really surprised to find that Drupal and the LAMP architecture (Linux, Apache, MySQL, and PHP) take much more memory than the old Java-based blog software that previously ran the DevDaily site.

The old software was written with pure Java JSPs and servlets, with a Postgres backend, while the new architecture runs Drupal, which sits on the LAMP stack. While Drupal has a ton of flexibility, including hundreds (maybe thousands) of modules and themes, my old Java blog was written for one purpose, to serve as a single-user blog, and it ran fine with only 256 MB RAM.

But with the recent switch to Drupal, I was fighting a daily battle to give the appearance that this site was always up and running. The truth was, Apache was constantly running out of memory, and a "self healing" program (a shell script) I wrote was restarting Apache 10-20 times a day to keep it limping along.

Drupal memory details

It turns out that the LAMP stack and Drupal modules I had installed required roughly 17 MB RAM for each Apache process. As a user commented on my last Drupal blog post, the PHP folks strongly recommend that you use the Apache "prefork" MPM (multi-processing module), and when doing so, each request is handled by its own process, which is where that 17 MB per process comes from. (The "worker" MPM may consume significantly less RAM by using threads instead of processes, but this practice is strongly discouraged in several PHP docs.)
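If you want to check the per-process figure on your own server, here is a minimal sketch of one way to do it. It averages resident set size (RSS) values, in KB, read from standard input. The function name avg_rss_mb and the process name httpd in the commented usage line are assumptions for illustration (on some systems the process is called apache2). RSS counts shared pages in every process, so treat the result as an upper-bound estimate rather than an exact number.

```shell
#!/bin/sh
# avg_rss_mb: hypothetical helper, not part of the original setup.
# Reads one RSS value (in KB) per line on stdin and prints the
# average in whole MB (rounded down); prints 0 if there is no input.
avg_rss_mb() {
  awk '{ sum += $1; n++ }
       END { if (n > 0) printf "%d\n", sum / n / 1024; else print 0 }'
}

# Example usage on Linux (the process name "httpd" is an assumption):
#   ps -o rss= -C httpd | avg_rss_mb
```

Feeding it the output of ps for the Apache processes gives a single number you can plug into the memory math.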

Doing the math, if we start with the old server having 256 MB RAM available, then subtract 24 MB for MySQL, that leaves 232 MB for Apache. Dividing 232 MB by 17 MB per Apache process means I could have a maximum of 13 Apache processes trying to serve up the workload of this site, and that just wasn't going to work. I had a ton of error messages like this in my Apache error log:

[error] server reached MaxClients setting, consider raising the MaxClients setting
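To make the memory arithmetic above easy to repeat with your own numbers, here is a small sketch. The function name max_apache_procs is made up for this example, and all values are in MB. It only models the simple "total minus reserved, divided by per-process size" estimate from the text, not real memory behavior.

```shell
#!/bin/sh
# max_apache_procs: hypothetical helper for this sketch.
# Estimates how many prefork processes fit in RAM (all values in MB).
max_apache_procs() {
  total_mb=$1      # total RAM on the server
  reserved_mb=$2   # RAM set aside for MySQL (and anything else)
  per_proc_mb=$3   # RAM used by one Apache process

  # shell arithmetic is integer-only, so the result is rounded down
  echo $(( ($total_mb - $reserved_mb) / $per_proc_mb ))
}

# the old server: 256 MB total, 24 MB for MySQL, 17 MB per process
max_apache_procs 256 24 17
```

With the numbers from the old server, this prints 13, matching the estimate above.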

Tweaking MaxClients and other Apache settings

I did a lot of tweaking of the Apache MaxClients and related Apache settings, and I was able to get as many as 20-25 Apache clients (processes) running at a time, but Apache would invariably bog down during the busiest times. I know the Apache docs say that when you have more requests than MaxClients will allow at one time, the additional requests will queue up, but as a practical matter, it appeared that my web server was dead, and restarting it with an explicit "stop" command followed by a "start" command was the only way to get a response back. (To be clear, an apachectl restart command did not work here; apachectl stop and then apachectl start was required to solve this problem.)

If it helps anyone with a similar problem to see it, the last Apache prefork settings I had on the old server looked like this:

<IfModule prefork.c>
StartServers           3
MinSpareServers        3
MaxSpareServers       10
MaxClients            25
MaxRequestsPerChild  100
</IfModule>

Those settings worked the best for me, and I'll be glad to explain them to anyone if need be.
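As a rough sanity check on settings like these, the following sketch multiplies MaxClients by the per-process memory figure to get a worst-case footprint. The function name worst_case_mb is hypothetical, and the calculation assumes every process really uses the full 17 MB of private memory, which overstates things if some of that memory is shared between processes.

```shell
#!/bin/sh
# worst_case_mb: hypothetical helper for this sketch.
# Prints max_clients * per_proc_mb, i.e. the memory needed if every
# allowed Apache process is running at its full size.
worst_case_mb() {
  echo $(( $1 * $2 ))
}

needed=$(worst_case_mb 25 17)   # MaxClients 25, 17 MB per process
available=232                   # MB left for Apache on the old server

if [ "$needed" -gt "$available" ]; then
  echo "MaxClients 25 could need ${needed} MB, but only ${available} MB is available"
fi
```

Under that assumption, 25 processes at 17 MB each come to 425 MB, well over the 232 MB budget, which is consistent with the server running out of memory at busy times.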

Self-healing script

Also, if it helps anyone to see it, this is the source code for the "self healing" script I created to (a) test to see if my web server was alive, then (b) restart the server if it appeared to be dead:

#!/bin/sh

# author:  alvin alexander, devdaily.com
# purpose: automatically restart apache if it is not responsive
# created: 2009-08-25

# choose a lightweight url to hit
TEST_URL=http://www.devdaily.com/java/jwarehouse/cobertura-1.9.shtml
LOGFILE=/tmp/auto-restart-apache.log

# try to download a page, then use the wget exit status
/usr/bin/wget --timeout=5 --quiet --tries=2 --output-document=/dev/null "$TEST_URL"
wget_status=$?

echo "" >> $LOGFILE
date    >> $LOGFILE
echo "wget status: $wget_status" >> $LOGFILE

if [ $wget_status -ne 0 ]
then
  # move to apache bin dir (bail out if it's missing)
  cd /usr/local/apache2/bin || exit 1

  # stop apache
  echo "stopping apache at `date`" >> $LOGFILE
  ./apachectl stop

  sleep 2

  # start apache
  echo "starting apache at `date`" >> $LOGFILE
  ./apachectl start
else
  echo "No need for a restart" >> $LOGFILE
fi

As you can see, the script would try to hit a URL on the server, and if it didn't get a response back from the Apache server within roughly 10 seconds (two tries with a 5-second timeout each), it assumed the server was dead, and restarted it. Again, I can explain this more if anyone needs more details.

crontab entry

As a final note, this script was invoked every minute from a crontab entry that looked like this:

* * * * * /tmp/auto-restart-apache.sh 2> /dev/null

I know I dropped some connections over the last week, but in general, restarting Apache like this kept the site alive far more than if I had left it hung up as the system ran out of memory. On a day with 20 Apache restarts, the site may have been down for 40 minutes total, but that's a lot better than just hanging up forever.
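If you want to see how often a script like this is actually restarting Apache, one simple approach is to count the "starting apache at" lines it writes to its log. This is a minimal sketch; the function name count_restarts is made up for illustration, and it counts every restart in the file rather than filtering by day.

```shell
#!/bin/sh
# count_restarts: hypothetical helper for this sketch.
# Prints the number of lines in the given log file that begin with
# the "starting apache at" message written by the restart script.
count_restarts() {
  grep -c '^starting apache at' "$1"
}

# Example usage with the log file from the script above:
#   count_restarts /tmp/auto-restart-apache.log
```

Because the script appends to the same log file on every run, rotating or clearing the log each day is what would make this a per-day count.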

Conclusion

As mentioned, this site now runs on a server with much more RAM. Limping along with 20 Apache restarts a day just isn't a good thing, so after fighting the good fight for nearly two days of Apache tweaks, I finally gave up and got this new server.

And now, it's time to get back to my regularly-scheduled work ...