My Scala Apache access log parser library

Last week I wrote an Apache access log parser library in Scala to help me analyze my Apache HTTP access log records with Apache Spark. The source code for that project is hosted here on GitHub. You can use this library to parse Apache access log “combined” records from Scala, Java, and other JVM-based programming languages.
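
To give a rough idea of what parsing a “combined” record involves, here is a minimal Python sketch. It is not the Scala library's code, and the names `COMBINED_PATTERN` and `parse_record` are made up for this illustration. It assumes the standard combined layout (host, identity, user, timestamp, request line, status, size, referer, user agent) and does not handle escaped quotes inside the quoted fields. The sample line is the well-known example from the Apache documentation.

```python
import re
from typing import Optional

# Assumed layout: %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
# Hypothetical pattern for illustration; escaped quotes inside the
# quoted fields are not handled.
COMBINED_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_record(line: str) -> Optional[dict]:
    """Return the fields of one combined-format record, or None if it does not match."""
    match = COMBINED_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Apache writes "-" when no bytes were sent; represent that as None.
    record["size"] = None if record["size"] == "-" else int(record["size"])
    return record


SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

if __name__ == "__main__":
    print(parse_record(SAMPLE))
```

Returning `None` for lines that do not match lets a caller count or skip malformed records instead of stopping on the first bad line, which matters when a log file runs to millions of lines.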

Analyzing Apache access logs with Spark and Scala (a tutorial)

I want to analyze some Apache access log files for this website, and since those log files contain hundreds of millions (billions?) of lines, I thought I’d roll up my sleeves and dig into Apache Spark to see how it works, and how well it works. I used Hadoop several years ago, and as a quick summary, I found the transition to Spark to be easy. Here are my notes.

Manual PHP and Drupal 6 web access logging

There was a little funky activity on a client's Drupal 6 website that was hosted at GoDaddy, and without access to an Apache access log file, I wanted to be able to see what was going on. So I wrote the following PHP code snippet to do some manual logging, and placed it in the Drupal theme's page.tpl.php file:

Perl and Apache - How to parse Apache access log file records in Perl

Perl Apache log file FAQ: Can you demonstrate how to read an Apache access log file in Perl (How to parse an Apache access log file in Perl)?

I've provided Perl examples before that can be used to read and parse an Apache log file ("How many RSS feed readers do I have?", "A Perl program to read an Apache access log file"), but to make this code a little easier to find, I'm breaking that code out here.