
Analyzing Apache access logs with Spark and Scala

I want to analyze some Apache access log files for this website, and since those log files contain hundreds of millions (billions?) of lines, I thought I'd roll up my sleeves and dig into Apache Spark to see how it works, and how well it works. I used Hadoop several years ago, and as a quick summary, I found the transition from Hadoop to Spark to be easy. Here are my notes.
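
To give a sense of what that kind of analysis looks like, here's a minimal Spark/Scala sketch, not the code from the post itself, that reads an access log and counts requests per HTTP status code. The object name, the log path, and the log-format regular expression are all placeholders:

    import org.apache.spark.sql.SparkSession

    object AccessLogStatusCounts {

      def main(args: Array[String]): Unit = {
        // Count Apache access-log requests per HTTP status code, running Spark locally.
        // The log path below is a placeholder.
        val spark = SparkSession.builder()
          .appName("AccessLogStatusCounts")
          .master("local[*]")
          .getOrCreate()

        val logLines = spark.sparkContext.textFile("/path/to/access_log")

        // Common/Combined Log Format: host ident user [date] "request" status ...
        val statusPattern = """^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) """.r

        val statusCounts = logLines
          .flatMap(line => statusPattern.findFirstMatchIn(line).map(_.group(1)))
          .countByValue()

        statusCounts.toSeq.sortBy(-_._2).foreach { case (status, count) =>
          println(s"$status: $count")
        }

        spark.stop()
      }
    }

Using local[*] keeps everything on one machine, which is enough for experimenting with the API before pointing the same code at a cluster.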

A Scala REST 'get content' client function using Apache HttpClient

As a quick post here today, if you need a Scala REST client function, the following source code should work for you, or at least be a good starting point. I'm using it in several applications today, and the only thing I think it needs is the ability to set a connection timeout and socket timeout, and I share the code for that down below.

Here's my Scala REST 'get content' client function, using the Apache HttpClient library:
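
A minimal version of that kind of function, assuming Apache HttpClient 4.x, might look like the following sketch. The object name is a placeholder, and the five-second connection and socket timeouts are example values rather than anything from the post:

    import org.apache.http.client.config.RequestConfig
    import org.apache.http.client.methods.HttpGet
    import org.apache.http.impl.client.HttpClients
    import org.apache.http.util.EntityUtils

    object RestGetClient {

      // Returns the body of the given URL as a String.
      // The 5-second connection and socket timeouts are placeholder values.
      def getRestContent(url: String): String = {
        val requestConfig = RequestConfig.custom()
          .setConnectTimeout(5000)   // time allowed to establish the connection
          .setSocketTimeout(5000)    // time allowed to wait for data
          .build()
        val httpClient = HttpClients.custom()
          .setDefaultRequestConfig(requestConfig)
          .build()
        try {
          val response = httpClient.execute(new HttpGet(url))
          try {
            EntityUtils.toString(response.getEntity)
          } finally {
            response.close()
          }
        } finally {
          httpClient.close()
        }
      }
    }

A call like RestGetClient.getRestContent(url) then returns the response body as a String, or throws an exception if the request fails.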

Apache NameVirtualHost configuration using MAMP on Mac OS X

Since I can't seem to ever remember this, here are some notes on how to configure a Name Virtual Host (NameVirtualHost) on an Apache web server. In particular, this is from the httpd.conf configuration file that I use with MAMP on one of my Mac OS X development systems.

In short, I'm developing two different applications, one named "cato" and another named "zenf", and these are the important name-based virtual host lines from my Apache configuration file:
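
A name-based virtual host setup along those lines, for the Apache 2.2 that ships with MAMP, might look like the following sketch; the port and the DocumentRoot paths are placeholders rather than the exact lines from the post:

    NameVirtualHost *:80

    <VirtualHost *:80>
        ServerName cato
        DocumentRoot "/Users/al/Sites/cato"
    </VirtualHost>

    <VirtualHost *:80>
        ServerName zenf
        DocumentRoot "/Users/al/Sites/zenf"
    </VirtualHost>

For this to work on a local development system, each ServerName also needs an entry in /etc/hosts that points to 127.0.0.1.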