parsing

Five good ways (and two bad ways) to read large text files with Scala

I’m working on a small project to parse large Apache access log files, with the file this week weighing in at 9.2 GB and 33,444,922 lines. So I gave myself 90 minutes to try a few different ways to write a simple “line count” program in Scala. (Not my final goal, but something I could use to measure file-reading speed without applying my algorithm.)
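For a rough idea of what such a line-count program can look like, here is a minimal sketch using `scala.io.Source`. It is not necessarily one of the approaches measured in the article. The object name `LineCount` is hypothetical, the ISO-8859-1 encoding is an assumption chosen so that stray bytes in a log file cannot trigger a decoding error, and `scala.util.Using` assumes Scala 2.13 or later.

```scala
import scala.io.Source
import scala.util.Using

// Hypothetical helper object, for illustration only.
object LineCount {

  // Counts the lines in a file by streaming it. getLines() returns a lazy
  // iterator, so the whole file is never held in memory at once.
  // ISO-8859-1 is an assumption: it maps every byte to a character, so
  // malformed input in a log file does not abort the count.
  def countLines(path: String): Long =
    Using.resource(Source.fromFile(path, "ISO-8859-1")) { source =>
      source.getLines().foldLeft(0L)((count, _) => count + 1)
    }
}
```

Calling `LineCount.countLines("access.log")` (with a hypothetical file name) returns the count as a `Long`, which leaves plenty of headroom for files with tens of millions of lines.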

An Android Java, JSON, and Twitter REST API example

I don't get to parse much JSON with Java, because the biggest JSON source I work with is Twitter, and I always use the Twitter4J project to interact with their web services. But a few days ago, while working on an Android project, I just wanted to access their "Twitter Trends" REST service, so I used Java and the json.org Java library that comes with Android to parse the Twitter Trends JSON feed like this:

Parsing “real world” HTML with Scala, HTMLCleaner, and StringEscapeUtils

While XML parsers work great for well-formed XML, out on the 'real world' internet you can't count on HTML being XHTML, or even being well-formed. As a result, various 'HTML cleaner' libraries for Java have appeared. They attempt to clean up the HTML so you can parse it.

Scala YAML parser examples using Snakeyaml

Summary: A Scala YAML parsing example using the Snakeyaml parser.

If you need some Scala YAML parsing examples using the Snakeyaml parser, you've come to the right place. I just worked through some Snakeyaml issues related to Scala, in particular converting YAML to JavaBean classes written in Scala, so I thought I'd share the source code here.

A Scala JSON array parsing example using Lift-JSON

I just worked through some Scala Lift-JSON issues, and thought I'd share some source code here.

In particular, I'm trying to parse a JSON document into Scala objects, and I'm using Lift-JSON to do so. One of the things I'm doing here is parsing some of the JSON text into an array of objects, in this case an array of String objects.

First, here's the Scala source code for my example, and then a description will follow:

Fri, Nov 7, 2003

I just started using an open source spell checking tool for Java. Its name is Jazzy, and the authors have created a site for this tool on SourceForge. It is only at version 0.5, so I'm a little concerned about recommending it at this time, but I've created two test programs, and they both seem to work okay. I'll release those test programs shortly, because as you might have guessed by now, I'm a big believer in learning how to use tools, or how to get started with them, by working with examples. Another cool thing for LaTeX weirdos like me ...