A little Scala `sed` class

A few times during the past year I needed to make edits to many files and got tired of trying to remember the Unix/Linux sed syntax, so this weekend I wrote a little sed-like Scala class.

Introduction

Actually, I wrote two separate classes for different needs:

  • One that lets you operate on each line in a file, one line at a time
  • A separate class that lets you look at both the current line and the previous line, to give you a little more flexibility

The basic Sed class was pretty simple to write and is simple to use, though I realize now that I’ll need to change it to handle the “delete line” use case. That wasn’t something I needed, so I didn’t even think about it until now.
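Since the function you hand to Sed returns a String for every input line, it has no way to say “drop this line.” One possible way to add that, and this is only a sketch of an idea rather than how the class works or will work, is to have the function return an Option[String], where None means “delete.” The names DeleteLineSketch, dropComments, and applyTo are made up for this example:

```scala
object DeleteLineSketch {
  // None means "delete this line"; Some(text) means "emit text".
  def dropComments(line: String): Option[String] =
    if (line.trim.startsWith("//")) None else Some(line)

  // Apply the function to every line, keeping only the lines it returns.
  def applyTo(lines: Seq[String], f: String => Option[String]): Seq[String] =
    lines.flatMap(line => f(line))
}
```

With this shape, a function that never deletes anything just wraps its result in Some.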

The MultilineSed class currently works, but it feels convoluted, so I expect it to change once I look at it when I’m not tired. (I tend to write these things late at night.)

Kaleidoscope rocks!

Along the way I learned one cool thing that’s barely related to my work: Jon Pretty’s Kaleidoscope project lets you use string pattern-matching code in Scala match expressions. This enables regex pattern-matching code like this:

// Kaleidoscope enables this (assuming the library is on the classpath):
import kaleidoscope._

def updateLine(currentLine: String): String = currentLine match {
    case r"^# ${header}@(.*)"  => s"<h1>$header</h1>"  // "# foo"  -> "<h1>foo</h1>"
    case r"^## ${header}@(.*)" => s"<h2>$header</h2>"  // "## foo" -> "<h2>foo</h2>"
    case _                     => currentLine
}

val sed = new Sed("EXAMPLE.md", updateLine)
sed.run

As shown, being able to use regex patterns directly in a Scala match expression is very useful. I didn’t realize how important Kaleidoscope was until I tried to use a regex in a match expression without it. (This Stack Overflow post shows some other ways you can try to use pattern matching with match expressions.)
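For comparison, here’s a rough sketch of one way to write the same two header rules with only the standard library’s scala.util.matching.Regex. Each pattern has to be declared as a value before it can be used as an extractor, which is the extra ceremony Kaleidoscope removes. The object name RegexMatchSketch and the values H1 and H2 are just names chosen for this example:

```scala
import scala.util.matching.Regex

object RegexMatchSketch {
  // Each capture group in a pattern becomes one variable in the extractor.
  val H1: Regex = "^# (.*)".r
  val H2: Regex = "^## (.*)".r

  def updateLine(currentLine: String): String = currentLine match {
    case H1(header) => s"<h1>$header</h1>"
    case H2(header) => s"<h2>$header</h2>"
    case _          => currentLine
  }
}
```

A Regex used this way has to match the entire line, so a line that starts with two pound signs can’t be picked up by the H1 pattern.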

How Sed works

Hopefully from that code you can see how the basic Sed class works:

  • You define a sed-like function with as many “pattern match and replace” expressions as you need
  • If you use a match expression (as shown), the default case just passes along lines you don’t want to modify
  • You can implement your function however you want; the only important thing is that the function has the required String => String type
  • You pass that function into Sed along with the file you want to read
  • Then you call run on that Sed instance

When you call run, Sed invokes your function on each line as it reads the file, and writes each result to STDOUT.
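If it helps to see that flow in code, here’s a minimal sketch of a class with the same shape. To be clear, this is not the actual Sed source; SedSketch and its transform method are hypothetical stand-ins, and the sketch assumes Scala 2.13 or newer for scala.util.Using:

```scala
import scala.io.Source
import scala.util.Using

// Hypothetical stand-in for the real Sed class, written only to show the flow.
class SedSketch(filename: String, f: String => String) {
  // The pure part: apply the user's function to every line.
  def transform(lines: Iterator[String]): Iterator[String] = lines.map(f)

  // Read the file line by line and print each transformed line to STDOUT.
  // Using.resource closes the file when the block finishes.
  def run(): Unit =
    Using.resource(Source.fromFile(filename)) { source =>
      transform(source.getLines()).foreach(println)
    }
}
```

Keeping the line transformation separate from the file handling makes it easy to try your function on a few in-memory lines before pointing it at a real file.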

As a shameless plug, if some of that discussion about functions doesn’t seem to make sense, I encourage you to read Functional Programming, Simplified.

MultilineSed

As mentioned, there’s also a MultilineSed class that I’m still working on and thinking about. Its benefit is that it lets you convert two lines of text like this:

Hello
-----

into one output line like this:

<h2>Hello</h2>

As I note in the documentation, MultilineSed isn’t a good name for it, because it really just lets you look at the current line and the previous line, but hey, you gotta start somewhere.
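To show why looking at the previous line is enough for this conversion, here’s a small sketch that works on an in-memory list of lines. It is not the MultilineSed API; UnderlineHeaderSketch, isUnderline, and convert are names invented for this example. The trick is to hold each line back until the next one has been seen:

```scala
object UnderlineHeaderSketch {
  // True for a line made only of dashes, such as "-----".
  private def isUnderline(line: String): Boolean =
    line.nonEmpty && line.forall(_ == '-')

  // Carry the previous line along as "pending" so that a dashed line
  // can turn it into a header instead of emitting both lines.
  def convert(lines: Seq[String]): Seq[String] = {
    val start = (Vector.empty[String], Option.empty[String])
    val (done, pending) = lines.foldLeft(start) {
      case ((out, Some(prev)), line) if isUnderline(line) && prev.trim.nonEmpty =>
        (out :+ s"<h2>$prev</h2>", None)
      case ((out, prev), line) =>
        (out ++ prev, Some(line))
    }
    done ++ pending
  }
}
```

A dashed line with no text line before it is passed through unchanged, which is one of several edge cases a real implementation would need to decide on.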

The source code

The source code for my Sed project is here:

Hopefully there’s enough documentation there to get you started. In addition to the documentation, the repo also includes two demo classes and an example Scala shell script.

If you look in that project you’ll see that the Sed class is simple. I just use a Scala BufferedSource to read the file, but you may be able to use a Java PushbackReader or other Scala file-streaming solutions, as desired. Also, the Sed sub-project has a little to-do list in TODO.md, and hopefully it will answer any “Why didn’t you ___?” questions. Finally, feel free to fork the project for your own needs.

All the best,
Alvin Alexander
Louisville, Colorado