A few times during the past year I got tired of trying to remember the Unix/Linux sed syntax while needing to make edits to many files, so this weekend I wrote a little sed-like Scala class.
Introduction
Actually, I wrote two separate classes for different needs:
- One that lets you operate on each line in a file, one line at a time
- A separate class that lets you look at both the current line and the previous line, to give you a little more flexibility
The basic Sed class was pretty simple to write and is simple to use, though I realize now that I’ll need to change it to handle the “delete line” use case. That wasn’t something I needed, so I didn’t even think about it until now.
The MultilineSed class currently works, but it feels convoluted, so I expect it to change once I look at it when I’m not tired. (I tend to write these things late at night.)
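For the “delete line” use case mentioned above, one possible approach (an assumption on my part, not something the current Sed class does) is to have the function return an Option, where None means “drop this line.” The names DeleteLineSketch, transform, and dropComments below are hypothetical and exist only for this sketch:

```scala
object DeleteLineSketch {

  // Hypothetical variation on String => String:
  // None means "delete this line", Some(s) means "emit s".
  def transform(lines: Iterator[String], f: String => Option[String]): Iterator[String] =
    lines.flatMap(line => f(line).toList)

  // Example function: delete lines that start with "//", keep everything else
  def dropComments(line: String): Option[String] =
    if (line.startsWith("//")) None else Some(line)
}
```

With this shape, a function that never returns None behaves like the plain String => String version.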
Kaleidoscope rocks!
Along the way I learned one cool thing that’s barely related to my work: Jon Pretty’s Kaleidoscope project lets you use string pattern-matching code in Scala match expressions. This enables regex pattern-matching code like this:
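// Kaleidoscope enables this.
// (Assumes Kaleidoscope is on the classpath; this is the import its Scala 2 releases use,
// so adjust it for your version.)
import kaleidoscope._

// Convert Markdown-style header lines to HTML; pass all other lines through unchanged
def updateLine(currentLine: String): String = currentLine match {
    case r"^# ${header}@(.*)"  => s"<h1>$header</h1>"   // "# foo"  -> "<h1>foo</h1>"
    case r"^## ${header}@(.*)" => s"<h2>$header</h2>"   // "## foo" -> "<h2>foo</h2>"
    case _                     => currentLine
}

// Apply updateLine to every line of EXAMPLE.md
val sed = new Sed("EXAMPLE.md", updateLine)
sed.run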
As shown, the ability to use regex pattern-matching code in a Scala match expression is very useful, and I didn’t realize how important Kaleidoscope was until I tried to use a regex in a match expression without it. (This SO post shows some other ways you can try to use pattern matching with match expressions.)
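For comparison, here is a minimal sketch of one way to do the same thing without Kaleidoscope, using the standard library’s scala.util.matching.Regex as an extractor. The object name RegexMatchSketch is hypothetical. Note that each regex has to be declared as a value before it can be used in a case clause, which is the extra ceremony Kaleidoscope removes:

```scala
import scala.util.matching.Regex

object RegexMatchSketch {

  // A Regex used as an extractor must match the whole line,
  // so no explicit ^ or $ anchors are needed here.
  private val H1: Regex = "# (.*)".r
  private val H2: Regex = "## (.*)".r

  def updateLine(currentLine: String): String = currentLine match {
    case H1(header) => s"<h1>$header</h1>"
    case H2(header) => s"<h2>$header</h2>"
    case _          => currentLine
  }
}
```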
How Sed works
Hopefully from that code you can see how the basic Sed class works:
- You define a sed-like function with as many “pattern match and replace” expressions as you need
- If you use a match expression (as shown), the default case just passes along lines you don’t want to modify
- You can implement your function however you want; the only important thing is that the function has the required String => String type
- You pass that function into Sed along with the file you want to read
- Then you call run on that Sed instance
When you call run, Sed invokes your function for each line as it reads the file, and writes its output to STDOUT.
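To make that flow concrete, here is a rough sketch of what a class like this could look like. It is not the actual Sed source; the name SedSketch and the transform helper are hypothetical, and the only things taken from the description above are the String => String function, the file name, and the run method:

```scala
import scala.io.Source

// Hypothetical stand-in for the Sed class, written only to illustrate the flow
class SedSketch(filename: String, f: String => String) {

  // Pure core: lazily apply the user's function to each line
  def transform(lines: Iterator[String]): Iterator[String] = lines.map(f)

  // Read the file line by line and print each transformed line to STDOUT
  def run(): Unit = {
    val source = Source.fromFile(filename)
    try transform(source.getLines()).foreach(println)
    finally source.close()
  }
}
```

Keeping the line-by-line transformation separate from the file handling makes it possible to test the function logic without touching the filesystem.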
As a shameless plug, if some of that discussion about functions doesn’t seem to make sense, I encourage you to read Functional Programming, Simplified.
MultilineSed
As mentioned, there’s also a MultilineSed class that I’m still working on and thinking about, but the benefit of it is that it lets you convert two lines of text like this:
Hello
-----
into one output line like this:
<h2>Hello</h2>
As I note in the documentation, MultilineSed isn’t really a good name for it as it really just lets you look at the current line and previous line, but hey, you gotta start somewhere.
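The “current line plus previous line” idea can be sketched as a fold that holds back one line until it sees what follows. This is only an illustration under my own assumptions, not the MultilineSed implementation; the names PrevLineSketch, isUnderline, and convert are hypothetical:

```scala
object PrevLineSketch {

  // A line made up entirely of dashes, like "-----"
  private def isUnderline(s: String): Boolean =
    s.nonEmpty && s.forall(_ == '-')

  def convert(lines: List[String]): List[String] = {
    // `pending` is the previous line, held back until we see the current one
    val start = (Vector.empty[String], Option.empty[String])
    val (out, pending) = lines.foldLeft(start) {
      // Previous line is text and current line is an underline: emit one <h2> line
      case ((acc, Some(prev)), cur) if isUnderline(cur) && prev.trim.nonEmpty =>
        (acc :+ s"<h2>$prev</h2>", None)
      // Otherwise flush the held-back line and hold the current one
      case ((acc, prevOpt), cur) =>
        (acc ++ prevOpt.toList, Some(cur))
    }
    (out ++ pending.toList).toList
  }
}
```

The trade-off of this approach is that output always lags the input by one line, since a line can’t be written until the next one has been inspected.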
The source code
The source code for my Sed project is here:
Hopefully there’s enough documentation there to get you started. In addition to the documentation, the repo also includes two demo classes and an example Scala shell script.
If you look in that project you’ll see that the Sed class is simple. I just use a Scala BufferedSource to read the file, but you may be able to use a Java PushbackReader or other Scala file-streaming solutions, as desired. Also, the Sed sub-project has a little to-do list in TODO.md, and hopefully it will answer any “Why didn’t you ___?” questions. Finally, feel free to fork the project for your own needs.
All the best,
Alvin Alexander
Louisville, Colorado