file

A Scala “file find” utility alvin September 8, 2019 - 6:12pm

I wanted some specific features in a “find” utility, and when I couldn’t figure out how to get them with combinations of find, awk, and other Unix commands, I wrote what I wanted in Scala. Those features are (a) showing matching filenames, (b) showing the line that matches my search pattern, and underlining the pattern in the output, (c) showing the line numbers of the matches, and (d) showing an optional number of lines from the file before and after each match.

Unix/Linux: Find all files that contain multiple strings/patterns

When using Unix or Linux, if you ever need to find all files that contain multiple strings/patterns, — such as finding all Scala files that contain 'try', 'catch', and 'finally' — this find/awk command seems to do the trick:

find . -type f -name *scala -exec awk 'BEGIN {RS=""; FS="\n"} /try/ && /catch/ && /finally/ {print FILENAME}' {} \;

As shown in the image below, all of the matching filenames are printed out. As Monk says, you’ll thank me later. :)

(I should mention that I got part of the solution from this gnu.org page.)

Five good ways (and two bad ways) to read large text files with Scala

I’m working on a small project to parse large Apache access log files, with the file this week weighing in at 9.2 GB and 33,444,922 lines. So I gave myself 90 minutes to try a few different ways to write a simple “line count” program in Scala. (Not my final goal, but something I could use to measure file-reading speed without applying my algorithm.)

A big collection of Unix/Linux `find` command examples

Linux/Unix FAQ: Can you share some Linux find command examples?

Sure. The Unix/Linux find command is very powerful. It can search the entire filesystem to find files and directories according to the search criteria you specify. Besides using the find command to locate files, you can also execute other Linux commands (grep, mv, rm, etc.) on the files and directories you find, which makes find extremely powerful. 

My Scala Sed project: More features, returning strings alvin July 11, 2019 - 7:13pm
Table of Contents1 - Basic use2 - Using a Map3 - Match expressions4 - Sed limitations5 - My Sed project6 - Bonus: Factories and HOFs

My Scala Sed project is still a work in progress, but I made some progress on a new version this week. My initial need this week was to have Sed return a String rather than printing directly to STDOUT. This change gave me more ability to post-process a file. After that I realized it would really be useful if the custom function I pass to Sed had two more pieces of information available to it:

  • The line number of the string Sed passed to it
  • A Map of key/value pairs the helper function could use while processing the file

Note: In this article “Sed” refers to my project, and “sed” refers to the Unix command-line utility.

Back to top

Basic use

In a “basic use” scenario, this is how I use the new version of Sed in a Scala shell script to change the “layout:” lines in 55 Markdown files whose names are in the files-to-process.txt file:

A Scala method to write a list of strings to a file

As a brief note today, here’s a Scala method that writes the strings in a list — more accurately, a Seq[String] — to a file:

def writeFile(filename: String, lines: Seq[String]): Unit = {
    val file = new File(filename)
    val bw = new BufferedWriter(new FileWriter(file))
    for (line <- lines) {
        bw.write(line)
    }
    bw.close()
}

Some Java file utilities

As a bit of warning, this is some old Java code, but if you want to create your own Java file utilities (utility methods), this code might help you get started:

A large collection of Unix/Linux ‘grep’ command examples

Linux grep commands FAQ: Can you share some Linux/Unix grep command examples?

Sure. The name grep means "general regular expression parser", but you can think of the grep command as a "search" command for Unix and Linux systems: it's used to search for text strings and more-complicated "regular expressions" within one or more files.

I think it's easiest to learn how to use the grep command by showing examples, so let's dive right in.