How to replace newline character with sed on Mac OS X (macOS)

I don’t have much time to explain this today (I’ll try to remember to update it when I get back), but if you’re interested in seeing how to use the sed command on a Mac OS X (macOS) system to match newline characters in the search pattern and replace them with something else in the replacement pattern, this should point you in the right direction.

The problem

My problem is that I have a bunch of files with dozens to hundreds of paragraphs that are broken across lines like this:

Lorem ipsum dolor sit amet,
consectetur adipiscing elit,

sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.

(Those are very short sentences and paragraphs for this example.)

What I want are continuous paragraphs with no unnecessary line breaks, so I want to use sed to create output like this:

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

The solution

To solve the problem I first put this sed command in a file named sed.cmds:

s/([a-zA-Z,`])\n([a-zA-Z`])/\1 \2/g

When I then tried to run the command like this:

sed -E -f sed.cmds Input.txt > Output.txt

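the command didn’t work: the output still had every line break in it. By default sed applies its script to one input line at a time, with the trailing newline already stripped, so the \n in my pattern never had anything to match. After a lot of searching I finally found this Stack Overflow thread, and in short, the solution is to run this sed command instead: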

sed -e ':a' -e 'N' -e '$!ba' -E -f sed.cmds Input.txt > Output.txt

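When I run that sed command with my sed.cmds file, the \n in the search pattern successfully matches the newline characters in the sed input stream, and the replacement pattern puts a blank space in their place. The extra -e expressions make the difference: they loop over the input, appending each line to sed’s pattern space with N, so the substitution sees the whole file at once rather than one line at a time.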
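If you want to try this yourself, here is a minimal sketch that runs both commands against the sample text from above. It assumes the input lines have no trailing spaces (a space before the newline would keep the pattern from matching), and the scratch directory and the Unchanged.txt file name are just for this illustration.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input: the two short paragraphs from above, with no trailing spaces.
cat > Input.txt <<'EOF'
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,

sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.
EOF

# The substitution command, exactly as it appears in sed.cmds.
cat > sed.cmds <<'EOF'
s/([a-zA-Z,`])\n([a-zA-Z`])/\1 \2/g
EOF

# First attempt: sed sees one line at a time, so \n never matches
# and the text comes out with its line breaks intact.
sed -E -f sed.cmds Input.txt > Unchanged.txt

# Working version:
#   :a    defines a label named "a"
#   N     appends the next input line (and its newline) to the pattern space
#   $!ba  branches back to "a" on every line except the last
# Once the loop ends, the substitution runs against the whole file.
sed -e ':a' -e 'N' -e '$!ba' -E -f sed.cmds Input.txt > Output.txt

cat Output.txt
```

The blank line between the two paragraphs survives because the second capture group requires a letter or backtick right after the newline, and a blank line starts with another newline.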

Using the search pattern in the replacement pattern

One other note: The \1 and \2 in the replacement pattern let me reuse the text matched by the two groups that I “capture” in the search pattern. Here’s a quick look at how they relate:

\1    ([a-zA-Z,`])
\2    ([a-zA-Z`])

The regex inside each pair of () parentheses is a capture group, and \1 and \2 are backreferences that hold whatever text the first and second groups matched, so you can use that text in the replacement pattern.
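Capture groups are easier to see on a smaller example. This sketch is not part of the original fix; it uses a made-up two-word input and simply swaps the words, to show that \1 and \2 carry the matched text into the replacement.

```shell
# Swap two words: \1 holds what the first group matched, \2 what the second matched.
swapped=$(printf 'hello world\n' | sed -E 's/([a-z]+) ([a-z]+)/\2 \1/')

echo "$swapped"   # prints: world hello
```

In the newline command the same mechanism puts back the character before the line break and the character after it, so only the newline itself is replaced by a space.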

That’s all (for now)

I haven’t looked into all of those sed command line options to see which ones are truly needed and which ones aren’t, but at the moment I can confirm that this works with the sed that ships with macOS 10.12.1 (Sierra), properly finding the newline characters in sed’s input stream.
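One possible simplification, offered as a sketch rather than something confirmed on the macOS version above: sed runs its -e and -f script fragments in the order they are given, so the three loop commands can live in the script file ahead of the substitution. The file name join.cmds and the short sample input are hypothetical.

```shell
# Scratch directory and a tiny made-up input with one wrapped line per paragraph.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'one,\ntwo\n\nthree\nfour\n' > Input.txt

# The loop and the substitution together in one script file.
cat > join.cmds <<'EOF'
# Read the whole input into the pattern space.
:a
N
$!ba
# Join lines where a letter, comma, or backtick precedes the newline
# and a letter or backtick follows it.
s/([a-zA-Z,`])\n([a-zA-Z`])/\1 \2/g
EOF

# Only -E (extended regular expressions) and -f remain on the command line.
sed -E -f join.cmds Input.txt > Output.txt

cat Output.txt
```

The -E option still has to be on the command line, because the unescaped () groups in the pattern depend on extended regular expression syntax.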
