pandoc

How to convert AsciiDoc to Markdown

alvin, January 27, 2019 - 6:06pm

It turns out that converting AsciiDoc to HTML without including a bunch of undesired CSS is a problem, and converting AsciiDoc to Markdown is also a problem. The page I linked to shows the best way I’ve found to convert AsciiDoc to Markdown, which can then be converted to CSS-free HTML. In case that page ever disappears, the basic commands are:

Install pandoc and asciidoc:

sudo apt install pandoc asciidoc

Convert asciidoc to docbook:

asciidoc -b docbook foo.adoc

Convert docbook to markdown:
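The DocBook-to-Markdown step can be sketched with Pandoc as shown here. This is a minimal sketch and not necessarily the exact command from the linked page. It assumes asciidoc wrote its DocBook output to foo.xml (its default output name for foo.adoc). The docbook_to_markdown helper name and the markdown_strict output format are also assumptions; -t markdown would select Pandoc's extended Markdown instead.

```shell
# Hypothetical helper: convert a DocBook file to Markdown with pandoc.
#   $1 = DocBook input (e.g. foo.xml), $2 = Markdown output (e.g. foo.md)
docbook_to_markdown() {
  pandoc -f docbook -t markdown_strict "$1" -o "$2"
}

# Usage (assumes foo.xml exists in the current directory):
#   docbook_to_markdown foo.xml foo.md
```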

How to convert Docbook to AsciiDoc

If you ever need to convert Docbook to AsciiDoc, this Pandoc command seems to work well:

pandoc --wrap=none -f docbook -t asciidoc \
       DocbookFile.xml > AsciiDocFile.adoc

How to convert HTML to AsciiDoc with Pandoc

If you ever need to convert HTML to AsciiDoc, I just used this Pandoc command and it seems to work well:

pandoc --wrap=none -f html -t asciidoc myfile.html > myfile.adoc

The --wrap=none part of that command isn’t strictly necessary, but without it Pandoc hard-wraps the paragraph text (at 72 columns by default), which I don’t like because I’ll be editing the resulting AsciiDoc text.

Here’s some of the AsciiDoc text that this command generated:

How to convert AsciiDoc to HTML

As a brief note to self, if you need to convert an AsciiDoc file named test1.adoc to HTML format, this command works:

asciidoc -o test1.html test1.adoc

Of course a key here is that you need the asciidoc command installed. I installed it on my Mac with Homebrew, something like brew install asciidoc. I don’t like the HTML that this approach generates, so I’ll keep looking for something better.

macOS, Pandoc, PDFs, and MacTeX

Note to self: When trying to use Pandoc to create a PDF on macOS, you need to install MacTeX separately. Install everything, because it will make things much easier later.
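Once a LaTeX distribution is in place, the PDF step itself is short. Here is a minimal sketch, assuming Pandoc's default LaTeX engine (pdflatex) is on the PATH. The md_to_pdf helper and the notes.md file name are hypothetical.

```shell
# Hypothetical helper: render a Markdown file to a PDF next to the source.
# Assumes a LaTeX distribution such as MacTeX is installed, because pandoc
# hands PDF generation to a LaTeX engine (pdflatex by default).
#   $1 = Markdown input, e.g. notes.md -> notes.pdf
md_to_pdf() {
  pandoc "$1" -o "${1%.md}.pdf"
}

# Usage:
#   md_to_pdf notes.md
```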

A custom TextMate command that uses ‘sed’

In this post I share the contents of a custom TextMate command I just created that uses pandoc and sed to convert markdown content in the TextMate editor to a “pretty printer” version of HTML:

#!/bin/sh

PATH=$PATH:/usr/local/bin

# note: 'sed -E' enables extended regular expressions

# use pandoc to convert from markdown to html,
# then use sed to clean up the resulting html
pandoc -f markdown -t html |\
sed -Ee "/<p|<h2|<h3|<h4|<aside|<div|<ul|<ol/i\\
\\"

You can try to use a command like tidy to clean the HTML, but the version of tidy I have does not know about HTML5 tags. The TextMate Markdown plugin also doesn’t work the way I want it to. Besides that, I’m trying to learn more about writing TextMate commands anyway.

As an important note, when you set this up as a TextMate command and then run it, it will convert the TextMate editor contents from markdown to HTML.

(In a related note, serenity.de is also a good resource for TextMate command and bundle documentation.)

In summary, this code shows:

* How to execute a Unix shell command from TextMate
* Specifically, how to execute a sed command from TextMate
* How to use extended (“modern”) regular expressions with sed (the -E option)
* How to search for multiple regex search patterns with sed
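The multiple-pattern idea can be tried outside TextMate with a small sketch like this one. It is not the command from the script above: the add_blank_lines name and the sample input are hypothetical, and the x;p;x idiom is swapped in for the i\ command because the i\ syntax differs between sed implementations.

```shell
# Hypothetical helper: print a blank line before every line that matches
# any of the alternatives in the extended regex (here <p or <h2).
# x swaps in the (empty) hold space, p prints it, x swaps the line back.
add_blank_lines() {
  sed -E '/<p|<h2/{x;p;x;}'
}

# Try it on three sample lines; only the <h2> and <p> lines match.
printf '%s\n' '<h2>Title</h2>' '<p>one</p>' '<span>two</span>' | add_blank_lines
```

With that input, a blank line is printed before the h2 line and before the p line, while the span line is left alone.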

Markdown comments syntax: Comments that won’t appear in generated output

Markdown FAQ: How do I create comments in Markdown? Especially comments that won’t appear in the generated output.

Part 1 of my answer is that technically there is no way — or at least no standard way — to create comments in Markdown documents, other than to use HTML comments like this:
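As a sketch, the shell snippet below writes a small Markdown file that contains an HTML comment, then strips the comment before conversion. The file names are hypothetical, and the sed step is an added illustration rather than part of the answer above. It only handles comments that sit on a single line by themselves. Depending on the converter, an HTML comment may still be copied into the generated HTML source even though a browser does not display it, which is why removing it first can be useful.

```shell
# Write a small Markdown file (notes.md is a hypothetical name).
cat > notes.md <<'EOF'
# My heading

<!-- This is a comment; a browser will not display it. -->

Some visible text.
EOF

# Delete lines that consist only of a single-line HTML comment, so the
# comment cannot reach the generated output at all.
sed -E '/^<!--.*-->$/d' notes.md > notes-clean.md
```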

Getting started converting documents with Pandoc

I’m looking into producing my Scala/FP book as a PDF, and as part of that I have been looking into Pandoc. With the exception of converting HTML tables into other formats such as Markdown or LaTeX, Pandoc has been working well so far.

Here are a couple of Pandoc commands to show you how easy this is:
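As an illustration of the kind of command involved, here is a minimal sketch that converts a Markdown file to HTML and to LaTeX. The helper names and the chapter.md file name are hypothetical, and these are not necessarily the commands from the original post.

```shell
# Hypothetical helpers. -s (--standalone) asks pandoc for a complete
# document rather than a fragment.
#   $1 = Markdown input, e.g. chapter.md -> chapter.html / chapter.tex
md_to_html()  { pandoc -s -f markdown -t html  "$1" -o "${1%.md}.html"; }
md_to_latex() { pandoc -s -f markdown -t latex "$1" -o "${1%.md}.tex"; }

# Usage:
#   md_to_html chapter.md
#   md_to_latex chapter.md
```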