Getting started converting documents with Pandoc

By Alvin Alexander. Last updated: July 25, 2020

I’m looking into producing my Scala/FP book as a PDF, and as part of that I have been evaluating Pandoc. Apart from converting HTML tables into other formats such as Markdown or LaTeX, Pandoc has been working well so far.

Here are a few Pandoc commands that show how easy this is:

# create a pdf from a markdown doc (by default this needs a LaTeX engine such as pdflatex)
pandoc test1.md -s -o test1.pdf

# create an html doc from a markdown doc, long form
pandoc test1.md -f markdown -t html -s -o test1.html

# convert markdown to latex (short form, then long form)
pandoc test1.md -s -o test1.tex
pandoc test1.md -f markdown -t latex -s -o test1.tex

# read a markdown doc and print html to stdout
pandoc -s table.md --to html

# convert a latex document to html
pandoc -s test.tex -o test.html

As a “note to self,” I confirmed that LaTeX-to-HTML approach in October 2019. It creates a large, single-page HTML document. It’s not perfect, but it worked surprisingly well on a large LaTeX project.
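
If you run the short-form commands above on more than one file, a small shell wrapper can save some typing. This is only a sketch, not something that ships with Pandoc: the function names out_name, convert, and convert_all are made up for illustration, and it assumes a POSIX shell with pandoc on your PATH. It relies on Pandoc inferring the input and output formats from the file extensions, as the short-form commands do.

```shell
#!/bin/sh

# Hypothetical helper: derive the output file name by swapping the extension.
# $1 = input file (e.g. test1.md), $2 = target extension (e.g. html)
out_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# Hypothetical helper: convert one file to a standalone document.
# Pandoc infers both formats from the file extensions.
convert() {
    pandoc "$1" -s -o "$(out_name "$1" "$2")"
}

# Hypothetical helper: convert every file with one extension to another.
# $1 = source extension, $2 = target extension
convert_all() {
    for f in *."$1"; do
        [ -e "$f" ] || continue   # no matching files: the glob stays literal
        convert "$f" "$2"
    done
}
```

With those functions loaded, `convert_all md html` should convert each Markdown file in the current directory to a matching .html file, and `convert test.tex html` should write test.html.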

As another note to self, this command helps a little with Pandoc’s HTML-to-Markdown table conversion problem (in Pandoc 2.0 and later, markdown_github is deprecated in favor of gfm):

pandoc table.html --to=markdown_github -o table.md

Better yet, both of these commands work when converting tables in ODT and DOCX files to Markdown:

pandoc Test.odt -t markdown-simple_tables-multiline_tables-grid_tables -o ODT.md

pandoc Test.docx -t markdown-simple_tables-multiline_tables-grid_tables -o DOCX.md

I can confirm that those commands create pipe-delimited Markdown tables from ODT and DOCX input files.
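
The long output format in those two commands uses Pandoc’s extension syntax: appending -NAME to a format turns that extension off, so with simple, multiline, and grid tables disabled, pipe tables are what is left for the Markdown writer to use. As a rough sketch, the format string can be assembled like this. The function name format_without and the variable PIPE_ONLY are hypothetical, chosen just for this example.

```shell
#!/bin/sh

# Hypothetical helper: build a Pandoc format spec with extensions disabled.
# $1 = base format, remaining arguments = extension names to turn off.
format_without() {
    spec=$1
    shift
    for ext in "$@"; do
        spec="$spec-$ext"   # "-ext" disables an extension in Pandoc
    done
    printf '%s\n' "$spec"
}

# Same format string as in the ODT and DOCX commands above.
PIPE_ONLY=$(format_without markdown simple_tables multiline_tables grid_tables)

# Usage (left commented out so the sketch runs without input files):
# pandoc Test.docx -t "$PIPE_ONLY" -o DOCX.md
```

Keeping the format string in one variable makes it easier to reuse the same setting for both the ODT and DOCX conversions.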

For more information on Pandoc, see its getting started doc and user’s manual.
