I’m working on producing my Scala/FP book as a PDF, and as part of that I have been looking into Pandoc. Apart from converting HTML tables into other formats such as Markdown or LaTeX, Pandoc has been working well so far.
Here are a couple of Pandoc commands to show you how easy this is:
    # create a pdf from a markdown doc
    pandoc test1.md -s -o test1.pdf

    # create an html doc from a markdown doc, long form
    pandoc test1.md -f markdown -t html -s -o test1.html

    # convert markdown to latex
    pandoc test1.md -s -o test1.tex
    pandoc test1.md -f markdown -t latex -s -o test1.tex

    # read a markdown doc and print html to stdout
    pandoc -s table.md --to html

    # convert a latex document to html
    pandoc -s test.tex -o test.html
As a “note to self,” I confirmed that LaTeX-to-HTML approach in October 2019. It creates one large, single-page HTML document. It’s not perfect, but it worked surprisingly well on a large LaTeX project.
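To convert more than one file at a time, the commands above can be wrapped in a small loop. This is only a minimal sketch, assuming a POSIX shell. The names out_name and convert_all are hypothetical helpers, not Pandoc features, and the loop relies on Pandoc inferring the output format from the extension given to -o.

```shell
# Hypothetical helpers for illustration; neither is part of Pandoc.

# Swap a file's extension: "out_name test1.md html" prints "test1.html".
out_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# Convert every .md file in the current directory to the given extension.
# Stops at the first file Pandoc fails on.
convert_all() {
    for src in *.md; do
        [ -e "$src" ] || continue   # the glob matched nothing
        pandoc "$src" -s -o "$(out_name "$src" "$1")" || return 1
    done
}
```

With those functions loaded, running convert_all html in a directory of Markdown files should write one .html file next to each .md file.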
As another note to self, this command helps a little bit with the Pandoc HTML-to-Markdown table conversion problem (newer Pandoc releases deprecate markdown_github in favor of gfm):
pandoc table.html --to=markdown_github -o table.md
Better yet, both of these commands work when converting tables in ODT and DOCX files to Markdown:
    pandoc Test.odt -t markdown-simple_tables-multiline_tables-grid_tables -o ODT.md
    pandoc Test.docx -t markdown-simple_tables-multiline_tables-grid_tables -o DOCX.md
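I can confirm that those commands create pipe-delimited Markdown tables from ODT and DOCX input files. The -simple_tables-multiline_tables-grid_tables suffix turns off Pandoc’s other Markdown table styles, which leaves pipe tables as the only style the writer can use.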
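A quick way to check a converted file for a pipe table is to look for the delimiter row under the table header. The sketch below assumes that row starts with a pipe character and contains only pipes, dashes, colons, and spaces. The name has_pipe_table is a hypothetical helper, not a Pandoc command.

```shell
# Hypothetical helper: succeed if the file has a pipe-table delimiter row,
# e.g. |-----|:---:|. Grid tables (rows starting with '+') and simple
# tables (rows starting with '-' or a space) do not match.
has_pipe_table() {
    grep -Eq '^\|[-:| ]*-[-:| ]*$' "$1"
}
```

For example, has_pipe_table ODT.md && echo "pipe table found" could be run right after the ODT conversion.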
For more information on Pandoc, see their getting started doc and user’s manual.