I recently started using AsciiDoc to write a new book. A great thing about it is that, unlike Markdown, AsciiDoc gives you all of the features you want in a book, including linking between anything, captions for tables and figures, indexes, etc. Because this got me started using AsciiDoc, I thought, “Wouldn’t it be nice if I could also use AsciiDoc to write blog posts like this one?”
Sadly, I quickly ran into a problem: I couldn’t find a good way to convert AsciiDoc into HTML, or even Markdown. There are tools to convert AsciiDoc to HTML, but for some reason they take the approach of including a ton of markup in the HTML (divs, spans, and attributes), and as far as I can tell there’s no way to turn off that markup.
A shell script solution
I ended up not using this code, but if you want to see one way to use Jsoup’s OutputSettings (Document.OutputSettings) class to set some parameters before calling Jsoup.clean, I hope this is helpful:
// tried some things to improve the html output
val settings: OutputSettings = new OutputSettings
settings.prettyPrint(true)  //`true` is default
settings.charset("UTF-8")
settings.outline(true)      //this is close to what i want, but too extreme
settings.indentAmount(4)
val cleanHtml: String = Jsoup.clean(html, "", wl, settings)
I can attest that this code works; it’s just not what I need at the moment.
Also, the code shown is written in Scala, but as you can see, it converts easily to Java.
It turns out that converting AsciiDoc to HTML without including a bunch of undesired CSS is a problem, and converting AsciiDoc to Markdown is also a problem. The page I linked to shows the best way I’ve found to convert AsciiDoc to Markdown, which can then be converted to CSS-free HTML. In case that page ever disappears, the basic commands are:
Install pandoc and asciidoc:
sudo apt install pandoc asciidoc
Convert asciidoc to docbook:
asciidoc -b docbook foo.adoc
Convert docbook to markdown:
pandoc --wrap=none -f html -t asciidoc myfile.html > myfile.adoc
The wrapping part of that command isn’t 100% necessary, but if you don’t use it, Pandoc will wrap the plain paragraph text, which I don’t like because I’ll be editing the resulting AsciiDoc text.
Here’s some of the AsciiDoc text that this command generated:
As a brief note to self, if you need to convert an AsciiDoc file named test1.adoc to HTML format, this command works:
asciidoc -o test1.html test1.adoc
Of course a key here is that you need the asciidoc command installed. I installed it on my Mac with Homebrew, something like brew install asciidoc. I don’t like the HTML that this approach generates, so I’ll keep looking for something better.
I have a 19" monitor on the counter between my kitchen and living room, and it’s powered by a Raspberry Pi. I use the Linux Phosphor screen saver to show a scrolling “news and stock ticker” on the display, which I’ve programmed to show news from several different sources (Atom and RSS feeds, along with other news and data sources). An old version of the display looks like this:
Today I added a new “Word of the day” feature to the display, and as with all of the other code, I wrote a Scala shell script to generate the output.
To make the online reading a little easier, I’ve put a free preview version of Functional Programming, Simplified on fpsimplified.com. That website contains ~40 lessons from the book. For more complete previews, see my original Functional Programming, Simplified page.
If you want to see an example of a Play Framework 2.6 data entry form that sets help text (tips or tooltips) on text input fields (Play inputText fields), here’s an example of the required syntax:
I don’t remember exactly why I wrote this Scala shell script, but if I remember right I was having a problem getting sed to work properly, so I wrote this little script to insert an Amazon Kindle “break” tag before each <h1> tag in an HTML file:
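Here’s a minimal sketch of that idea using only the Scala standard library. It is not the original script: the `<mbp:pagebreak />` tag is my assumption about which Kindle “break” tag is meant, and the names `KindleBreaks` and `insertBreaks` are made up for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import scala.util.matching.Regex

object KindleBreaks {
  // Assumed break tag; swap in whatever tag your Kindle toolchain expects.
  val BreakTag = "<mbp:pagebreak />"

  // Matches an opening <h1> tag, with or without attributes, case-insensitively.
  private val H1Open: Regex = "(?i)<h1(\\s[^>]*)?>".r

  // Returns the HTML with BreakTag on its own line before every opening <h1> tag.
  def insertBreaks(html: String): String =
    H1Open.replaceAllIn(
      html,
      // quoteReplacement keeps any `$` or `\` in the matched tag from being
      // interpreted as a regex group reference.
      m => Regex.quoteReplacement(BreakTag + "\n" + m.matched)
    )

  def main(args: Array[String]): Unit = {
    // Expects the path of the input HTML file as the first argument
    // and writes the modified HTML to standard output.
    val html = new String(Files.readAllBytes(Paths.get(args(0))), StandardCharsets.UTF_8)
    print(insertBreaks(html))
  }
}
```

A regex is enough here because the script only has to find opening `<h1>` tags; for anything more involved, a real HTML parser would be a safer choice.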
If you ever need to convert HTML to plain text using Scala or Java, I hope these Jsoup examples are helpful:
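As a minimal sketch (not the original examples), the simplest approach I know of is to parse the HTML with `Jsoup.parse` and call `text()` on the resulting document. The `//> using dep` line assumes you run this with Scala CLI, the version number is only an example, and the names `HtmlToText` and `toPlainText` are made up for illustration.

```scala
//> using dep org.jsoup:jsoup:1.17.2

import org.jsoup.Jsoup

object HtmlToText {
  // Parses the HTML and returns the combined text of its elements,
  // with the tags removed and whitespace normalized.
  def toPlainText(html: String): String =
    Jsoup.parse(html).text()

  def main(args: Array[String]): Unit = {
    val html = "<html><body><h1>Title</h1><p>Hello, <b>world</b>!</p></body></html>"
    println(toPlainText(html))
  }
}
```

Because Scala calls the Java API directly, the same two method calls work unchanged from Java.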