In two small tests I ran where GraalVM was able to create a native executable, the native executable ran significantly faster than the equivalent Scala/Java code running with the Java 8 JVM, and also reduced RAM consumption by a whopping 98% in a long-running example. On the negative side, GraalVM currently doesn’t seem to work with Swing applications.
Background/Intro
Over the past five years I’ve been writing Scala shell scripts — rather than scripts using Unix tools or other scripting languages — as a means of pushing the boundaries of what can be done with “Scala in the small.” This recently culminated in creating my own Scala ‘Sed’ library.
While I’ve enjoyed doing that, one thing that always bothers me with this approach is the JVM startup lag. Where binary executables start running immediately, whenever I start a JVM script I always feel that slow startup lag time.
This past week I finally decided to take GraalVM (which I’ll call Graal in this article) out for a spin, and as a means of creating native executables from command-line Scala (or Java) classes and JAR files, it looks like a big win.
Test system information
I conducted the following tests on a 2013 MacBook Pro running macOS Mojave (10.14.5), with a 2.3 GHz Intel Core i7, 16 GB RAM, and an SSD. The Scala version is 2.12.8, the Java version is OpenJDK 1.8.0_222, and the GraalVM version is 19.1.1. The tests were run on July 20 and 21, 2019.
Creating native executables
After you get Graal and its native-image command installed, creating native executables with Graal is pretty easy. If you have a single Java class file named Find.class and it has a main method, you can create an executable named find with this command:
$ native-image Find
Note that the output filename is all lowercase. I assume the output is named find.exe on Windows systems, but I don’t know for sure.
To create an executable from a self-contained JAR file (meaning a JAR file you can run with java -jar), use this command:
$ native-image -jar Hyde.jar
If you have a JAR file that needs other resources, create a native image by supplying the necessary classpath:
// create a jar file from a scala class
$ scalac RenumberAllMdFiles.scala -d RenumberAllMdFiles.jar

// turn the jar file into a native executable
$ native-image -cp $SCALA_HOME/lib/scala-library.jar:RenumberAllMdFiles.jar RenumberAllMdFiles
Note that this last example also creates a lowercase executable file named renumberallmdfiles. (Insert sad emoji face here for tools that rename my stuff.)
Test 1: Modifying 55 files with my Sed library
For my first test, I ran a Scala JAR file I had without using Graal, i.e., using the scala command and therefore the JVM:
$ time scala -cp $SCALA_HOME/lib/scala-library.jar:RenumberAllMdFiles.jar RenumberAllMdFiles
Then I created a native image of that JAR file, renamed it back to a camelcase name, and ran it like this:
$ time ./RenumberAllMdFiles
Per the Unix time tool, the results were an order of magnitude difference in Graal’s favor:
java/JVM    Graal
--------    --------
0m0.727s    0m0.058s
I could tell that the Graal executable was faster, but it took me a moment to realize there was a leading 0 before that 58: 0m0.058s. That’s a run time about 92% shorter, or roughly 12.5 times faster. Very cool.
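The 92% figure is easy to check by hand. Here’s a minimal sketch of the arithmetic in Java, using the two time results from Test 1. The class and method names are my own, made up purely for illustration:

```java
public class TimingMath {
    // Percentage by which the run time shrank, relative to the baseline.
    static double percentReduction(double baselineSeconds, double newSeconds) {
        return (baselineSeconds - newSeconds) / baselineSeconds * 100.0;
    }

    // How many times faster the new run is than the baseline.
    static double speedupFactor(double baselineSeconds, double newSeconds) {
        return baselineSeconds / newSeconds;
    }

    public static void main(String[] args) {
        double jvm = 0.727;   // Test 1, scala command on the JVM
        double graal = 0.058; // Test 1, Graal native executable
        System.out.println("reduction (%): " + percentReduction(jvm, graal));
        System.out.println("speedup (x):   " + speedupFactor(jvm, graal));
    }
}
```

The reduction comes out at about 92% and the speedup at about 12.5, which is where “an order of magnitude” comes from.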
And yes, I verified that the results were the same.
Test 2: A longer-running application
Getting rid of that startup lag time felt like a huge win for my Scala shell-script life. Next, I wanted to see a bigger test.
I don’t have any scripts that take a long time to run, so I looked around and found a Java file-finding class on this Oracle page and decided to put it to the test with Graal. I copied their code into a .java file, compiled it to a .class file, then created a Graal executable from the class file with this command:
$ native-image Find
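If you want something similar to experiment with, here’s a minimal sketch of a file finder. To be clear, this is not the code from the Oracle page and I haven’t benchmarked it. The class name FindSketch, the findByName helper, and the simplified two-argument command line are all assumptions made for illustration. It walks a directory tree with the standard java.nio.file.Files.walk method and matches on the exact file name:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for a file-finding class; not the Oracle example.
public class FindSketch {

    // Walk the tree under `root` and collect regular files named `fileName`.
    static List<Path> findByName(Path root, String fileName) throws IOException {
        // Files.walk returns a lazy stream that must be closed.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals(fileName))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FindSketch <directory> <file-name>");
            System.exit(1);
        }
        for (Path p : findByName(Paths.get(args[0]), args[1])) {
            System.out.println(p);
        }
    }
}
```

One caveat with this simplified approach: Files.walk stops with an exception if it hits a directory it can’t read, so a more robust version would handle those errors per directory.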
After a quick bit of research to find something that would run for a while, I decided to search my ~/Projects directory (which contains over 500,000 files) for files named CatoGui.scala. This time I ran the .class file with the java command, and then ran the Graal native executable, and the time results were:
java/JVM    Graal
--------    -------
27.087s     16.433s
This was a real surprise. I expected Graal to be a little faster due to the reduced startup lag time, but it was a whopping 39% faster in a long-running script. How could this be?
Note: I re-ran these tests multiple times to verify them, including rebooting my laptop several times.
More details on Test 2
There are probably many ways to determine why Graal is so much faster for a use-case like this (and feel free to comment on those below), but I learned that a simple approach is to re-run the tests with the time -l command. Excluding the actual search results, here’s the output from that time command using the java/class approach:
$ /usr/bin/time -l java Find /Users/al/Projects -name "CatoGui.scala"
       27.85 real         2.31 user        14.34 sys
 109424640  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     31133  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         2  signals received
     63223  voluntary context switches
      6407  involuntary context switches
And here are the results from the Graal native image:
$ /usr/bin/time -l find /Users/al/Projects -name "CatoGui.scala"
       16.15 real         0.59 user         6.94 sys
   1986560  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
       496  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
     52545  voluntary context switches
      2293  involuntary context switches
I’m not a performance-tuning expert, but one thing immediately stands out:
java:   109,424,640  maximum resident set size
graal:    1,986,560  maximum resident set size
I’ve seen multiple definitions for what the units of “maximum resident set size” are, but whatever the heck those units are, the java command with the .class file requires more than 55 times the memory that the Graal executable requires. The “page reclaims” required by the java version are also higher, by a factor of almost 63.
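Both of those ratios come straight from the time -l numbers. Here’s a small sketch of the arithmetic; the class and method names are hypothetical:

```java
public class RatioCheck {
    // Ratio of a JVM measurement to the matching native-image measurement.
    static double ratio(long jvmValue, long nativeValue) {
        return (double) jvmValue / nativeValue;
    }

    public static void main(String[] args) {
        // Values copied from the two `time -l` runs.
        System.out.println("max RSS ratio:       " + ratio(109_424_640L, 1_986_560L));
        System.out.println("page reclaims ratio: " + ratio(31_133L, 496L));
    }
}
```

The first ratio is a little over 55 and the second a little under 63. A ratio of about 55 is also where the 98% memory reduction comes from, since 1 - 1/55 is roughly 0.98.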
About the memory results
Despite some things I’ve read on the internet, it appears that those “maximum resident set size” units are bytes, and the Java version requires 104MB, while the Graal version requires only 2MB(!).
Two things I can say for sure are that (a) the rss field of the ps command is documented and states that its results are in kilobytes, and it showed ~105572 when I ran it with the java/class command running, which is 103MB; also, at about the same time, (b) the RES field of the htop command shows 104M for the java/class command.
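To see that the tools agree, it helps to convert everything to the same unit. This sketch assumes that time -l reports bytes on macOS (as argued above), that the ps kilobytes are 1,024 bytes each, and that the “M” in htop means mebibytes. The class and method names are made up for this example:

```java
public class MemUnits {
    static final double BYTES_PER_MIB = 1024.0 * 1024.0;

    // Convert a byte count (assumed unit of `time -l` on macOS) to MiB.
    static double bytesToMiB(long bytes) {
        return bytes / BYTES_PER_MIB;
    }

    // Convert a kilobyte count (the documented unit of `ps` rss) to MiB.
    static double kibToMiB(long kib) {
        return kib / 1024.0;
    }

    public static void main(String[] args) {
        System.out.println("java,  time -l: " + bytesToMiB(109_424_640L) + " MiB");
        System.out.println("graal, time -l: " + bytesToMiB(1_986_560L) + " MiB");
        System.out.println("java,  ps rss:  " + kibToMiB(105_572L) + " MiB");
    }
}
```

Under those assumptions the java process comes out at about 104.4 MiB from time -l and about 103.1 MiB from ps, which is consistent with what htop shows, and the Graal executable comes out at about 1.9 MiB.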
As a bit of proof, here’s an image of the htop screen for the java/class command:
And here’s an htop image taken when the Graal executable was running, showing remarkably little memory use:
If you want to repeat these tests on your system, Mac users can install htop with Homebrew, and this is the ps command I ran:
$ ps -awxm -o %mem,rss,comm | sort -nr | grep java
You can also put that code in a loop if you want:
while true
do
    ps -awxm -o %mem,rss,comm | sort -nr | grep java
    sleep 2
done
For the java/class/jvm command, that shows output like this:
0.4  68204 /usr/bin/java
0.5  91156 /usr/bin/java
0.6 105780 /usr/bin/java
0.6 106140 /usr/bin/java
0.6 106220 /usr/bin/java
. . .
As a final note, you can also use -Xmx with your java command to put a cap on the maximum memory used:
$ time java -Xmx512M Find /Users/al/Projects -name "CatoGui.scala"
On my system this had no effect on the run time.
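If you want to confirm that an -Xmx cap actually took effect, the JVM can report its own limit. Here’s a minimal sketch using the standard Runtime.maxMemory() call. The class name HeapCap is just an example, and note that this reports the heap limit, not the resident set size that ps and htop show:

```java
public class HeapCap {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as java -Xmx512M HeapCap should print a value at or somewhat below 512. The exact number depends on the JVM and the garbage collector in use.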
Please note that my thought process in using this Find.java code was, “What code can I find that I can run from the command-line that will take a while to run,” and not something like, “What code can I find where Graal can reduce its memory use by 98%.”
GraalVM native-image notes
One thing to know about Graal is that when you run the native-image command it takes a while to compile your class or JAR file to a native executable. Here’s the output from running native-image on Find.class:
$ native-image Find
Build on Server(pid: 53748, port: 55574)*
[find:53748]    classlist:   1,491.50 ms
[find:53748]        (cap):   2,155.80 ms
[find:53748]        setup:   3,447.06 ms
[find:53748]   (typeflow):   2,380.97 ms
[find:53748]    (objects):   1,613.12 ms
[find:53748]   (features):     291.77 ms
[find:53748]     analysis:   4,370.96 ms
[find:53748]     (clinit):      97.72 ms
[find:53748]     universe:     796.64 ms
[find:53748]      (parse):     422.46 ms
[find:53748]     (inline):     943.66 ms
[find:53748]    (compile):   4,635.51 ms
[find:53748]      compile:   6,337.11 ms
[find:53748]        image:     552.88 ms
[find:53748]        write:     195.97 ms
[find:53748]      [total]:  17,379.96 ms
As shown, it takes a little over 17 seconds to create a native image on my system.
A second thing to know is that this command starts a background server by default, and keeps it running after the command is finished. That server consumes over 1GB RAM, so you’ll want to stop/kill it. When it’s running, this command:
ps auxw | grep graalvm
shows a result like this:
al 53748 0.0 7.8 13148200 1307096 ?? S 1:43PM 1:28.48 /Users/al/bin/graalvm-ce-19.1.1/ ... much more here ...
After a while I learned that you can run native-image without the server, like this:
$ native-image --no-server Find
For much more information, here’s a link to the GraalVM native-image command.
No joy for Swing apps :(
In sad news, there is no native-image joy for Swing applications. I’ve created several Swing apps that I use, and anything to make them start faster and use less RAM sounded awesome, but sadly they won’t compile with native-image:
$ native-image JFrameExample
Build on Server(pid: 18953, port: 55329)
. . .
Warning: Aborting stand-alone image build. Unsupported features in 2 methods
Detailed message:
Error: Detected a started Thread in the image heap. Threads running in the image generator are no longer running at image run time. The object was probably created by a class initializer and is reachable from a static field...
Trace: object sun.awt.AWTAutoShutdown
    method sun.awt.AWTAutoShutdown.getInstance()
Call path from entry point to sun.awt.AWTAutoShutdown.getInstance():
    at sun.awt.AWTAutoShutdown.getInstance(AWTAutoShutdown.java:133)
    at java.awt.EventQueue.detachDispatchThread(EventQueue.java:1137)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:88)
    at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:460)
    at ...
Trace: object sun.java2d.opengl.OGLRenderQueue
    field sun.java2d.opengl.OGLRenderQueue.theInstance
Somebody else already filed a bug report on this, so (fingers crossed) maybe it will be fixable in the future.
You can also read more about GraalVM’s current limitations on this GitHub page.
More information
I hope all of that was helpful. It sure seems promising for my Scala shell-script world.
For more information on GraalVM, see the GraalVM website.
Also, I haven’t watched it yet, but I know that this ScalaDays 2018 video talks about Twitter using Graal with their microservices.