In two small tests where GraalVM was able to create a native executable, the native executable ran significantly faster than the equivalent Scala/Java code running with the Java 8 JVM, and also reduced RAM consumption by a whopping 98% in a long-running example. On the negative side, GraalVM currently doesn’t seem to work with Swing applications.
Over the past five years I’ve been writing Scala shell scripts — rather than scripts using Unix tools or other scripting languages — as a means of pushing the boundaries of what can be done with “Scala in the small.” This recently culminated in creating my own Scala ‘Sed’ library.
While I’ve enjoyed doing that, one thing that always bothers me with this approach is the JVM startup lag. Where binary executables start running immediately, whenever I start a JVM script I always feel that slow startup lag time.
This past week I finally decided to take GraalVM (which I’ll call Graal in this article) out for a spin, and as a means of creating native executables from command-line Scala (or Java) classes and JAR files, it looks like a big win.
Test system information
I conducted the following tests on a 2013 MacBook Pro running macOS Mojave (10.14.5), with a 2.3 GHz Intel Core i7, 16 GB RAM, and an SSD. The Scala version is 2.12.8, the Java version is OpenJDK 1.8.0_222, and the GraalVM version is 19.1.1. The tests were run on July 20 & 21, 2019.
Creating native executables
After you get Graal and its native-image command installed, creating native executables with Graal is pretty easy. If you have a single Java class file named Find.class and it has a main method, you can create an executable named find with this command:
$ native-image Find
Note that the output filename is all lowercase. I assume the output is named find.exe on Windows systems, but I don’t know for sure.
To create an executable from a self-contained JAR file (meaning a JAR file you can run with java -jar), use this command:
$ native-image -jar Hyde.jar
If you have a JAR file that needs other resources, create a native image by supplying the necessary classpath:
# create a jar file from a scala class
$ scalac RenumberAllMdFiles.scala -d RenumberAllMdFiles.jar

# turn the jar file into a native executable
$ native-image -cp $SCALA_HOME/lib/scala-library.jar:RenumberAllMdFiles.jar RenumberAllMdFiles
Note that this last example also creates a lowercase executable file named renumberallmdfiles. (Insert sad emoji face here for tools that rename my stuff.)
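Renaming the executable back is a one-line mv, but it can be wrapped in a small function. This is only a minimal sketch: restore_name is a hypothetical helper (not part of GraalVM), and it assumes the executable is in the current directory and is named after the lowercased class name.

```shell
# restore_name: hypothetical helper, not part of GraalVM.
# Renames the lowercased executable back to the class name,
# e.g. renumberallmdfiles -> RenumberAllMdFiles.
restore_name() {
    class_name=$1
    lower_name=$(printf '%s' "$class_name" | tr '[:upper:]' '[:lower:]')
    if [ "$lower_name" = "$class_name" ]; then
        return 0    # the name is already lowercase, nothing to do
    fi
    if [ ! -f "$lower_name" ]; then
        echo "no executable named $lower_name here" >&2
        return 1
    fi
    mv "$lower_name" "$class_name"
}

# Assumed usage, right after the native-image command above:
#   restore_name RenumberAllMdFiles
```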
Test 1: Modifying 55 files with my Sed library
For my first test, I ran an existing Scala JAR file without Graal, i.e., using the scala command and therefore the JVM:
$ time scala -cp $SCALA_HOME/lib/scala-library.jar:RenumberAllMdFiles.jar RenumberAllMdFiles
Then I created a native image of that JAR file, renamed it back to a camelcase name, and ran it like this:
$ time ./RenumberAllMdFiles
Per the Unix time tool, the results showed an order-of-magnitude difference in Graal’s favor:
java/JVM    Graal
--------    --------
0m0.727s    0m0.058s
I could tell that the Graal executable was faster, but it took me a moment to realize there was a leading 0 before that 58 in the Graal time. That works out to a 92% faster run time. Very cool.
And yes, I verified that the results were the same.
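The 92% figure and the "order of magnitude" claim both follow from the two time readings. As a quick check, here is a minimal sketch using awk for the floating-point math; time_savings is a hypothetical helper name, and the inputs are the wall-clock seconds reported above.

```shell
# time_savings: hypothetical helper that compares two wall-clock times
# (in seconds) and prints the percentage reduction and the speedup factor.
# Usage: time_savings JVM_SECONDS NATIVE_SECONDS
time_savings() {
    awk -v jvm="$1" -v native="$2" 'BEGIN {
        printf "%.1f%% less time, %.1fx speedup\n", (jvm - native) / jvm * 100, jvm / native
    }'
}

time_savings 0.727 0.058    # prints "92.0% less time, 12.5x speedup"
```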
Test 2: A longer-running application
Getting rid of that startup lag time felt like a huge win for my Scala shell-script life. Next, I wanted to see a bigger test.
I don’t have any scripts that take a long time to run, so I looked around and found a Java file-finding class on this Oracle page and decided to put it to the test with Graal. I copied their code into a .java file, compiled it to a .class file, then created a Graal executable from the class file with this command:
$ native-image Find
After a quick bit of research to find something that would run for a while, I decided to search my ~/Projects directory (which contains over 500,000 files) for files named CatoGui.scala. This time I ran the .class file with the java command, and then ran the Graal native executable, and the time results were:
java/JVM    Graal
--------    -------
27.087s     16.433s
If you prefer images: [Image of the timing results above]
This was a real surprise. I expected Graal to be a little faster due to the reduced startup lag time, but it was a whopping 39% faster in a long-running script. How could this be?
Note: I re-ran these tests multiple times to verify them, including rebooting my laptop several times.
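Re-running a test several times is easy to script. The following is a minimal sketch, not the method used for the numbers above: repeat_timed is a hypothetical helper, and it only measures whole seconds with date, which is coarse but adequate for runs that take 15 to 30 seconds. Keep in mind that back-to-back runs over the same directory tree may benefit from the operating system's file cache, so the first run is often the slowest.

```shell
# repeat_timed: hypothetical helper that runs a command N times and prints
# the elapsed wall-clock time of each run in whole seconds.
# Usage: repeat_timed N command [args...]
repeat_timed() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" > /dev/null 2>&1      # discard the command's own output
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        i=$((i + 1))
    done
}

# Assumed usage with the commands from this test:
#   repeat_timed 5 java Find /Users/al/Projects -name "CatoGui.scala"
```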
More details on Test 2
There are probably many ways to determine why Graal is so much faster for a use case like this (feel free to comment on those below), but I learned that a simple approach is to re-run the tests with the time -l command. Excluding the actual search results, here’s the output from that time command using the java/class approach:
$ /usr/bin/time -l java Find /Users/al/Projects -name "CatoGui.scala"
       27.85 real         2.31 user        14.34 sys
 109424640  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     31133  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         2  signals received
     63223  voluntary context switches
      6407  involuntary context switches
And here are the results from the Graal native image:
$ /usr/bin/time -l find /Users/al/Projects -name "CatoGui.scala"
       16.15 real         0.59 user         6.94 sys
   1986560  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
       496  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
     52545  voluntary context switches
      2293  involuntary context switches
I’m not a performance-tuning expert, but one thing immediately stands out:
java:  109,424,640 maximum resident set size
graal:   1,986,560 maximum resident set size
I’ve seen multiple definitions for what the units of “maximum resident set size” are, but whatever the heck those units are, the java command with the .class file requires more than 55 times the memory that the Graal executable requires. The “page reclaims” required by the java version are also higher by a factor of almost 63.
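The two ratios quoted above can be reproduced directly from saved time -l output. This is a minimal sketch under a few assumptions: rss_from_time_output and ratio are hypothetical helper names, and the time output (which /usr/bin/time writes to stderr) is assumed to have been redirected to files.

```shell
# rss_from_time_output: hypothetical helper that reads saved
# `/usr/bin/time -l` output on stdin and prints the
# "maximum resident set size" value.
rss_from_time_output() {
    awk '/maximum resident set size/ { print $1 }'
}

# ratio: hypothetical helper that prints how many times larger
# the first number is than the second.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Assumed usage, with each run's time output saved to a file:
#   /usr/bin/time -l java Find /Users/al/Projects -name "CatoGui.scala" 2> java-time.txt
#   ratio "$(rss_from_time_output < java-time.txt)" "$(rss_from_time_output < graal-time.txt)"

ratio 109424640 1986560    # memory: prints 55.1
ratio 31133 496            # page reclaims: prints 62.8
```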
About the memory results
Despite some things I’ve read on the internet, it appears that those “maximum resident set size” units are bytes, and the Java version requires 104MB, while the Graal version requires only 2MB(!).
Two things I can say for sure are that (a) the rss field of the ps command is documented as reporting its results in kilobytes, and it showed ~105572 when I ran it with the java/class command running, which is 103MB; also, at about the same time, (b) the RES field of the htop command showed 104M for the java/class command.
As a bit of proof, here’s an image of the htop screen for the java/class command:
[Image: htop screen for the java/class command]
And here’s an htop image taken when the Graal executable was running, showing remarkably little memory use:
[Image: htop screen for the Graal executable]
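The byte and kilobyte readings above can be cross-checked with a little arithmetic. Here is a minimal sketch, assuming 1 MB means 1024 * 1024 bytes; to_mb is a hypothetical helper name.

```shell
# to_mb: hypothetical helper that converts a value to megabytes,
# treating 1 MB as 1024 * 1024 bytes.
# Usage: to_mb VALUE UNIT   (UNIT is "bytes" or "kb")
to_mb() {
    case $2 in
        bytes) divisor=1048576 ;;
        kb)    divisor=1024 ;;
        *)     echo "unknown unit: $2" >&2; return 1 ;;
    esac
    awk -v value="$1" -v divisor="$divisor" 'BEGIN { printf "%.1f\n", value / divisor }'
}

to_mb 109424640 bytes   # time -l, java/class:  prints 104.4
to_mb 105572 kb         # ps rss, java/class:   prints 103.1
to_mb 1986560 bytes     # time -l, Graal:       prints 1.9
```

If the time -l value were in kilobytes instead, the java/class run would come out at roughly 104GB, which is far more than the 16 GB in the test machine, so bytes is the only reading consistent with the ps and htop numbers.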
If you want to repeat these tests on your system, Mac users can install htop with Homebrew, and this is the ps command I ran:
$ ps -awxm -o %mem,rss,comm | sort -nr | grep java
You can also put that code in a loop if you want:
while true
do
    ps -awxm -o %mem,rss,comm | sort -nr | grep java
    sleep 2
done
For the java/class/jvm command, that shows output like this:
0.4  68204 /usr/bin/java
0.5  91156 /usr/bin/java
0.6 105780 /usr/bin/java
0.6 106140 /usr/bin/java
0.6 106220 /usr/bin/java
. . .
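If the loop output is saved to a file, the peak value can be pulled out afterwards. This is a minimal sketch: peak_rss_kb is a hypothetical helper, and it assumes the three-column format shown above (%mem, rss, comm).

```shell
# peak_rss_kb: hypothetical helper that reads lines of "%mem rss comm"
# on stdin and prints the largest rss value (in kilobytes).
# Lines without a numeric second column count as 0.
peak_rss_kb() {
    awk '$2 + 0 > max { max = $2 + 0 } END { print max + 0 }'
}

# Assumed usage, with the loop output redirected to a file first:
#   peak_rss_kb < ps-samples.txt
```

With the sample output above as input, this prints 106220.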
As a final note, you can also use -Xmx with your java command to put a cap on the maximum memory used:
$ time java -Xmx512M Find /Users/al/Projects -name "CatoGui.scala"
On my system this had no effect on the run time.
Please note that my thought process in using this Find.java code was, “What code can I find that I can run from the command-line that will take a while to run,” and not something like, “What code can I find where Graal can reduce its memory use by 98%.”
One thing to know about Graal is that when you run the native-image command it takes a while to compile your class or JAR file to a native executable. Here’s the output from running native-image on Find.class:
$ native-image Find
Build on Server(pid: 53748, port: 55574)*
[find:53748]    classlist:   1,491.50 ms
[find:53748]        (cap):   2,155.80 ms
[find:53748]        setup:   3,447.06 ms
[find:53748]   (typeflow):   2,380.97 ms
[find:53748]    (objects):   1,613.12 ms
[find:53748]   (features):     291.77 ms
[find:53748]     analysis:   4,370.96 ms
[find:53748]     (clinit):      97.72 ms
[find:53748]     universe:     796.64 ms
[find:53748]      (parse):     422.46 ms
[find:53748]     (inline):     943.66 ms
[find:53748]    (compile):   4,635.51 ms
[find:53748]      compile:   6,337.11 ms
[find:53748]        image:     552.88 ms
[find:53748]        write:     195.97 ms
[find:53748]      [total]:  17,379.96 ms
As shown, it takes a little over 17 seconds to create a native image on my system.
A second thing to know is that this command starts a background server by default, and keeps it running after the command is finished. That server consumes over 1GB RAM, so you’ll want to stop/kill it. When it’s running, this command:
ps auxw | grep graalvm
shows a result like this:
al 53748 0.0 7.8 13148200 1307096 ?? S 1:43PM 1:28.48 /Users/al/bin/graalvm-ce-19.1.1/ ... much more here ...
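To see at a glance which process to stop and how much memory it holds, the ps line can be reduced to its PID and resident size. This is a minimal sketch: server_memory is a hypothetical helper, and it assumes the usual ps auxw column order (PID in column 2, RSS in kilobytes in column 6).

```shell
# server_memory: hypothetical helper that reads `ps auxw` lines on stdin
# and prints each PID with its resident memory converted from KB to GB.
server_memory() {
    awk '{ printf "pid %s uses %.2f GB\n", $2, $6 / 1048576 }'
}

# Assumed usage (the second grep drops the grep process itself):
#   ps auxw | grep graalvm | grep -v grep | server_memory
# The reported PID can then be passed to kill.
```

For the line shown above, this prints "pid 53748 uses 1.25 GB".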
After a while I learned that you can run native-image without the server, like this:
$ native-image --no-server Find
For much more information, see the documentation for the GraalVM native-image command.
No joy for Swing apps :(
In sad news, there is no native-image joy for Swing applications. I’ve created several Swing apps that I use, and anything to make them start faster and use less RAM sounded awesome, but sadly they won’t compile with native-image:
$ native-image JFrameExample
Build on Server(pid: 18953, port: 55329)
. . .
Warning: Aborting stand-alone image build. Unsupported features in 2 methods
Detailed message:
Error: Detected a started Thread in the image heap. Threads running in the image generator are no longer running at image run time. The object was probably created by a class initializer and is reachable from a static field...
Trace: object sun.awt.AWTAutoShutdown
    method sun.awt.AWTAutoShutdown.getInstance()
Call path from entry point to sun.awt.AWTAutoShutdown.getInstance():
    at sun.awt.AWTAutoShutdown.getInstance(AWTAutoShutdown.java:133)
    at java.awt.EventQueue.detachDispatchThread(EventQueue.java:1137)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:88)
    at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:460)
    at ...
Trace: object sun.java2d.opengl.OGLRenderQueue
    field sun.java2d.opengl.OGLRenderQueue.theInstance
Somebody else already filed a bug report on this, so (fingers crossed) maybe it will be fixable in the future.
You can also read more about GraalVM’s current limitations on this GitHub page.
I hope all of that was helpful. It sure seems promising for my Scala shell-script world.
For more information on GraalVM, see the GraalVM website.
Also, I haven’t watched it yet, but I know that this ScalaDays 2018 video talks about Twitter using Graal with their microservices.