NQP on JVM gets Grammars, Multiple Dispatch

Originally published on 2013-02-16 by Jonathan Worthington.

Having just reached an interesting milestone, I thought I’d blog a quick progress update on my work to port NQP to the JVM, in preparation for also doing a port of Rakudo.

The big news is that the grammar and regex engine is pretty much ported. This was a sizable and interesting task, and while a few loose ends need to be wrapped up, I think it’s fair to say that the hard bits are done. The port includes support for the basics (literals, quantifiers, character classes, positional and named captures, calls to other rules, and so forth) as well as the more advanced features (transitive Longest Token Matching for both alternations and protoregexes, and embedded code blocks and assertions). Still missing are anchors, conjunctions and a small number of built-in rules; none of these is particularly demanding, and none requires primitives that don’t already exist. It’s also worth pointing out that the NQP code to calculate the NFAs used in Longest Token Matching also runs just fine atop the JVM.
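For readers who have not met Longest Token Matching before, here is a rough illustration of the idea in Python. It is only a sketch: it tries each alternative with Python's `re` module and keeps the longest match, whereas NQP precomputes NFAs as described above. The names `longest_token_match` and `ALTERNATIVES` are made up for this example, and breaking ties by declaration order is a simplification of the real rules.

```python
import re

# Hypothetical token table for illustration; not taken from NQP.
ALTERNATIVES = [
    ("keyword_if", r"if"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("integer", r"\d+"),
]


def longest_token_match(alternatives, text, pos=0):
    """Return (name, matched_text) for the alternative that matches the
    longest prefix of text starting at pos, or None if none match."""
    best = None
    for name, pattern in alternatives:
        m = re.compile(pattern).match(text, pos)
        if m is None:
            continue
        # A strictly longer match wins; on a tie the earlier alternative
        # is kept (a simplification made for this sketch).
        if best is None or len(m.group(0)) > len(best[1]):
            best = (name, m.group(0))
    return best


# The identifier branch wins on "iffy" because it matches more text,
# even though the keyword branch is listed first.
print(longest_token_match(ALTERNATIVES, "iffy = 1"))  # ('identifier', 'iffy')
print(longest_token_match(ALTERNATIVES, "if x"))      # ('keyword_if', 'if')
```

By contrast, an ordinary ordered alternation such as Python's `if|[A-Za-z_]\w*` stops at the first branch that matches and would take only `if` from `iffy`.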

Another interesting feature that I ported a little while ago is multiple dispatch. This took some effort, since the original implementation had been done in C. While it’s sensible to have a close-to-the-VM dispatch cache, there’s little reason for the one-off candidate sorting work (not a hot path) to be done in C, so I ported the code for this to NQP. This meant that on the JVM side, I just needed to implement a few extra primitives, and could then run the exact same candidate sorting code.
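The split between one-off candidate sorting and a hot-path dispatch cache can be sketched roughly as follows. This is an illustration in Python under simplifying assumptions, not the NQP code: the class `MultiSub`, its methods and the `Animal`/`Dog` types are all invented for the example, candidates are compared on positional types only, and ambiguous calls are not reported.

```python
class MultiSub:
    """Toy multi-dispatch routine: sort candidates once, cache per call shape."""

    def __init__(self):
        self._candidates = []  # (parameter types, function), in declaration order
        self._sorted = None    # narrowest-first list, computed lazily
        self._cache = {}       # tuple of argument types -> chosen function

    def add(self, *types):
        def register(fn):
            self._candidates.append((types, fn))
            self._sorted = None   # a new candidate invalidates the sort
            self._cache.clear()   # ... and any cached decisions
            return fn
        return register

    @staticmethod
    def _narrower(a, b):
        # a is narrower than b if every parameter type is a subtype of
        # the corresponding one in b and the two signatures differ.
        return (len(a) == len(b) and a != b
                and all(issubclass(x, y) for x, y in zip(a, b)))

    def _sort(self):
        # Rank each candidate by how many others are narrower than it.
        # Anything narrower than a candidate always gets a lower rank,
        # so sorting by rank puts narrower candidates first.
        def rank(entry):
            return sum(self._narrower(other, entry[0])
                       for other, _ in self._candidates)
        return sorted(self._candidates, key=rank)

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        fn = self._cache.get(key)
        if fn is None:  # slow path: taken once per combination of types
            if self._sorted is None:
                self._sorted = self._sort()
            for types, candidate in self._sorted:
                if len(types) == len(args) and all(
                        isinstance(a, t) for a, t in zip(args, types)):
                    fn = candidate
                    break
            else:
                raise TypeError("no matching candidate")
            self._cache[key] = fn
        return fn(*args)


class Animal: pass
class Dog(Animal): pass

describe = MultiSub()

@describe.add(Animal)
def describe_animal(x):
    return "some animal"

@describe.add(Dog)
def describe_dog(x):
    return "a dog"

print(describe(Dog()))     # a dog
print(describe(Animal()))  # some animal
```

The sort runs once per set of candidates and the cache lookup is the only work on repeated calls, which is why only the cache really needs to live close to the VM.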

I think it’s worth noting again that I’m really doing two things in parallel here: hunting down places where NQP couples too tightly to Parrot and loosening the coupling, and also doing the porting to the JVM. The first half of this work is relevant to all future ports. In many cases, I’m also finding that the changes give us architectural improvements or just cleaner, more maintainable code. I wanted to point this out especially because I’m seeing various comments popping up suggesting that Rakudo (or even Raku) is on a one-way road to the JVM, forsaking all other platforms. That’s not the case. The JVM has both strengths (mature, a solid threading story, widely deployed, the only allowed deployment platform in some development shops, increasing attention to supporting non-Java languages through things like invokedynamic) and weaknesses (slow startup time, lack of native co-routine support, and the fact that it was originally aimed at static languages). Rakudo most certainly should run on the JVM – and it most certainly should run on other platforms too. And, as I wrote in my previous post, we’ve designed things so that we are able to do so. Perl has always been a language where There’s More Than One Way To Do It. Perl also has a history of running on a very wide range of platforms. Raku should continue down this track – but the new reality is that a bunch of the interesting platforms are virtual, not hardware/OS ones.

The JVM porting work is fast approaching a turning point. Up until now, it’s been about getting a cross-compiler and runtime support in place and working our way through the NQP test suite. This phase is largely over. The next phase is about getting NQP itself cross-compiled – that is, cross-compiling the compiler, so that we have an NQP compiler that runs on the JVM, supports eval, and is able to run independently.