Wrestling with dispatch

The Raku Programming Language Collect, Conserve and Remaster Project

Wrestling with dispatch

Originally published on 2010-10-17 by Jonathan Worthington.

It turns out that I’m having to take a slight diversion from meta-model work to spend a while worrying over dispatch – because not dealing with the design issues now will only make for more refactoring and pain later. Further to that, it’s going to have an impact on how bits of the meta-model API should look.

Dispatch is essentially about flow control between chunks of code at the inter-block level. We often talk about dispatch as being what happens when:

However, I’m also wondering if we should consider the case of throwing an exception in with dispatch. It is also a dispatch in a sense – a dispatch to an exception handler. Running the handler may or may not indicate that some kind of stack unwinding is required. But actually getting to the handler is a kind of dispatch.

It may be useful to break dispatches up into groups.

For all complex dispatches, we have the concept of a candidate list. A candidate list is a list of things that we can work out way through calling. In a method dispatch, it’s all the methods through the MRO with a matching name. In a multi-dispatch, it’s all the matching candidates. In the case of wrapping, it’s the various wrappers, down towards the original routine that we wrapped. When we have these candidate lists, we have various operations that we can perform on them.

In some cases, we may end up with three different dispatches active at the same time for what is, from the caller’s perspective, just a method call. This shows up if you call a method (thus we’re in the method dispatcher) that happens to be a multi method (so we’re going to need to use the multi dispatcher) but somebody wrapped the proto (so we’re working our way through the wrapping candidate list too).

So, here’s some of the things I’m pondering.

First up, let’s look at how we obtain and iterate over candidate lists. For all complex dispatchers, in the general case, we have no idea up front whether we’re going to need the whole set of possible candidates or just the first one. (We may one day be able to build sufficient static analysis to know in some cases.) This means that we either need to:

Second, dispatchers are first class from a referential point of view even if not in their implementation (actually, the issues are almost certainly orthogonal). That is, if I write &foo and we have some multi foo in scope then I refer to the whole set of candidates available from the point of view of the current scope. That is, in:

class Drink { }
class Beer is Drink { }

proto foo($x) { {*} }
multi foo(Drink $x) { 42 }

my $c;
{
    multi foo(Beer $x) { 69 }
    $c = &foo;
}

say $c(Beer.new);

We’d expect the output to be 69, not 42.

Similar for $obj.can(‘meth-name’) – it returns something that we should be able to invoke and that will provide access to the whole candidate list. In these two cases, what are we actually getting a reference to? It can’t simply be a code object because we expect deferral to work. Instead, it probably needs to be a “dispatcher instance”, instantiated with (or curried with) the candidate list.

This is not really stated in the Raku specification anywhere though, so far as I can see. It perhaps needs to be, however, because we probably have to make the differentiation in ways that are user visible. For example, I expect that taking a multi and doing .candidates, or getting a method through .^methods, will almost certainly not give you something that will ever work in the case of deferral, since it gets you a code object, not something with a candidate list.

This code object vs dispatcher distinction comes and bites fairly hard in the case of multiple dispatch. A while ago, the specification for multiple dispatch changed quite drastically. I remain disappointed that I wasn’t consulted on the changes by the Raku design team, given I’ve led the way in implementation work on multiple dispatch in Raku in the last couple of years. The essence of the change is that whenever you do a multiple dispatch, you first call a proto, which in turn delegates to the multi-dispatcher.

This was done to help answer questions like “what does it mean to wrap a multi”, which was just not possible before. Now it simply means “you wrap the proto”. Answers to “what is the arity of a multi” and “what is the signature of a multi” now boil down to the same answer – whatever the proto declares. Further, this means that you have the possibility to do something pre-dispatch and post-dispatch. These are the good consequences, and I don’t disagree with they are improvements. However, there are some less good ones too.

One immediate one – though one I know how to deal with – is that now every single multi-dispatch is, in the unoptimized case, two Raku level invocations. Remember this hits every single operator invocation. Yes, ouch. This means that if we want to implement this scheme without a bad performance hit, we have to implement some basic inlining support.

An issue I don’t want to consider right now – but will probably happen – is a stream of (very justified) complaints from Raku users that “I have to write this proto thing that I never had to before – why the boilerplate?”

Here’s the one that is making my life much trickier, though. Consider:

class Drink { }
class Beer is Drink { }

proto foo($x) { {*} }
multi foo(Drink $x) { 42 }

my $c;
{
    multi foo(Beer $x) { 69 }

    $c = &foo;

    &foo.wrap(sub ($x) { "lol " ~ callsame; });
}

say $c(Beer.new);
say foo(Beer.new);

I think most folks would agree the output of this should be “lol 69”, followed by “lol 42”. The question is, how do we actually make this work? Consider what’s in $c. It references some &foo from the inner scope. But here’s the problem: what is this &foo?

That is, it needs to be the proto apart from when we do a dispatch. But that’s not where the problems end. Suppose we did manage to delegate all method calls off to the proto apart from .candidates, which obtains the current view of the candidate list. How does the {*} inside the (now wrapped) proto, which actually enters the multi-dispatcher, get hold of this candidate list?

We have a kind of inside-out thing going on here. On the one hand, we’re invoking the proto to do the multiple dispatch, but the proto is actually just calling a wrapper, which wants to have its own candidate list. That is, if you imagine a stack of dispatchers, we have the candidate list for the multi-dispatcher in our hands when we invoke the proto, but it instead wants to push a wrapper dispatcher on the stack to go through the wrapper candidate list. It’s not until we hit the {*} – if we ever do – that we need to take the candidate list from the scope we captured a reference in (or invoked from, in the simple case) and give it to the multi-dispatcher.

And coming up with a robust – let alone performant – implementation of this is what’s currently tying me in knots. I didn’t even talk about closure semantics yet. Consider:

class Drink { }
class Beer is Drink { }

proto foo($x) { * }

my $c;
multi foo(Beer $x) { $c = { nextsame } }
multi foo(Drink $x) { say "badger" }

foo(Beer.new);
$c for ^4;

For this to work, the current state of the candidate list iteration also needs to be lexically scoped too. We currently make this work in the methods case in Rakudo by making the dispatch state immutable and storing it in a lexical. Anyway, it’s yet another thing to consider into the design.

I guess my overall gripe here is that we’re going from a model that) is easy enough to implement and understand (despite having some real issues for “power users”), to something that seems to have a less obvious clean implementation, makes people write more boilterplate and relies on inlining optimizations to be decently performant. And while the specification offers the implementor a wish list of things that should work, there’s not so much in the way of guidance on how one might actually structure things.