Typing Toccata

The mission of Toccata is "To be the most productive programming language possible." (Everyone needs goals. :) ) Another way of saying this is that Toccata should help typical programmers write typical software as quickly and correctly as possible. Obviously, (to me, at least) the Functional Programming paradigm is the first step. But then what? For many, static type checking would be the next requirement. Yet languages like Scala and Haskell seem to be pretty strong in that area already. And while they're popular, there are large numbers of programmers who prefer dynamically typed languages for the flexibility and ease they offer in getting software written. And they might have trouble switching between programming at the "type level" and programming at the "value level". I certainly do. Toccata is intended for those people. This post starts to describe Toccata's "killer feature" enabling this.

Gradually

Let's start with a function from the core library.

116: (defn add-numbers [x y]
117:   (assert (instance? Integer x))
118:   (assert (instance? Integer y))
119:   (inline C Integer "..."))

(I don't usually include line numbers in my code snippets, but they'll be needed later.)

Let's start at the bottom. The inline C expression has an optional type annotation field which in this case is said to be Integer. This means that the C code (which I've ellided) will always produce an Integer value. Therefore, add-numbers will always return an Integer.

But what if we tried to pass a string to add-numbers? That's what the two (seemingly) runtime assertions deal with. They assert that both x and y have to be Integers, which is checked at runtime. Or is it? This establishes a contract for add-numbers that it is only valid to call it with two integers and that it will always return an integer. It's like there's a wall or a fence between the C code which implements this functionality which is safe, and the rest of the program, which is not guaranteed to be safe.

And here's the kicker; the compiler can tuck that contract for add-numbers away for use later.

Let's look at another core library function.

709: (defn +
710:   ([] 0)
711:   ([x y]
712:    (add-numbers x y))
713:   ([& xs]
714:    (reduce xs 0 add-numbers)))

Just focus on the 2-arity for the moment. It calls add-numbers with the two values that it is called with. I refer to this call to add-numbers as a 'call site' and since there's only one target for this call site, the compiler can check the contract it remembered when it compiled add-numbers and move the fence guaranteeing the safety of adding two number up the call tree one level to here. It can also know that the 2-arity of + will always return an integer. So the runtime assertions that were ostensibly compiled into add-numbers can instead be moved to this (and every other) call site which targets add-numbers. And this in turn supplies a contract for the 2-arity of +. That it only accepts integers and always returns an integer.

As I understand it, this expanding the boundaries of code that is known to be safe is referred to as 'Gradual Typing'. Feel free to correct me in the comments.

But wait ...

So let's take it one more step. Check out this function

915: (defn inc [x]
916:   ;; Add 1 to an integer
917:   (+ 1 x))

inc calls the 2-arity of + with a literal 1 and whatever value x is. Since the compiler can know exactly what function is targeted at this call site, it can move the 'safe code' fence out to here. And since 1 is a literal integer, it can determine at compile time that it doesn't need to be checked. And it also knows precisely what the contract for inc is. It accepts one integer and returns an integer.

In practice

Now what happens when we run this program?

1: (main [_]
2:      (println (inc "bogus string")))

That's a trick question. This code doesn't even compile. Instead, we get this error.

*** Failed type assertion at tricky.toc: 2.
Expected 'Integer', got 'String'.
Assertion from
 $TOCCATA_DIR/core.toc: 917
 $TOCCATA_DIR/core.toc: 712
 $TOCCATA_DIR/core.toc: 118

By moving the safety fence up the call tree, the compiler can tell us exactly where the bug is at compile time rather than make us sift through a stack trace at runtime. In essence, the compiler gives a "reverse stack trace" that tells us where the bug is and the path to where the failure would have occurred. Also, one of the runtime assertions can be eliminated. And all the runtime checks between the call in the main function and add-numbers can be eliminated as well. All this happens with no additional effort from the programmer who wrote 'tricky.toc'. It happens automatically.

But it gets so much better than that...

Next time.