Dec 24, 2017 - And Sums

Comments

Merry Christmas

The Full Algebra

A couple of posts ago I included a link to an explanation of Algebraic Types to explain the 'Product' in product types. If you read that (and I do encourage you to at least skim it), you should be wondering if Toccata has Sum Types as well.

It does.

(def Sequence (comp List
                    Vector))

A sum type like Sequence here, can only be used in assert and assert-result expressions. If the compiler can not prove a value is one of the types included in the sum, it will insert code at the appropriate place to check that at run time. But there's no possibility to 'pre-dispatch' a protocol function at compile time on the basis of a sum type. So they're mostly for bug catching.

The combination of Product and Sum types gives Toccata Algebraic Data Types. Or at least most of the functionality of ADT's. The goal Toccata's type system is not go all the way in the direction of Haskell or Scala. But to give as many of the benefits of a sophisticated type system to programmers who prefer dynamic types while imposing as little cost as possible. And there's quite a lot more benefit to be derived from the infrastructure that's been implemented so far. I hope add those features soon.

And since it's Christmas, we'll leave it at that.

Dec 22, 2017 - Following Protocol

Comments

They Live

We continue our exploration of Toccata types by looking at the link between deftype and defprotocol.

It's not enough to just put data together in custom types. We also need to write functions that do things to or with that data. Now, it would be possible to just write a 'plain' function that would take it's arguments and explicitly test the types of the values given to determine what to do. But there's a better way. It goes by multiple names; Parametric Polymorphism, Polymorphic Dispatch, or what have you. The idea is, at a call site, the compiler inserts code that checks the types of the argument(s) and chooses which function to execute based on the types given. This is what Toccata does. So all that remains is to tell the compiler how to distinguish between calls to 'plain' functions and calls to functions that need to be dispatched by type. That is where defprotocol comes in. It lets the programmer provide a list of functions that are to be dispatched at runtime. I wrote about Toccata's core protocols here. Toccata only chooses which function to execute based on the type of the first argument. The types of the other arguments don't factor into that decision.

So let's take one of the core protocols and look at it a little closer. Hmmmm, I pick this one.

;; For collections whose contents can be indexed by integers
(defprotocol Indexed
  (nth [coll n]
    ;; Retrieve the `n`th value from `coll`, wrapped in a Maybe, if there are
    ;; enough values in `coll`. Otherwise, returns `nothing`.
    (assert (instance? Integer n))
    (assert-result a (instance? Maybe a)))

  (store [coll n v]
    ;; Create a new copy of `coll` (wrapped in a Maybe)  with `v` at index `n`
    ;; if `coll` is at least size of `n` - 1. Otherwise, return `nothing`.
    (assert (instance? Integer n))
    (assert-result a (instance? Maybe a))))

Looking at the nth function, we see two type assertions. The first one is like we've seen before and asserts that the parameter n is always an integer. You should never put a type assertion on the first parameter to a protocol function, because it is the dispatch type.

The next thing we see is the assert-result expression. This expression includes a symbol, in this case a. This is only used inside the assertion expression and stands in for the value returned by the function.

These two assertions give us the contract for all the nth functions. They take a collection of unknown type and an integer parameter n. And they all return a Maybe value, which may be nothing or may contain the nth value of the collection. Now, if anyone tries to implement an nth function for a collection that doesn't return a Maybe value, they're going to get an error at compile time if the comipiler can determine that to be the case. Otherwise, the compiler will insert some code in that particular implementation to check the return value. Regardless, the compiler will know that any call site that calls nth will evaluate to a Maybe value. Pretty neat, huh?

Now, let's look at a type that wraps Vector for some reason.

(deftype NewVec [v]
  (assert (instance? Vector v))

  Indexed
  (nth [_ x]
    (nth v x)))

This is a nonsense type, but does show how to implement a protocol function. And since v is guaranteed to always be a Vector, the compiler can 'pre-dispatch' the call to nth in the implementation. But how is nth defined for Vector? That is here.

(extend-type Vector

  ;; ...

  Indexed
  (nth [v n]
    (get v n))

  ;; ...
  )

Vector is defined in the sub-basement of Toccata, so there's no way to write a deftype expression for it. Also, if you read through core.toc, you'll see the functionality for some of the core types has to be broken into several pieces.

More importantly, you can write an extend-type to add protocol functions to any type you want. The only difference is that in deftype expressions, you can access field values by name. Whereas in extend-type expressions, you have to use field accessors on the dispatch parameter.

Dec 20, 2017 - Product Types

Comments

Just My Type(s)

As I've been explaining Toccata, I've used the core types to illustrate the various concepts. But now that I've cracked open the door on how Toccata uses type assertions to keep code correct, it's time to get a little crazy.

Yep, I'm gonna talk about making your very own types.

It's (almost) all in the definition

There are a couple of pieces of information you need to define a type; a name and names for each of the fields. In the simplest case, this looks like

(deftype Flummer [phlux priv zlog])

This defines a type Flummer that has 3 fields. So each Flummer value will actually be 3 values that are forever bound together. (Implementation detail: each Flummer value is actually a Vector. But you can't get access to it.) Those three values can be literally any valid Toccata value, including other Flummer values or Vectors or HashMaps or ...

And how are these Flummer values created? Glad you asked.

(Flummer 'hook "crook" 23)

Couldn't be easier. The order the values are given in the call to the Flummer function, called the construcor BTW, matches the order of the fields in type definition. So for this particular value the fields are assigned values like this

phlux: 'hook
priv:  "crook"
zlog:  23

And if we have a Flummer value and want the value in it's phlux field? We use the field accessor

(.phlux (Flummer 'hook "crook" 23))

evaluates to the value 'hook.

There's one more trick here with fields. Given a Flummer value, we can create a new Flummer value by replacing a fields value using the a field accessor like so

(let [flm (Flummer 'hook "crook" 23)]
  (.zlog flm 3400))

This let expression evaluates to a new Flummer value with fields

phlux: 'hook
priv:  "crook"
zlog:  3400

and the Flummer value flm was automatically garbage collected because it was no longer used.

But why ...

Since you're reading this post, I'm assuming you're an elite programmer and don't need me to explain why you might want to define you're own types. OTOH, you might be wondering why the title is Product Types. That because you can restrict what types of values can be assigned to the fields. So let's take a look at this totally hypothetical type

2: (deftype HTML [tag attributes contents]
3:   (assert (instance? String tag))
4:   (assert (instance? HashMap attributes)))

For this type, the tag field must always be a string and the attributes must be a HashMap. But the contents field is wide open. It can be anything. So if you try to create an HTML value with a symbol as a tag rather than a string, it's going to blow up at compile time.

6: (main [_]
7:       (HTML 'img {} nothing))

does this

*** Failed type assertion at try.toc: 7.
Expected 'String', got 'Symbol'.
Assertion from
 boom.toc: 3

And one more point for clarification

2: (deftype XML [tag attributes contents]
3:   (assert (instance? Symbol tag))
4:   (assert (instance? HashMap attributes)))

Different types can have the same names for fields. And those fields can be of different types even if the names are the same.

And why is this called a product type? This post has gone long enough, so I'm going to fob you off to someone else to answer that.