Changing Direction
I hope you all had a very Merry Christmas.
For a long time, I've been writing posts about Toccata. I wanted a body of material that explained the philosophy and capabilities of Toccata. But now we're changing direction: instead of talking about Toccata, we're going to start using it, starting with the basics.
When I was young ...
and just learning to program, I remember reading an article/program in a computer magazine explaining something called 'parsing'. I didn't even know what 'parsing' was, but it was an interesting program to play around with. There's been a massive amount of research about parsing in the years since, and many advances. Parsing a sequence of bytes into structured data is still one of the fundamental programming tasks today. And yet a hand-rolled parser in imperative code is hard to write, debug and verify. Many libraries exist in almost every language to help programmers, and Toccata has its own. Here's a short program to parse a hard-coded text string.
This is a significant chunk of code, but we'll be building on it, so I'm going to explain it all line by line. You can copy and paste the above into a file and use the run script to run it.
This expression is how to import a Git repo as a dependency. There is no central package repository. Instead, any Git repo can be a dependency. In this case, the file recursive-descent.toc is the file imported from the recursive descent repo. This expression also states that any symbol imported from recursive-descent.toc must be prefixed by rd/ when referenced in this file.
This is how a recursive descent parser is created from a grammar, in this case, the simplest possible one. Couldn't be easier. This dependency has the definitions to produce a recursive descent parser from a grammar. I intend to provide other kinds of parsers eventually.
And here we have 2 examples of using that parser to parse 2 different strings. The output generated from this short little program is
If the parser successfully matches the string passed to it, the default is to just return that string wrapped in a Maybe value. But if it doesn't match the string, nothing is returned. That's enough to get us started.
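For readers who don't have Toccata installed yet, the match-or-nothing behavior can be mimicked in a few lines of Python. This is only an analogy, not the Toccata library's API: the names literal and parse_one are made up for illustration, Python's None stands in for nothing, and matching at the start of the input is my assumption about what "matches" means here.

```python
from typing import Callable, Optional

# A parser takes the input text and returns the matched string,
# or None when there is no match (standing in for Toccata's `nothing`).
Parser = Callable[[str], Optional[str]]


def literal(expected: str) -> Parser:
    """Hypothetical helper: build a parser for one literal string.

    Assumption: a match means the input starts with `expected`.
    """
    def parse(text: str) -> Optional[str]:
        return expected if text.startswith(expected) else None
    return parse


parse_one = literal("one")
print(parse_one("one"))  # the matched string
print(parse_one("two"))  # None, because the input does not match
```

The useful property is that the caller gets back either the matched text or an explicit "no match" value, never an exception or a half-parsed result.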
Literal strings are just one of several 'simplest' grammars possible. The grammar.toc file has more, as well as some functions for composing simpler grammars into more complex ones.
And here we have a simple grammar. In this case, 3 terminal grammars are composed into a single grammar with the grmr/any function. This composed grammar will match any of the three strings, but nothing else. We'll see this in action in a second.
And we create a parser from the given grammar.
And here we have 4 examples of using that parser to parse 4 different strings. The output generated from this short little program is
No big deal.
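The idea of composing three terminal grammars into one that accepts any of them can also be sketched in Python. Again, this is an analogy under stated assumptions rather than how grmr/any is implemented: any_of and literal are hypothetical names, the three strings are placeholders I picked, and trying the alternatives in order and returning the first match is my assumption.

```python
from typing import Callable, Optional

# A parser returns the matched string, or None when nothing matches.
Parser = Callable[[str], Optional[str]]


def literal(expected: str) -> Parser:
    """Hypothetical helper: match one literal string at the start of the input."""
    def parse(text: str) -> Optional[str]:
        return expected if text.startswith(expected) else None
    return parse


def any_of(*parsers: Parser) -> Parser:
    """Hypothetical combinator: try each parser in order, keep the first match."""
    def parse(text: str) -> Optional[str]:
        for parser in parsers:
            result = parser(text)
            if result is not None:
                return result
        return None  # none of the alternatives matched
    return parse


# Placeholder strings, chosen only for this illustration.
number_word = any_of(literal("one"), literal("two"), literal("three"))

for text in ["one", "two", "three", "four"]:
    print(text, "->", number_word(text))
```

Because any_of returns an ordinary parser, composed parsers can themselves be passed to any_of, which is what makes building complex grammars out of simple ones practical.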