Wednesday, May 22, 2013

SIMPL - A simple programming language based on Txtzyme

Last weekend I played around with Ward Cunningham's Txtzyme - a minimalist programming language, with an interpreter written in C so that it can be easily ported to many microcontrollers.

Txtzyme contains all the elements necessary to enable a microcontroller to interpret and execute a series of typed commands, and is an ideal example from which to learn the techniques employed by more sophisticated interpreted languages.

During the past week, I have written some extensions to Txtzyme and tried out a few ideas to make Txtzyme more versatile and easier to use.

This blog post is a tutorial in Txtzyme and its new extensions, which together make a more useful language that I am calling SIMPL - A Serial Interpreted Minimal Programming Language.

SIMPL runs in under 6K on the standard Arduino.

A Brief Description of Txtzyme

Txtzyme consists of an interpreter contained within a loop.  The interpreter, intended to be very simple, decodes individual ASCII characters and executes a block of C code associated with each character. This is similar in principle to how Forth scans through a series of Forth words and executes the code associated with them, but Txtzyme treats each ASCII character as a word, greatly simplifying the scanning process.

The interpreter steps through a string of ASCII characters, executing the associated code blocks in turn. This may be slow compared with assembly language execution, but the user does not need to know about machine code, assembly language or even C to make the microcontroller perform simple tasks.
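To make the idea of "one character, one block of code" concrete, here is a minimal sketch written in Python for the desktop rather than the C of the real interpreter. It models only two behaviours that are described in the next section: digits build a number in x, and p prints x. The function name run and the list that stands in for the serial port are assumptions made for illustration.

```python
DIGITS = "0123456789"

def run(program):
    """Step through a Txtzyme-style string one character at a time."""
    x = 0          # the single working variable
    printed = []   # stands in for the serial terminal
    i = 0
    while i < len(program):
        ch = program[i]
        i += 1
        if ch in DIGITS:
            # A run of digits is read as one decimal number into x.
            x = int(ch)
            while i < len(program) and program[i] in DIGITS:
                x = x * 10 + int(program[i])
                i += 1
        elif ch == "p":
            printed.append(x)
        # Any other character is simply ignored in this sketch.
    return printed
```

Calling run("123p") returns a list holding the single value 123. Adding a new command to a model like this means adding one more branch, which is the property that makes the real interpreter so easy to extend.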

Numbers and Maths

Txtzyme reads a run of digits as a decimal number and assigns it to an integer variable x. Typing 123p into the serial terminal will set x to 123, and then print that value out when the p command is executed.

I extended Txtzyme with the use of another integer variable y.  A number can be stored in y by using the ! character. This, again, is borrowed from Forth.

456! will initially assign 456 to x, and then copy it into y.

The use of the second variable allows simple maths operations to be performed. Taking the simple interpreter, I added some maths operations + - * and / .

456!123+p  will set y to 456, then set x to 123, add x and y, leaving the result in x and then print out the answer 579.
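The two-variable arithmetic can be checked with a small Python model. This is a sketch under stated assumptions, not the Arduino source: it implements only !, + and *, because the post does not say which way round - and / take their operands, and the name evaluate is hypothetical.

```python
DIGITS = "0123456789"

def evaluate(program):
    """Model x, y, '!', '+', '*' and 'p' as described in the text."""
    x = y = 0
    printed = []
    i = 0
    while i < len(program):
        ch = program[i]
        i += 1
        if ch in DIGITS:
            x = int(ch)
            while i < len(program) and program[i] in DIGITS:
                x = x * 10 + int(program[i])
                i += 1
        elif ch == "!":
            y = x              # store: copy x into y
        elif ch == "+":
            x = x + y          # result is left in x
        elif ch == "*":
            x = x * y
        elif ch == "p":
            printed.append(x)
    return printed
```

With this model, evaluate("456!123+p") yields 579, matching the example above.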

I/O Commands

Txtzyme was designed to perform simple I/O operations on the pins of the microcontroller, with each operation being initiated by a serial command.  This allows ports to be set, inputs to be read and analogue inputs to be read and printed to the serial terminal.  The keywords that perform these operations are generally only single ASCII characters, chosen to make the commands surprisingly human readable.

First you have to state which I/O pin you wish to use.  This is done with the d command.

For example, 6d will select digital pin 6.

You may then set this selected pin to high using 1o   (where o = output)  or to low using 0o.

To read an input pin, first you have to define it, e.g. 8d will define digital pin 8. Then you use i, for input, to read its state into x. p will then print out the value.

To read the value on one of the ADC pins, you use the s command, which means analogue sample.

0sp will read the value on ADC 0 into x and print it out.


Txtzyme uses a simple loop structure, allowing commands to be executed repeatedly. It also uses the native delay functions present on the Arduino to provide simple timing functions - ideal for flashing LEDs and generating musical tones from output pins. Txtzyme can toggle output pins at up to 47kHz, a speed that is only limited by the Arduino digitalWrite function.

The loop function will execute any commands contained within braces {}.  It uses x to initialise the loop counter k, which is then decremented on each pass. When k reaches zero the loop terminates.

For example, to print out ten readings from analogue pin 0:

10{0sp}


To see the loop count variable k decrementing

10{kp}  will print out the numbers from 9 decrementing to 0
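The counting behaviour of the loop can be reproduced in a short Python model. This is a sketch of one way to get the results described above (the counter is loaded from x at the opening brace and decremented at the closing brace before each pass); it does not handle nested loops, and the names are assumptions for illustration.

```python
DIGITS = "0123456789"

def run(program):
    """Model '{', '}', 'k' and 'p' with a single, non-nested loop."""
    x = k = 0
    loop_start = 0
    printed = []
    i = 0
    while i < len(program):
        ch = program[i]
        i += 1
        if ch in DIGITS:
            x = int(ch)
            while i < len(program) and program[i] in DIGITS:
                x = x * 10 + int(program[i])
                i += 1
        elif ch == "{":
            k = x               # x initialises the loop counter
            loop_start = i      # remember where the body begins
            while i < len(program) and program[i] != "}":
                i += 1          # skip ahead; '}' decides whether to enter
        elif ch == "}":
            if k > 0:
                k -= 1          # count down, then run the body again
                i = loop_start
        elif ch == "k":
            x = k               # copy the loop counter into x
        elif ch == "p":
            printed.append(x)
    return printed
```

In this model run("10{kp}") gives the ten values 9 down to 0, and a count of zero skips the body altogether.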

Txtzyme does not yet have the ability to do complete FOR/NEXT loops. The means to perform a loop such as the following would be a very useful addition:

for i = x to y step z

I have added the primitives @, l, y and z.

y allows direct access to the variable y so typing 10y   is equivalent to y=10

@ was intended to be the equivalent of Forth's fetch, but in SIMPL it copies the value stored in y to x.

l is a loop counter.  It increments y by 1 every time it is executed.

The construct l@p  increments y by 1, copies it to x and prints it out.
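The interplay of y, l and @ is easy to trace with a tiny Python model. As before this is only a sketch of the behaviour described in the text, with hypothetical names, and it leaves out loops so that each step is visible in the program string.

```python
DIGITS = "0123456789"

def run(program):
    """Model the primitives 'y', 'l', '@' and 'p'."""
    x = y = 0
    printed = []
    i = 0
    while i < len(program):
        ch = program[i]
        i += 1
        if ch in DIGITS:
            x = int(ch)
            while i < len(program) and program[i] in DIGITS:
                x = x * 10 + int(program[i])
                i += 1
        elif ch == "y":
            y = x          # 10y is equivalent to y = 10
        elif ch == "l":
            y += 1         # increment y by 1
        elif ch == "@":
            x = y          # copy y into x
        elif ch == "p":
            printed.append(x)
    return printed
```

Writing the construct out three times by hand, run("0yl@pl@pl@p") prints 1, 2 and 3, because y starts at 0 and is incremented before each copy and print.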

To print an ascending series of numbers from 1 to 10, use this construct:

0y10{l@p}


When combined with the read primitive r (see below) this can be used to read and print consecutive RAM addresses.

I/O with Loops and Timing

Loops can be used to flash LEDs or generate tones, and can be combined with the millisecond and microsecond delay functions to generate appropriate timed behaviour.

To flash an LED on pin 13 ten times, on for 500 ms and off for 500 ms:

13d10{1o500m0o500m}


To turn this into an audible note to sound a small speaker on pin 6, we shorten the delay to, say, 500 µs and generate, say, 1000 cycles (1000 times around the loop):

6d1000{1o500u0o500u}
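The timing behind the LED and tone examples is simple arithmetic: one pass round the loop lasts the sum of the two delays, so the frequency is the reciprocal of that period and the total duration is the period times the loop count. The helper below is a hypothetical illustration that ignores the time taken by the interpreter and by digitalWrite, so real hardware will run slightly slower.

```python
def square_wave(high_us, low_us, cycles):
    """Return (frequency in Hz, duration in seconds) for an ideal square wave."""
    period_us = high_us + low_us
    frequency_hz = 1_000_000 / period_us
    duration_s = cycles * period_us / 1_000_000
    return frequency_hz, duration_s
```

For the tone, square_wave(500, 500, 1000) gives 1000 Hz for 1 second. For the LED, square_wave(500_000, 500_000, 10) gives 1 Hz for 10 seconds.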


Creating New Definitions

The simple Txtzyme interpreter readily executes a program contained in the characters in the input buffer, but wouldn't it be great if you could store these mini programs in RAM so you could use them time and time again?

The mechanism to execute a program from RAM, is to point the interpreter at the location of the starting character and let it execute the characters in turn until it finds a return '\r' or newline '\n' character that marks the end of the string.

Here we have to start borrowing ideas from Forth, the concept of the "word" that signifies the start address of a block of code to execute, and the "colon definition" - a mechanism to write new words into memory.

To keep things simple, our new words will be assigned only the capital letters, allowing up to 26 new words to be defined, namely A to Z.

Additionally, to simplify the addressing of these words, and to keep the RAM usage within the limitations of the ATmega328 (as used on the Arduino), we will allocate each word just 48 bytes of memory, with each consecutive word starting on a 48 byte address boundary.  This allows us to easily decode the ASCII value of the character, find the starting address of the associated code, and execute the word.

To create a new word, and to tell the interpreter to copy the input character string into the correct position of RAM we use the "colon definition".

We type a colon followed by the capital letter of the word we wish to create.  Suppose we liked the tone example from above and want to assign it to the letter T. We type:

:T6d1000{1o500u0o500u}


The colon definition code will detect the leading : and then decode the T to 84 in decimal. Subtracting 65 (the ASCII code for A) gives the index of the word, which is 19 for T.  It then creates an address by multiplying this index by the permitted word length of 48 bytes, and adds the result to the start address of the array assigned to hold the definitions.  I have restricted the length of the definitions to 48 bytes because on the Arduino we are short of RAM, with only 2K to play with. Having fixed blocks of 48 characters for each definition is wasteful, but it greatly simplifies and speeds up the instruction decoding and addressing process.
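The address calculation can be written out as a worked example. The base address of 291 used here is not stated anywhere in this post; it is inferred from the two addresses quoted further down (627 for H and 723 for J), so treat it as an assumption that may differ between builds of the sketch. The function name is hypothetical.

```python
WORD_LENGTH = 48          # bytes reserved for each definition
DEFINITIONS_BASE = 291    # assumed: inferred from H at 627 and J at 723

def word_address(letter):
    """Return the start address of the definition for a capital letter."""
    if len(letter) != 1 or not "A" <= letter <= "Z":
        raise ValueError("definitions exist only for A to Z")
    index = ord(letter) - ord("A")      # A -> 0, B -> 1, ... Z -> 25
    return DEFINITIONS_BASE + index * WORD_LENGTH
```

Under that assumption T lands at 291 + 19 * 48 = 1203, and the 26 definitions together occupy 26 * 48 = 1248 bytes, which is a large share of the 2K available.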

This character based addressing process happens automatically, and we don't have to concern ourselves with the exact address that holds T. Whenever the interpreter encounters a T, it will jump to the correct address and execute the code it finds there.

Once stored in RAM, the interpreter prints out T, to say it has been defined, and then executes the new word twice. This is a quirk of the interpreter, so we hear the tone for twice as long.

Now whenever we type T, we will get the tone.
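The whole mechanism (store the text in a fixed 48 byte slot, then run it whenever its letter appears) fits in a short Python model. This is a sketch under assumptions rather than the real sketch code: it keeps one byte of each slot for the terminating newline, supports only digits and p as commands, does not reproduce the "executes twice" quirk, and all names are hypothetical.

```python
DIGITS = "0123456789"
WORD_LENGTH = 48
ram = bytearray(26 * WORD_LENGTH)   # one fixed slot per letter A to Z

def slot(letter):
    return (ord(letter) - ord("A")) * WORD_LENGTH

def define(line):
    """Store ':Xbody' in X's slot, truncated and newline-terminated."""
    letter = line[1]
    body = (line[2:][:WORD_LENGTH - 1] + "\n").encode("ascii")
    start = slot(letter)
    ram[start:start + len(body)] = body

def stored(letter):
    """Return the text of a definition, or '' if none has been stored."""
    chunk = ram[slot(letter):slot(letter) + WORD_LENGTH]
    end = chunk.find(b"\n")
    return chunk[:end].decode("ascii") if end != -1 else ""

def run(text, state):
    i = 0
    while i < len(text):
        ch = text[i]
        i += 1
        if ch in DIGITS:
            state["x"] = int(ch)
            while i < len(text) and text[i] in DIGITS:
                state["x"] = state["x"] * 10 + int(text[i])
                i += 1
        elif ch == "p":
            state["printed"].append(state["x"])
        elif "A" <= ch <= "Z":
            # Jump into the stored word (a self-referencing word would
            # recurse forever in this sketch).
            run(stored(ch), state)

def enter(line, state):
    """Handle one line typed at the terminal."""
    if len(line) > 1 and line[0] == ":" and "A" <= line[1] <= "Z":
        define(line)
    else:
        run(line, state)
```

After enter(":T42p", state), typing TT through enter prints 42 twice, and a second word defined as :U7pT can call T from inside its own definition.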

In order to keep track of what we are doing, I have added a ? command.   This prints out the entire definition RAM, showing the commands A to Z and the code that is associated with them.  This allows a definition to be edited by cutting and pasting from the ? listing into the input buffer and changing whatever parameter is to be edited.  The edited definition is then automatically stored back into RAM when you press return.

Some Prestored Definitions

Just for fun, I decided to hard code some "musical notes" into the definitions. Characters A to G play a short musical note, roughly tuned to A = 440 Hz, so it's possible to play tunes just by typing the notes.  Having "musical debug" is also a great way of determining whether the program is doing what was intended.

The SIMPL strings for properly tuned notes are as follows, and are pasted into the RAM array when the sketch is first compiled.  They can be written over at any time.

//  40{1o1106u0o1106u}    // A 440 Hz
//  45{1o986u0o986u}      // B 493.88 Hz
//  51{1o929u0o929u}      // C 523.25 Hz
//  57{1o825u0o825u}      // D 587.33 Hz
//  64{1o733u0o733u}      // E 659.26 Hz
//  72{1o690u0o691u}      // F 698.46 Hz
//  81{1o613u0o613u}      // G 783.99 Hz
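The delays in this table are all a little shorter than half the period of the note they produce. A quick calculation suggests why: the difference is a roughly constant 25 to 30 µs per half cycle, which is presumably the time the interpreter and digitalWrite take on top of the requested delay. That reading is an inference from the numbers, not something measured here, and the names below are hypothetical.

```python
# note: (frequency in Hz, high delay in us, low delay in us), from the table
NOTES = {
    "A": (440.00, 1106, 1106),
    "B": (493.88, 986, 986),
    "C": (523.25, 929, 929),
    "D": (587.33, 825, 825),
    "E": (659.26, 733, 733),
    "F": (698.46, 690, 691),
    "G": (783.99, 613, 613),
}

def ideal_half_period_us(frequency_hz):
    """Half the period of a square wave, in microseconds."""
    return 1_000_000 / (2 * frequency_hz)

def implied_overhead_us(note):
    """Ideal half period minus the average delay used in the table."""
    frequency_hz, high_us, low_us = NOTES[note]
    return ideal_half_period_us(frequency_hz) - (high_us + low_us) / 2
```

For A the ideal half period is about 1136.4 µs against a programmed delay of 1106 µs, an implied overhead of about 30 µs. The same subtraction can be used to pick a delay for any other pitch.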

Building Up Programs

The real power behind the colon definition is that new and more complex words can be assembled from existing definitions, and then executed as single commands. For example, suppose we have defined three tone generating words A, B and C.

We can type ABC - which will play the three tones in succession

Alternatively   5{ABC}  will play the 3 tone sequence 5 times over

We could then define a new word I as

:I5{ABC}


Whenever we type I, we get ABC played 5 times.

So from very simple definitions, quite complex operations can be performed.

This is an incredibly versatile technique. It allows you to write your own routines, assembled from other routines and then be able to use names that you can remember.

For example, you can define a word H to set the port pin high, and L to set it low.  If you have an LED connected to pin 6, 6H will turn it on, whilst 5L will set pin 5 to logic low. It becomes very easy to toggle any output pin with just a couple of memorable keystrokes.
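The post does not give the definitions of H and L, but one plausible pair is :Hd1o and :Ld0o, where the number typed before the word is still in x when d runs and so selects the pin. The Python model below tries that idea out. Both definitions, the pins dictionary standing in for real outputs, and the function names are assumptions for illustration.

```python
DIGITS = "0123456789"
words = {}   # letter -> stored definition text
pins = {}    # pin number -> last level written (1 = high, 0 = low)

def run(text, state):
    i = 0
    while i < len(text):
        ch = text[i]
        i += 1
        if ch in DIGITS:
            state["x"] = int(ch)
            while i < len(text) and text[i] in DIGITS:
                state["x"] = state["x"] * 10 + int(text[i])
                i += 1
        elif ch == "d":
            state["pin"] = state["x"]       # select the pin held in x
        elif ch == "o":
            # Assumed: any non-zero x drives the selected pin high.
            pins[state["pin"]] = 1 if state["x"] else 0
        elif ch in words:
            run(words[ch], state)           # execute a user-defined word

def enter(line, state):
    if len(line) > 2 and line[0] == ":" and "A" <= line[1] <= "Z":
        words[line[1]] = line[2:]
    else:
        run(line, state)
```

With those two definitions entered, 6H leaves pin 6 high and 5L leaves pin 5 low in the model, which matches the behaviour described above.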

This ability to form new definitions is one of the fundamental and most powerful aspects of Forth, so it is well worth borrowing and adapting to the minimal programming environment of SIMPL.

Some Other Additions

The Txtzyme interpreter is readily modified to include new functions.  This is a work in progress and I have added some simple extensions.

I now have comparison operators < and > to test whether x is less than y or greater than y.  These operations will set x to 1 if true or 0 if false.

10!5<   translates to "is 5 less than 10 ?". This is true so x is set to 1.

10!5>  translates to "is 5 greater than 10 ?". This is false so x is set to 0.

To make use of these comparisons I have introduced the jump command j.   If  x = 1 the next command is skipped, otherwise it is executed as normal.
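A small Python model shows how a comparison and the jump work together. It is a sketch under assumptions: "the next command" is taken to mean exactly one character, the comparison is between x and y as described above, and the names are hypothetical.

```python
DIGITS = "0123456789"

def run(program):
    """Model '!', '<', '>', 'j' and 'p'."""
    x = y = 0
    printed = []
    i = 0
    while i < len(program):
        ch = program[i]
        i += 1
        if ch in DIGITS:
            x = int(ch)
            while i < len(program) and program[i] in DIGITS:
                x = x * 10 + int(program[i])
                i += 1
        elif ch == "!":
            y = x
        elif ch == "<":
            x = 1 if x < y else 0
        elif ch == ">":
            x = 1 if x > y else 0
        elif ch == "j":
            if x == 1:
                i += 1      # assumed: skip exactly one following character
        elif ch == "p":
            printed.append(x)
    return printed
```

In this model 10!5<jpp prints once, because the true result makes j skip the first p, whereas 10!5>jpp prints twice.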

Memory Operations

I wanted a means to directly edit the contents of an address in RAM and read it.  These are equivalent to PEEK and POKE.

The y variable is used to hold the address  - so  723! will set the address up as 723.

The x variable is used to hold the data

To write to RAM we use w and to read we use r

So 723!65w   will write the character ASCII 65 into address 723.  You can then type ? to see that it's there. It pokes an "A" into the first location of the word defined by J.

To read a location,

723!r  will read the contents of location 723 and print it as an ASCII character.  You get your "A" back.

As this character has been poked into a definition, it has modified that definition. In this case it will cause definition J to play the tone associated with A.

Finally, I needed a means to look at a block of memory. The q command does this by printing out the desired number of characters starting at the address stored in y.

As an example, address 627 is the start address of the H definition.

Typing 627!48q  will print out 48 consecutive characters, which in our case is the start-up message

_Hello World, and welcome to SIMPL_
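The three memory commands can be modelled with a byte array standing in for the Arduino's 2K of RAM. This is a sketch of the behaviour described above, not the real implementation: the array contents, the function name and the choice to return the output as a string are all assumptions.

```python
DIGITS = "0123456789"
RAM_SIZE = 2048

def run(program, ram):
    """Model '!', 'w' (poke), 'r' (peek) and 'q' (dump) on a bytearray."""
    x = y = 0
    out = []
    i = 0
    while i < len(program):
        ch = program[i]
        i += 1
        if ch in DIGITS:
            x = int(ch)
            while i < len(program) and program[i] in DIGITS:
                x = x * 10 + int(program[i])
                i += 1
        elif ch == "!":
            y = x                       # y holds the address
        elif ch == "w":
            ram[y] = x & 0xFF           # write the data in x to address y
        elif ch == "r":
            out.append(chr(ram[y]))     # read address y, print as a character
        elif ch == "q":
            # print x characters starting at address y
            out.append(ram[y:y + x].decode("ascii", errors="replace"))
    return "".join(out)
```

Starting from an empty bytearray of RAM_SIZE bytes, 723!65w followed by 723!r returns the letter A, and q returns whatever text has been placed at the chosen address.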

With the loop primitive l we can also read and print successive memory locations:

627!48{l@r}


This sets the address in y to 627 and reads and prints 48 consecutive locations using the r primitive.

Using the colon definition, the construct l@r could be assigned to R,   as in :Rl@r

So to read and print 33 characters from address 627 we use

627!33{R}


A Work in Progress

SIMPL and its underlying Txtzyme interpreter are constantly evolving as new commands are added to try new ideas.

It is unlikely that it will ever be a serious language, but it is a novel experiment and a means to understand how an interpreter can be manipulated to execute a series of simple commands.

Many of the ideas have been borrowed from and inspired by Charles Moore's Forth as it evolved from a set of ideas into a proper language during the 1960s.

You can download a recent version of SIMPL from GitHub Gist, but be aware that it is a work in progress.

I hope others will get as much enjoyment as I have from tinkering with SIMPL and Txtzyme.
