Saturday, December 28, 2013

SIMPL on the STM32F4xx Discovery Board

Back in the summer, I posted some musings about a very small programming language that I named SIMPL  - Serial Interpreted Micro Programming Language. It's an extended version of Ward Cunningham's Txtztme, described as a Nano Interpreter - which is documented here.

Well over the period of the Christmas break, I have had the opportunity to port SIMPL, which is written in fairly standard C, across to the STM32F4 ARM Discovery board. It is now working to the point where I can program simple I/O operations and loops on the Discovery board.

SIMPL is fast on the STM32F4x you can create a pulse as short as 200nS, or toggle a pin on and off at 372kHz. By comparison on the Arduino (using digitalWrite) the pulse time is 10uS and 47.8kHz maximum frequency, though better could be achieved by direct port manipulation.

Ward Cunningham has ported txtzyme to the Teensy board, as a means of controlling hardware over a USB connection, and from a remote server. Ward is using txtzyme to control hardware remotely as part of his Federated Wiki project. A video explaining the project with some demos is here, and well worth a watch for some interesting background.

I've chosen to follow a slightly different route to Ward's project, to use SIMPL more as a programming language, but as the txtzyme interpreter is common to both the Teensy and my ARM implementation, then either hardware can interpret txtzyme strings.

Although SIMPL started off life on an Arduino, it seemed a natural choice to port over to the ARM Discovery board. The additional RAM and program space allow far more elaborate programs to be written, and as the ARM runs at 10 times the clock speed of the Arduino Uno, - the code is blistering quick.  

You might recall that SIMPL is based on an interpreter, which interprets serial character strings and decodes them into functions which are executed in turn.

SIMPL generally uses a mix of small characters and punctuation marks as the primitive command set, and then a capital letter can be used in a similar way to a Forth word, to execute a sequence of commands.

Small characters, maths symbols and punctuation marks are used as the primitives. When encountered by the interpreter, they are used to call a function directly, such as p to print a character, or h to set a port line high.  The maths symbols are used, obviously to allow maths operations to be executes, such as + - * /  and tests such as < and >.  The overall simplicity of using a single ascii character as a command, means that the entire interpreter can be coded as a simple switch-case statement in C.  For example when h is encountered:

      case 'h':                               // set bit high
      digitalWrite(x, HIGH);

Where the interpreter encounters a number character 0 to 9, these are scanned in, and accumulated into variable x with the correct decimal place shift, for as long as the next character in the buffer is a number. The following few lines of C perform this number interpretation

      x = ch - '0';
      while (*buf >= '0' && *buf <= '9') {
        x = x*10 + (*buf++ - '0');

Punctuation marks are used to help program flow. For example the open and close brace structure defines a portion of code to be repeated as in a loop. Suppose we want to loop some code 10 times, for example print 10 Green Bottles.  The Green Bottles will be printed if it is written thus _Green Bottles_ and the loop structure uses a loop index k, which is decremented each time around the loop. kp will print the current value of k, each time decrementing.

10{kp_Green Bottles_}

Unfortunately p also introduces a newline character, so the output is not quite as desired :-(


In this interpretation scheme, I have reserved the upper case letters to represent "words" - in the Forth sense. In order to maintain the same simplicity in the interpreter, single characters are decoded as calls to the code stored in the word definition.  This may sound confusing, so to illustrate with an example.

Suppose I liked the "Green Bottles" code so much, that I wanted to preserve it and use it again, and again. Well I can do this by making it into the form of a colon definition (again borrowed from Forth), such that I can assign it to a memorable word, say B for bottles.

The colon : tells the interpreter that the next character will be a word, followed by the code that will be executed when the word is called

:B 10{kp_Green Bottles_}

This will place the code 10{kp_Green Bottles_} at a location in memory that can be accessed everytime that the interpreter sees the letter B.  Instead of interpreting the contents of the keyboard buffer, the interpreter is now pointed to a word buffer, containing the code body that makes up B.

This process can be extended by adding together words to make new words

:A   10{B}

This defines A as ten iterations of B, or about 100 Green Bottles!

The word definitions are stored in a RAM array, and the ? command can be used as a means of listing all the current word definitions.

SIMPL is now starting to take the form of a very concise language.  It can handle numbers, character commands from the terminal and perform simple maths. It can perform loops and output numbers and text to the terminal. The next thing is to allow it to access the peripherals and I/O.

SIMPL has been developed to be "close to the hardware".  As almost all microcontrollers have a mix of on chip peripherals, such as timers, ADCs, DACs and general purpose I/O, SIMPL has primitive commands designed to exercise this hardware directly.

For example, suppose you have an output port on Port D13 controlling a LED,  then the command 13h will turn the LED on and 13l will turn it off,  h and l being the commands to set the named port 13, either high or low.

For analogue input from an ADC channel,  the command s for "sample" is used. So to read the ADC channel 10 and print to the terminal, we enter 10sp.

Commands can be repeated within loops using the {  } loop structure.

To read the ADC channel 10, ten times and print the result to the terminal, we use:


or to slow this down to once a second, we can delay with the millisecond delay m command which puts a 1000mS pause in the loop.


We can now put a few of these commands together so as to flash the LED for 100mS each time we take an ADC reading


Note that we are still looping 10 times,
reading ADC channel 10 and printing it out,
pausing for 900mS,
setting LED 13 high,
pausing for 100mS,
setting the LED low,
returning to the start of the loop.

So it's quite easy to build up complex commands, just by concatenating a few primitives together.

What if you were to write this in C? It would be somewhat longer, about 8 to 10 lines of code, and then need to be compiled, tested, and if you weren't happy, edited and compiled again. SIMPL breaks us out of the edit, compile, test cycle, and allows ideas to be tested quickly and easily, straight from the terminal.

This was exactly what Forth language pioneer, Charles Moore realised in the mid-1960s. He too wanted to break free from the edit-compile-test cycle, in order to improve his efficiency in coding, as computer time was expensive back then, and also to make himself no longer reliant on compilers and assemblers that he had no control over.  He wanted to develop a self-written, self-contained programming environment that could easily be moved from one system to another, and require few and simple resources to get it running.

If you are interested in the early history of Forth, Charles "Chuck" Moore describes the early development of the language here:

History of Forth

Whilst SIMPL is not Forth, and never will be, there are several nice techniques borrowed from Forth, which are used to enhance SIMPL.

We have come a long way since the early development of the computer languages, such as C and Forth, which were often hosted on very primitive machines, with limited RAM and tiny disks by today's standards.

One early machine, the PDP-8, was of a simple enough architecture, that it has often formed the basis of study in computer science courses. Additionally, the PDP-8 has been implemented in many different technologies over the years, including TTL, VLSI and as a FPGA implemented in Verilog or VHDL.

The reason for interest in the PDP-8, is that whist primitive by machine standards now, it represented a revolution in architecture simplification, such that the whole system could be sold for $18,000 back in 1965. It was the first of the true minicomputers, available at a tenth of the cost of competing systems.

Additionally, the PDP-8, is comparable in resources to those that we find on a low cost microcontroller, costing a few dollars. So the practices used in the 1960s to write code for the early minicomputers has significant relevance these days.

Whilst generally we use open source C compilers, such as GCC, and integrated design environments (IDEs) to develop programs, there is no reason why a simple interpreted language, running in the background would not make sense, when developing applications on a new microcontroller. It puts the control of the hardware, directly at your fingertips and allows quick experimentation.

Modern 32 bit microcontrollers, such as the ARM range of cores, are now rapidly replacing 8-bit devices in a range of applications, at very little additional extra cost. The resources and peripherals available on a typical ARM device, are vast compared to 8-bit processors, and with clock speeds roughly ten times faster, a huge amount of processing power is available, compared to only a few years ago.

With clock speeds in the 100-200MHz range, it's now perfectly possible to host an interpreted language on the microcontroller, and have it run at speeds only previously available through the compiled language route.

SIMPL consists of an interpreter running within a loop.  The interpreter is written in about 300 lines of standard C, and additionally includes several I/O and memory functions, which are tailored to the particular microcontroller, to create a standardised machine model.

To the Arduino user, these I/O routines will be familiar:

digitaRead                               Read the state of a digital input
digitalWrite                             Write a digital output high or low
analogRead                              Read the value of an ADC channel
analogWrite                             Write a value to an analogue PWM or DAC channel
delayMilliseconds                   generate a delay in mS
delayMicroseconds                  generate a  delay in uS
printNum                                  print an integer number to the terminal
putChar                                    print an ASCII character to the terminal
getChar                                     read an ASCII character from the keyboard or terminal input buffer

All microcontrollers for embedded applications should have the hardware means to execute these routines, and although setting up the I/O and peripherals may take a bit of time on an unfamiliar hardware device, once done the basic routines will be used frequently in any application that may be developed.

delayMilliseconds can simply be derived from 1000 times around the delayMicroseconds loop, or can be derived from a hardware timer, depending on the device.

Whilst putChar would normally send a character to the UART transmit buffer, it might instead be used to bit-bang an output pin, if no UART is available. Fortunately most microcontrollers have one or more UARTs available these days, so bit-banged serial comms is less of a requirement.

printNum is just a convenient means of getting integer numerical output from the microcontroller. It will use the C routine "integer to ASCII" and putChar to send an integer to the terminal.

getChar is intended to read in a character at a time either directly from the keyboard or from the terminal input buffer.

With these routines, you can now interface to a serial terminal, read and write to I/O lines, read analogue inputs and control PWM or DAC outputs.  With the ability to specify accurate delays to give a sense of timing to the program flow, there is little else that the micro needs to do, except perhaps for memory operations.

Thanks for the Memory

In bygone days, when microcontrollers had very little memory, it was easy to get obsessed about the placement of code and data within memory. Memory was a precious commodity, and it was necessary to squeeze your program and your data into what little memory was available. Every byte was precious, so clever tricks were devised to pack data into as few bytes as possible.

These days, this is not so much of a problem.  The STM32F407 has a 1Mbyte flash for program and storage of constant data, and 196Kbytes of SRAM, 4Kbytes of which can be battery backed and made non-volatile. Other STM32F4xx family members have 2Mbyte of flash and 256Kbytes of SRAM. This might not seem much by PC standards, but for an embedded application it is more than plenty.

To be continued.

No comments: