Wednesday, February 17, 2016

Experimenting With New Devices - Part 2

The '5969 Launch Pad  - an ideal experimenting platform for FRAM

In this post I look at ways in which to get the most performance out of a small computer system based around the MSP430FR5xxx with external serial  RAM and FRAM.

MSP430 Performance

The MSP430 being a 16 bit architecture is pitched performance-wise somewhere between AVR and ARM.

If you are doing a lot of 16 bit maths operations, then it will be quicker than the AVR as these are single cycle operations on registers, rather than having to combine two 8 bit operations.

This report compares execution time and code size for a number of common microcontrollers, in particular comparing MSP430 with ARM and AVR.

The speed up factor over an ATmega328 Arduino is as follows:

Simple Maths       1.27
FIR Analysis        3.29
Dhrystone            1.83
Whetstone           2.56

Average               2.24

When this is combined with a 24MHz clock - rather than the normal 16MHz clock on the Arduino the average speed improvement is approximately 3.35.

Code execution is fastest from RAM, so an important speed improvement will be achieved by copying all commonly used code from FRAM to SRAM at initialisation time. This copying process is extremely fast with 1Kbytes of FRAM memory contents copied to SRAM in approximately 128uS.

SIMPL on the MSP430

I tried SIMPL back in 2014 on a very low spec MSP430 value line device in a 20 pin DIL package with only 512 bytes of memory. The C code for SIMPL was compiled using the Energia IDE. There was no difficulty porting what had been an Arduino sketch to Energia - what runs under Arduino runs on an MSP430 under Energia, with zero or very little modification to the sketch. Only the lack of RAM on the MSP430G2553 device was a little problem.

Porting SIMPL to a FRAM Device.

Boosted by the initial success with the MSP430 Launchpad, in getting a basic version of SIMPL to run, I decided to invest in a couple of FRAM based devices.

First, I tried identical sketches on the ATmega328 and the MSP430FR5969.  This was purely so that I could prove that they both ran as expected, and so I could compare the codesize and execution time for each implementation.

For the same clock frequency of 16MHz, the MSP version used about 80% of the codesize and executed simple loops containing integer maths at about 25% improved time.

It was now time to reduce the Arduino supplied UART routines (Serial.xxx) to self coded putchar, getchar and printlong.  This process removed 2200 bytes of code from the implementation - down to 3720 bytes from a starting point of 5920.

The other large chunk of code is the array allocated to holding the initial word definitions. This is a 26 x 48 character array (1248 bytes) - which when removed brings SIMPL down to about 2472 bytes.

The other thing I noticed was that whilst the UART routines take just less than 300 bytes to implement, when they are combined with setup(), the whole thing bloats out to about 640 bytes. This needs further investigation because without this bloating, SIMPL could reside in just 2Kbytes of memory.

Making Use of the FRAM.

The question now is what new features can SIMPL leverage off the back of the MSP430's FRAM?

The write speed of FRAM is well matched to the higher baudrates of the USB to serial converter. Instead of 9600, we can now upload source code at 921600 baud or higher. Using a hex file format, a full 16Kbytes of code could be sent over the serial link in about 0.25 seconds.

The FRAM may be written to at 125nS per 16 bit word, so the whole FRAM storage space could be updated in about 1 milisecond.

The 10 bit ADC could send data back to the PC at almost 100ksps.

Execution from RAM is 3 times faster, so the primitives and the SIMPL kernel are loaded from flash into RAM during initialisation.

The MSP430 has a 20 bit address space, so up to 64 blocks of code could be swapped into FRAM when required.

Off chip storage could be uSD card or even serial FRAM - such as the Cypress 25LV20   which is a 256Kx8 SPI FRAM device in an 8 pin SOIC package.  The SPI bus runs at a maximum of 10Mbit/s - so a "block load" of 1Kbytes is approximately 1mS or the time to load a full 16K block from external FRAM via SPI is 14mS.  This is effectively 1uS access time to a large non-volatile solid state drive.

For faster access either dual SPI or Quad SPI would need to be employed.  This can be done using bit-banging techniques using quite elementary code.

See the Microchip App Note AN1791 for details. A datarate of 8Mbyte/s should be achievable. This would allow the 23LC1024 to be fully read in about 16mS.

To be continued...... soon





No comments: