Wednesday, January 13, 2016

More Thoughts on Retro Computing

Forty years ago, when home computers were in their infancy and I myself was still in short trousers, the tech community in Silicon Valley began to build their own homebrew computers based on the newly available microprocessors such as the 8080, 6800 and 6502.

Microprocessors were still expensive (especially Intel's), but the advent of the considerably cheaper 6502 in 1975 significantly reduced the overall cost of a homebuilt machine. The Homebrew Computer Club, which met regularly at Stanford University, soon obtained a copy of a cut-down interpreted programming language called Tiny BASIC, and very quickly it was being adapted to run on all sorts of fledgling machines - a process that involved a lot of tedious copying of listings, saving onto paper tape or cassette, or even programming into EPROM.

The early pioneers desperately wanted an easy-to-use language, in plain English text, that was quick to learn and quick to program. They put up with limited maths capabilities and with variables restricted to single letters, but the advantages more than outweighed the shortcomings, and so it was Tiny BASIC that helped seed the early microcomputer industry.
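To make the single-letter restriction concrete, here is a minimal sketch of how such variables can be stored: one fixed slot per letter, so no symbol table is needed at all. The function name `var_slot` is hypothetical, and the 16-bit integer width is an assumption for illustration rather than something taken from a particular Tiny BASIC listing.

```c
#include <stddef.h>
#include <stdint.h>

/* One slot per letter A..Z. With 16-bit integers (an assumption)
   the whole variable store occupies only 52 bytes of RAM. */
static int16_t variables[26];

/* Return the storage slot for a single-letter variable name,
   or NULL if the name is not a letter. Assumes an ASCII host. */
int16_t *var_slot(char name)
{
    if (name >= 'a' && name <= 'z')
        name = (char)(name - 'a' + 'A');   /* fold to upper case */
    if (name < 'A' || name > 'Z')
        return NULL;
    return &variables[name - 'A'];
}
```

Looking up a variable is then a single subtraction and an array index, which goes some way to explaining why the restriction was tolerated on machines with only a few kilobytes of memory.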

Wikipedia has a good account of those early days of Tiny BASIC.

Tiny BASIC was compact and designed to fit into about 2 to 3 kbytes of memory, as the early micros generally had only between 4K and 8K of memory - similar to what the first minicomputers had a decade earlier.

In some implementations, it was coded in such a way that it would run within a virtual machine on the microprocessor. This approach led to compact, more portable code, but it was not as fast in execution as an implementation coded in the host's native machine language.
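The virtual machine idea can be illustrated with a very small bytecode interpreter. This is only a sketch of the general technique: the opcodes below are invented for illustration and are not the interpretive language used by any real Tiny BASIC, and there is no stack bounds checking.

```c
#include <stdint.h>

/* Hypothetical opcodes, invented for this sketch. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Run a bytecode program and return the value left on top of the stack.
   No bounds checks: the program is trusted to be well formed. */
int16_t vm_run(const uint8_t *code)
{
    int16_t stack[16];
    int sp = 0;                      /* index of the next free stack slot */

    for (;;) {
        switch (*code++) {
        case OP_PUSH:                /* push the following byte as a literal */
            stack[sp++] = *code++;
            break;
        case OP_ADD:                 /* pop two values, push their sum */
            sp--;
            stack[sp - 1] = (int16_t)(stack[sp - 1] + stack[sp]);
            break;
        case OP_MUL:                 /* pop two values, push their product */
            sp--;
            stack[sp - 1] = (int16_t)(stack[sp - 1] * stack[sp]);
            break;
        case OP_HALT:
        default:                     /* unknown opcodes also stop the machine */
            return sp > 0 ? stack[sp - 1] : 0;
        }
    }
}
```

Only the dispatch loop has to be rewritten for each new processor, while the bytecode itself stays the same, which is where the portability comes from. The cost is the fetch-and-switch overhead paid on every single operation.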


In the mid-1970s microprocessors could at best muster about 1 million instructions per second, but they were still able to deliver a credible computing experience when programmed in Tiny BASIC. The first machines generally used a teletype or serial terminal for input and output. The ASR33 Teletype was favoured, as it had a built-in paper tape reader and punch, which conveniently provided a permanent paper tape record of the program that could be swapped and exchanged amongst friends at the computer club meetings. When I went through university in the early 1980s, the engineering department were still using 20-year-old ASR33 Teletypes!


Within a few years, the large corporations became interested in the home computer market - and the products were further refined to be more home-friendly.  This involved incorporating some video generation hardware into the design - so that the output could be displayed on the home TV.

Whilst this increased the market appeal, it often meant that the poor overworked microprocessor was further burdened by the overheads of servicing the video display, and that real processing was constrained to the vertical blanking interval - slowing the machines down further. Early Sinclair machines (ZX80, ZX81) had been honed down to the bare minimum of hardware in order to undercut the cost of competitors' machines, and so running BASIC on them was particularly slow.

Recent benchmarking tests show that a 3.5MHz Z80A, as used in the Spectrum, was capable of about 0.142 DMIPS, or 250.7 Dhrystones per second.
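The two figures are related by a fixed constant: 1 DMIPS is defined as the score of the VAX 11/780, which ran 1757 Dhrystones per second. A small sketch of the conversion follows; the function names are made up for illustration.

```c
/* Dhrystones per second scored by the VAX 11/780, the 1 DMIPS reference. */
#define VAX_DHRYSTONES_PER_SEC 1757.0

/* Convert a raw Dhrystones-per-second score to DMIPS. */
double dmips_from_dhrystones(double dhrystones_per_sec)
{
    return dhrystones_per_sec / VAX_DHRYSTONES_PER_SEC;
}

/* Normalise by clock speed so different processors can be compared. */
double dmips_per_mhz(double dhrystones_per_sec, double clock_mhz)
{
    return dmips_from_dhrystones(dhrystones_per_sec) / clock_mhz;
}
```

For the Z80A figures quoted above, `dmips_from_dhrystones(250.7)` gives roughly 0.143 DMIPS, and `dmips_per_mhz(250.7, 3.5)` comes out at roughly 0.04 DMIPS per MHz.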


Microprocessor clock speeds increased rapidly over the years that followed. The very popular Atmel AVR appeared in 1996 and was capable of being clocked at speeds of up to 20MHz. The modified Harvard architecture featured a two-stage pipeline, allowing execution of one instruction whilst the next is being fetched. This gave operational speeds approaching 1 MIPS per MHz - so a 20MHz processor would achieve about 20 MIPS.

The AVR was optimised for executing high-level languages, especially C, and as such has a highly orthogonal instruction set.

There have in recent years been several implementations of Tiny BASIC, written in C and aimed in particular at the Arduino or Arduino derivatives. The code often supports external accessories such as a micro SD card used as a disc and external SRAM, as well as on-chip EEPROM and the usual Arduino I/O and timing functions such as millis and micros.

The code is reasonably fast, allowing a 16MHz AVR to execute about 50,000 empty For-Next loops per second. (This was implemented on an Arduino back in 2011 - see my earlier post from May 2011.)
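It is worth turning that loop rate into a cost per iteration. The sketch below simply divides the clock rate by the loop rate; the function name is hypothetical.

```c
/* Clock cycles available for each loop iteration at a given loop rate. */
unsigned long cycles_per_iteration(unsigned long clock_hz,
                                   unsigned long iterations_per_sec)
{
    return clock_hz / iterations_per_sec;
}
```

With a 16MHz clock and 50,000 loops per second, `cycles_per_iteration(16000000UL, 50000UL)` gives 320 clock cycles per empty For-Next iteration. As the AVR executes most instructions in a single cycle, that suggests an interpreter overhead of a few hundred machine instructions per loop.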

The New Millennium

If we fast-forward a few decades, we now have multi-core 32 bit processors capable of executing a billion instructions per second on each core - and handling larger numbers and larger data sizes.

How would Tiny BASIC perform on a modern ARM processor, for example? Would the interactive nature of the language, its easy-to-learn syntax and the additional resources (memory and peripherals) offered by a modern ARM processor make for a rewarding computing experience?

As there are versions of Tiny BASIC written in C, it should be fairly easy to port it across to almost any microcontroller. Right?

Benchmarks and Other Performance Indicators

Recently I have been looking at the benchmarks of various ARM processors - especially those that are compatible with the Arduino IDE.

These include the STM32F103 and GD32F103 running at 72MHz and 120MHz respectively. (The GD32F103 is a Chinese-produced, licensed version of the STM32F103, tweaked with faster on-chip RAM so that it runs at 120MHz.) These may be programmed using the STM32duino plug-in within the Arduino IDE.

Peak Performance

A very wide range of processors may be programmed within the mbed ecosystem - including the latest Cortex M7 from ST Microelectronics.

The STM32F746 is a Cortex M7 with 1Mbyte of Flash and 340KB of SRAM. Available in a number of LQFP packages from 100 pins to 208 pins, it is clocked at 216MHz and claims to deliver 462 DMIPS. Here's the spec summary:

  • 1MB of flash memory and 340KB of SRAM (320KB system, 16KB instruction and 4KB backup SRAM)
  • Ethernet, 6/3 SPI/I2S, 4 I2C, 4/4 USART/UART, USB OTG FS/HS, 2 CAN, 2 SAI, SPDIFRX, SDMMC interfaces
  • 168 I/O ports with interrupt capability
  • 8bit to 14bit parallel camera interface up to 54Mbytes/s
  • Ten general purpose, two advanced control, two basic and one low power timers
  • LCD-TFT controller up to XGA resolution with dedicated Chrom-ART accelerator for DMA2D
  • Three 12bit, 2.4MSPS ADC with 24 channels and 7.2MSPS in triple interleaved mode
  • Two 12bit D/A converters and 16 stream DMA controller with FIFOs and burst support

    This looks an impressive spec for a device that costs about £11 in 1 off.

    The STM32F746 is about the fastest processor I can lay my hands on and, as it comes in an LQFP package, solder onto a PCB myself. Using the online mbed compiler, it can be programmed without having to resort to expensive toolchains.

    The STM32F746 forms the basis of the STM32F7 Discovery board - a $50 development board with LCD, USB, ethernet, microSD card, SRAM, and a whole host of other features.

    • STM32F746NGH6 microcontroller with 1MB flash memory and 340kB RAM in BGA216 package
    • On-board ST-LINK/V2-1 supporting USB reenumeration capability
    • USB functions - Virtual COM port, mass storage, debug port
    • 4.3" 480 x 272 colour LCD-TFT with capacitive touch screen
    • Camera connector
    • SAI audio codec
    • Audio line-in and line-out jack
    • Stereo speaker outputs
    • Two ST MEMS microphones
    • SPDIF RCA input connector
    • Two pushbuttons (user and reset)
    • 128Mb Quad-SPI Flash memory
    • 128Mb SDRAM (64Mb accessible)
    • Connector for microSD card
    • RF-EEPROM daughterboard connector
    • USB OTG HS with micro-AB connectors
    • USB OTG FS with micro-AB connectors
    • Ethernet connector compliant with IEEE-802.3-2002

    If we then make use of an embedded video engine  - or EVE chip, to drive a graphics display we could potentially have an ARM based computer with full colour high resolution graphics that runs about 1000 times faster than the ones we remember from the 1980s - and for a cost of about £20.

    My prototype EVE board is designed to plug into the underside of the STM32F7 Discovery board and provide full 1024 x 768 video output and a PS/2 keyboard and mouse interface. This combination would make a very comprehensive computing environment.

    In my tests I achieved about 300,000 Dhrystones per second  - about 1200 times the speed of the old Z80.
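The comparison with the Z80 is a simple ratio of the two Dhrystone scores, sketched below with hypothetical function names. The conversion to DMIPS assumes the usual VAX 11/780 reference of 1757 Dhrystones per second.

```c
/* Dhrystones per second scored by the VAX 11/780, the 1 DMIPS reference. */
#define VAX_DHRYSTONES_PER_SEC 1757.0

/* How many times faster one Dhrystone score is than another. */
double speedup(double new_dhrystones_per_sec, double old_dhrystones_per_sec)
{
    return new_dhrystones_per_sec / old_dhrystones_per_sec;
}

/* Express a Dhrystones-per-second score in DMIPS. */
double dmips(double dhrystones_per_sec)
{
    return dhrystones_per_sec / VAX_DHRYSTONES_PER_SEC;
}
```

Here `speedup(300000.0, 250.7)` comes out at just under 1200, and `dmips(300000.0)` is roughly 171 DMIPS. That is well short of the 462 DMIPS claimed for the part, but Dhrystone results depend heavily on compiler settings and memory configuration, so the two figures are not directly comparable.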
