Tuesday, October 06, 2015

Open Inverter - Part 6 - Thinking Allowed

It's been a frenetic week of progress on the Open Inverter project. Both Trystan and I have managed to cobble together inverters from FETs, H-bridge ICs and other random, readily available modules - bought cheaply from China.

In this post, I pause for thought to decide upon the future direction of the project, based on our findings so far in this week of discovery.

Here's the current wish-list:

1. An Open Source Inverter, of modular construction that is scalable in blocks of 125W or 250W.
2. Built from readily available, low cost electronics.
3. Rugged, robust, reliable - delivering reasonable efficiency and power quality.
4. Grid synchronisable - if required - will synchronise to external source.
5. Built in power monitoring, with wireless communications compatible with emonCMS monitoring
6. "Arduino" based for hackability.
7. Supports a variety of power conversion topologies, including boost, buck, peak power tracking and split-pi.
8. Uses include micro-solar, LiFePO4 battery charging, dc ring main schemes etc.
9. Under $20 for primary building block.
10. Easy to build, easy to repair, extendable, hackable.

Open Inverter - Part 5, More Experimentation

Today was a day in the lab to try out the various H-bridge modules I have access to, plus the toroidal transformers, which are more efficient than the E-core types that Trystan and I were using last week.

Low Cost IBT-2 Modules

Some months ago, I  bought a pair of the IBT-2  H-bridge modules from an ebay seller. These are advertised as 43A and 24V. They use the now obsolete Infineon BTS7960 half H-bridge IC, that I discussed in Part 3.

The IBT-2 Module uses the Infineon BTS7960 half-bridge module
These boards are 50mm x 50mm and the holes (3.2mm) are on 45mm centres - this is an ideal size to take advantage of the low cost 5 x 5 cm pcb services being offered.  I paid about £7 each for these - including free shipping from China!  There are several variants of this module around, as it has been widely copied in China by various manufacturers.

It uses two of the Infineon BTS7960 half bridge modules, and a 74HC244D buffer-driver IC to provide some isolation between the "Arduino" and the H-bridge.

Not entirely visible is the anodised aluminium heatsink that is screwed to the underside of this module - to take the heat from the BTS7960 devices.

The BTS7960 is quite easy to drive - it has a single PWM pin and an /INHibit pin - active low. Only 4 port lines are required from the Arduino to drive this module.

At first I was getting some very unusual waveforms from the output tab of the ICs.  I soon found out that they are limited to a 25kHz pwm signal - and I was using 62.5kHz. I made a quick change to the PWM timer control register to reduce the PWM to 7.8125kHz and all was well - I was getting good clean sinusoidal signals from the IC tabs, with the scope set to 1kHz low pass filter mode.
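The two PWM frequencies quoted above can be checked with a few lines of Python. This is only a sketch: it assumes a 16MHz clock and an 8-bit timer in Fast PWM mode that overflows every 256 counts, and the function name is made up for illustration.

```python
F_CPU_HZ = 16_000_000  # assumed 16MHz crystal on the "Arduino"


def fast_pwm_frequency(prescaler, f_cpu_hz=F_CPU_HZ, counts=256):
    """PWM frequency of a timer that overflows every `counts` prescaled clock ticks."""
    return f_cpu_hz / (prescaler * counts)


if __name__ == "__main__":
    for prescaler in (1, 8):
        print(prescaler, fast_pwm_frequency(prescaler), "Hz")
```

With no prescaling this gives 62.5kHz, and a divide-by-8 prescaler gives 7.8125kHz, which sits comfortably below the 25kHz limit mentioned above.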

Connecting the Transformer

I happen to work for a company that uses a lot of toroidal transformers of various VA capacities. Today I selected our smallest and cheapest - which is 120VA with a nominal 24V rms secondary and 5A current.

Our standard 120VA toroidal transformer has a nominal 24Vrms secondary winding @ 5A
The transformer is intended for both 115V and 230V operation - so it is a case of connecting the split primary windings in series in order to get 230Vrms output.  If you get them the wrong way, the phases will cancel out and you will see no output.  If you want 115V, you have to get the primaries the correct way around in parallel - otherwise you will short them out - which is bad   :-(

In initial tests, I found that the "Magnetising Current" for this particular transformer was 0.22A from a 24V dc supply.  So the inverter burns 5.28W when idle - before an ac load is attached.
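As a quick check of the idle figure, here is a minimal Python sketch. The comparison against the 120VA rating is my own addition for context, and the helper name is hypothetical.

```python
def dc_power_w(volts, amps):
    """DC power in watts from a measured voltage and current."""
    return volts * amps


idle_w = dc_power_w(24.0, 0.22)      # magnetising current at 24V dc
idle_fraction = idle_w / 120.0       # relative to the transformer's 120VA rating

if __name__ == "__main__":
    print(f"idle power: {idle_w:.2f} W")
    print(f"fraction of rating: {idle_fraction:.1%}")
```

So the no-load loss is a little over 4% of the transformer's rating, which matters most when the inverter spends long periods lightly loaded.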

The BTN8960 H-Bridge

My company makes products that use brushed dc motor drives - between 100W and 600W.
Earlier this year I designed an experimental board to use the newer Infineon BTN8960 half-H driver IC. This replaces the BTS7960 - which is now end of life - and becoming harder to find.

On the left is an "Arduino" providing the pwm drive signals to the BTN8960 H-bridge board on the right.
The "Arduino" board is an ATmega328p-AU with 16MHz crystal, reset circuit and FTDI connection. Most of the I/O is brought out to connectors for easy plugging - this allows simple generation of 50Hz complementary pwm sine waveforms using Timer 2.

The board on the right is an experimental motor drive board I cooked up earlier this year to evaluate the BTN8960 devices for dc motor control. The BTN8960 devices, IC1, IC2 are located in the middle of the upper and lower edges of the pcb - with the large dc-link capacitor located between them. A thermistor (thin yellow wires) allows the temperature of the upper BTN8960 to be measured.  This board has no external heatsinking -  but relies heavily on large "flag" copper areas both on the upper and lower surfaces of the pcb to dissipate the heat.

The 24V dc input to the board is on the lower left, and the output to the toroidal transformer is on the right edge of the pcb.  The orange device is a relay which allows the toroid to be disconnected from the transformer.  The board also includes a LM2576  5V "simple switcher"  voltage regulator - for powering the microcontroller.

Here we see the full set-up.

The Driver Board, Toroidal Transformer and switched socket outlet complete the prototype Inverter.

So that was the state of play at 6pm this evening. I had the opportunity to connect my Weller soldering iron station to the output of the transformer. Off load the mains ac output was 240V dropping to 238Vrms when the soldering iron was plugged in. It used 24V dc at 1.15A from the 24V bench supply to power the iron.

More Testing Tomorrow.

Today I was lacking in  suitable 230V loads to try. Tomorrow I have a bunch of 240V 60W incandescent lightbulbs to try out and slowly push this inverter up in power output to characterise its performance and efficiency.

Monday, October 05, 2015

Open Inverter Part 4 - Getting the Draft Design into eagleCAD.

Designing the H-Bridge

In today's post, the fourth in this series I start to look at some of the practical aspects of the Open Inverter design - starting with the proposed H-bridge and the choice of driver IC and heatsinks.

Sketching out the schematic, choosing the components and laying out the pcb is about 2 days work.

A typical N-FET H-Bridge with HIP4082 Driver IC and current sensing

The circuit above shows a typical H-bridge arrangement based on N type FETs and a readily available H-bridge driver IC - the HIP4082 from Intersil.

The circuit will fit easily on a 5 x 5cm pcb and, with the correct choice of FETs and heatsinks, will handle 50V and 50A.

The FETs are in standard TO220 packages, and there is a wide range of FETs that could be selected depending on application.

The 2 main parameters for FET selection are the maximum Drain-Source voltage Vds, and the maximum Drain Current Id.

For safe operation at 48V it would be worth using FETs rated for 100V and 70A - such as the Infineon 70N10.

It should be noted that the gates of the FETs contain a pair of resistors and a diode. These components limit the current supplied to the gate, control the turn on time and the diode improves the turn-off time. They also help protect the driver IC from damage and oscillation of the gate drive outputs.

The bill of materials for this H-bridge will be around £10 in 1 off.

The Intersil HIP4082 is an economical way to drive an N FET, H-bridge
Above is the minimum circuit required for the HIP4082, this one with optional current sensing. The current sense resistor Rsh is typically 10 milliohm or even 1 mOhm - if you are working with high currents.

It should be a 1206 package or several resistors in parallel.  10A currents across 10 mOhm will dissipate 1W in the resistor. Special, precision metal current sense resistor links are available for this purpose.  The gain of the op-amp should be chosen to suit the full scale input of the ADC on the microcontroller (typically 3V3 or 5V).
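The shunt arithmetic above can be sketched in Python. The gain formula simply scales the full-load shunt voltage up to the ADC full-scale voltage; it ignores op-amp offsets and any headroom you might want to leave, and the function names are hypothetical.

```python
def shunt_voltage_v(current_a, shunt_ohm):
    """Voltage developed across the sense resistor (Ohm's law)."""
    return current_a * shunt_ohm


def shunt_power_w(current_a, shunt_ohm):
    """Power dissipated in the sense resistor: I squared times R."""
    return current_a ** 2 * shunt_ohm


def amplifier_gain(adc_full_scale_v, max_current_a, shunt_ohm):
    """Gain that maps the maximum shunt voltage onto the ADC full scale."""
    return adc_full_scale_v / shunt_voltage_v(max_current_a, shunt_ohm)


if __name__ == "__main__":
    print(shunt_voltage_v(10, 0.010))      # volts across 10 mOhm at 10A
    print(shunt_power_w(10, 0.010))        # watts dissipated in the shunt
    print(amplifier_gain(5.0, 10, 0.010))  # gain for a 5V ADC
    print(amplifier_gain(3.3, 10, 0.010))  # gain for a 3V3 ADC
```

At 10A the 10 mOhm shunt develops only about 0.1V, so a gain of roughly 50 would be needed for a 5V ADC, or roughly 33 for a 3V3 part.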

Typically the FETs will have about 10mOhm drain-source resistance when turned on.  It is this RDSon that is the major dissipation of power in the FET.  Remember that the power dissipation in the FET is the product of the square of the drain current Id and RDSon.

For example 20A passing through 10 mOhm will dissipate 4W in that FET, but 50A will dissipate 25W - and also 25W in the FET that forms the other active device in the H-bridge.  The heatsink must be capable of removing this heat without the die of the FET overheating.  A heatsink of about 3 degrees C per watt will be needed to safely handle these levels of heat dissipation.
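The conduction loss figures can be reproduced with a short Python sketch. The temperature-rise helper is a simplification of my own: it only multiplies the loss by the heatsink's thermal resistance, so it ignores switching losses, the junction-to-case and insulating-washer resistances, and the rise of RDSon with temperature.

```python
def fet_conduction_loss_w(drain_current_a, rds_on_ohm):
    """Conduction loss in one FET: Id squared times RDSon."""
    return drain_current_a ** 2 * rds_on_ohm


def heatsink_rise_c(power_w, thermal_resistance_c_per_w):
    """Heatsink temperature rise above ambient for a given dissipation."""
    return power_w * thermal_resistance_c_per_w


if __name__ == "__main__":
    for amps in (20, 50):
        loss = fet_conduction_loss_w(amps, 0.010)
        rise = heatsink_rise_c(loss, 3.0)
        print(f"{amps}A: {loss:.1f} W per FET, {rise:.0f} C rise at 3 C/W")
```

Because the loss goes with the square of the current, going from 20A to 50A multiplies the heat by more than six, which is why the heatsink choice dominates the mechanical design.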

This one from Farnell (below) is fairly economical, and both the upper and lower FETs can be bolted to it - provided that they are insulated from each other and from the heatsink.  The thermal performance is 2.6C per Watt provided the heatsink is placed vertically.

The heatsinks are quite chunky, and I have had to exploit the 17.02mm wide channel for the placement of the gate drive and current sensing components. Placing the two mosfets back to back can make for a reasonably compact arrangement.

Heatsink Farnell 1699368 has 2.6 C/W 

Putting it into Practice

First draft pcb layout in eagleCAD Size is 50mm x 50mm
The circuit at the top of this post has now been laid out on a 50mm x 50mm 2 layer pcb.  Special attention was paid to maximising the copper areas that carry the high currents. These were laid out as polygons rather than traditional tracks.  There are partial  ground planes on both the top and bottom layers and these are "stitched" together with many vias which help carry the large currents - and in the case of the voltage regulator help dissipate the heat from the topside to the bottom.

The layout proved to be quite tight, and a dual op-amp such as an LM358 might be preferable to the quad LM324 making a bit more room in the centre of the pcb.

Two heatsinks like the one shown in the drawing above are fitted. The heatsinks overhang the top and bottom edges of the pcb by about 10mm.

The outputs of the inverter (to the 24V- 230V transformer) are on the left.  The battery or solar panel inputs are on the right.

A 10 way connector, PL1 allows the "Arduino" to be connected.

The connector in the centre of the right hand side is to accept an external capacitor C1, as only 1000uF at 50V will fit on the board.

As the driver chip runs on 12V, a 78M12 regulator in a TO252 package is included. Maximum voltage for this regulator is 35V.

Sunday, October 04, 2015

OpenInverter - An Open Source, Micro-Solar Inverter - Part 3

A 12A 24V motor drive H-bridge pcb repurposed into a 64W micro-solar inverter
Further Thoughts

Welcome to Part 3 of this series of posts regarding the Open Inverter - a micro-solar inverter.

The H-bridge is fundamentally a 2 port, power control device, similar to its cousin the bridge rectifier. Instead of passive rectifiers the H-bridge consists of 4 semiconductor switches that can be controlled under firmware - and as such is much more versatile than the diode bridge. However, unlike the diode bridge, it has electrical symmetry - and so power can flow in both directions - and we can use this unique ability to our advantage. The H-bridge becomes a versatile power transformation device.

Consider the H-bridge to be like a black box, that in its simplest form has 2 ports A and B, to which power sources - or sinks, can be connected.

In the photo above, dc power  (15.949V 3.97A) from a 150W solar panel enters from the right hand side on the red and black wires.

It is then converted by the H-bridge under firmware control into a pwm 50Hz ac signal, that feeds the mains transformer - connected to the left hand side by white and black wires - and is then transformed up into 230V ac mains.

Thus a board designed for dc motor control has been repurposed into being a micro-solar inverter.

Depending on how the H-bridge is controlled, it can:

1. Transfer power in either direction between ports A and B,
2. Rectify ac to dc
3. Synthesise ac from dc
4. Transform dc up and down in voltage and current
5. Provide a variable impedance load -  for load matching and peak power tracking.

All from a $20 module - Wow!

As I document the project so far and put my thoughts down into words it's becoming increasingly apparent that a versatile H-bridge controlled by a low cost microcontroller has a multitude of uses amongst the hobbyist community - a few here:

Solar Inverters
Step Up (Boost) dc/dc converters
Step Down (Buck) dc/dc converters
Boost-Buck Converters
Split Pi Converters
Synchronous rectifiers
Solar PV - peak power tracking
Battery chargers - high efficiency charging of consumer electronics and portable PCs
Load balancing
DC Motor Control for pumps, solar trackers, machine tools, vehicles (bikes etc)

However, the basic H-bridge design can then be further extended to 4 or more ports. By adding an extra half-H Bridge,  3 phase applications become practical:

3 phase ac or brushless dc motor drives
Solar boost peak power tracking inverter

Furthermore, each half-bridge can be regarded as a power port - where power may be supplied or removed to/from the system.  This means that the H-bridge can be seen as a 3 port device - and in this mode it has applications in some bi-directional boost-buck dc/dc converter topologies - such as the split-pi converter. 

If it is made in a modular fashion that can be extended to cope with more sophisticated or power hungry applications - then it will be a lot more versatile.

So it sounds like the world is in need of a low cost, open source, versatile H-bridge power converter. If it can be made for under $20, it can appeal to a whole variety of price sensitive applications.

A Modular Approach

An H-Bridge Module like this, connected to an Arduino is very versatile

Today it is time to make some fundamental design decisions regarding the inverter power stage.

There are 2 main options:

1.  Traditional N-FETs with driver ICs.
2.  Half-Bridge ICs - such as the Infineon BTN8962

Whilst there are many mosfet driver ICs, only a few work at 30V, which is essential for a nominal 24V battery supply - e.g. the Microchip TC4431/32.

These are available in DIP for easy self-assembly.  The advantage with option 1 is that you can fit whatever FETs you have available - depending on your maximum voltage and current requirements.

Option 2 uses the BTN8962 or BTS7960 integrated driver and half-bridge ICs from Infineon.  These are available as ready made modules from China, at a price cheaper than they could be made here - and might appeal to some experimenters.

So, in order to make a design decision and to further the project, what I am going to suggest is a traditional FET board which is footprint compatible with the Chinese module.

Regarding the "Arduino" part of the design, this could be built on stripboard sized approximately 5cm x 5cm which is then stacked underneath the FET power module on hexagonal spacers.  Header connectors would connect up to the power module.  I believe that this would be a suitable option for home construction.

Additionally, I am looking at a pcb layout for the inverter's mcu section.  This would use the ATmega328 on a pcb sized about 5cm x 5cm so that it stacks below either the discrete FET board or the Chinese BTS7960 motor drive module.

5 x 5 cm is a good size as it allows for additional circuitry & connectors, a 5V regulator and 5x5 boards are very cheap from dirty-pcbs.com. Using the standard 5 x 5 or 10 x 10cm boards - these can be made stackable with as many power stages as required. In theory, an inverter could be built up, stage by stage, to allow for possibly 1000W.

Wireless Control and Monitoring.

For several years I have been using designs that incorporate a RFM 12B or RFM 69 wireless module. These modules make use of Jean Claude Wippler's JeeLib - a wireless protocol devised for communication between low cost wireless nodes.

JeeLib has been adopted by my friends at Open Energy Monitor - for communication between their wireless sensors and energy monitors - and a base station - often web connected, to their cloud based analysis and energy visualisation package, emonCMS.

By including a wireless module on the Open Inverter MCU board - it ensures emonCMS compatibility - allowing remote monitoring and control of the power transferred by the inverter -  using emonCMS.

As well as the micro-solar inverter, the Open Inverter boards could be used for pv peak power tracking, LiPo battery charging/monitoring, and dc/dc conversion for the various voltage outlets of the "dc ring main". They can be used anywhere that power is generated or converted and report back the individual power transfers to emonCMS. 

Saturday, October 03, 2015

An Open Source, Micro Solar Inverter - based on Arduino - Part 2

In the first part of this "Open Inverter" series, I described how Trystan and I had cooked-up a simple inverter based on a mosfet H-bridge, an "Arduino" and a 12V-230V mains transformer.

Before going into too much technical detail, (as I am still documenting it),  I first wish to explain why I think that the combination of microcontroller and H-bridge is an essential building block in modern power electronics, and the ability to efficiently transform ac to dc, dc to dc, and dc to ac are paramount to the renewable energy sector.

FETs capable of switching moderate power levels are available surprisingly cheaply. The ones we used in our inverter were under £1 each.  The driver ICs (IR2110 or similar) are a couple of quid each.

So, it's possible to make up the H-bridge stage and drivers for under £10 - and that includes some heatsinks.

However, the ubiquitous H-bridge is also available in the form of an IC - or rather 2 ICs - as most implementations appear to use half H-bridges.

Infineon make a range of these - intended for automotive motor control, and so can handle high currents, but generally at 28V maximum.  This makes them suitable for 24V battery systems.

The BTS7960 is typical of the Infineon range.  It has a maximum voltage of 28V, but with correct heatsinking can switch up to 43A.  Theoretically, a pair of these devices would be capable of running a 1kW inverter - but I would be happier in the 250W to 500W range.

It has over-temperature, over-voltage and current limiting built in. It also outputs an analogue signal proportional to the drive current - which can be used to monitor the performance.

Ebay is a good source for ready built modules containing a pair of BTS7960 devices.  I bought a pair of these for about £8 each.

A low cost BTS7960 H-bridge module from Ebay or TaoBao

The BTS7960 is also available as an Arduino shield - called the MegaMoto shield, which holds a pair of BTS7960 plus jumpers for easy selection of which pwm to drive them from. At £32 it's a bit overpriced, but handy if you are already working with an Arduino platform.

Other BTS7960 boards are available from Taobao - of varying design and quality - with or without heatsinks - but I consider adequate heatsinking to be essential. 

This application note for the BTN8962 - a newer, related family member - gives good details on how to get the best from these devices.

Making the H-Bridge Work for Us.

If you look at a typical FET H-bridge, you will see that each FET is bypassed with a reverse biased diode. This is sometimes called the body-diode, and it comes for free, as part of the process of implementing a  FET on the silicon substrate.  It is tremendously important in protecting the FET from inductive switching over voltage spikes - as it returns them safely to the supply rails, but can also work in our favour in allowing easy implementation of rectifier and boost converter topologies.

A half bridge like this one may easily be turned into a boost converter by supplying dc power into the terminal marked OUT, via a series inductor, and extracting the boosted voltage from the terminal marked VS.

Conversely, a buck converter can be made by applying power between VS and GND, and extracting a reduced voltage between OUT and GND - again through a series inductor.

So two of these half bridges, a couple of inductors and you have all the makings of a boost- buck dc/dc converter.
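To put some numbers on the boost and buck arrangements just described, here is a small Python sketch using the standard ideal relations for continuous inductor current and lossless switches. Real converters will fall short of these figures, and the voltages used are only illustrative.

```python
def buck_output_v(vin, duty):
    """Ideal buck converter: output is the input scaled by the duty cycle."""
    return vin * duty


def boost_output_v(vin, duty):
    """Ideal boost converter: output rises as the duty cycle approaches 1."""
    if not 0 <= duty < 1:
        raise ValueError("duty cycle must be in the range 0 <= D < 1")
    return vin / (1 - duty)


def boost_duty_for(vin, vout):
    """Duty cycle an ideal boost stage needs to lift vin up to vout."""
    return 1 - vin / vout


if __name__ == "__main__":
    print(boost_output_v(12.0, 0.5))   # 12V boosted at 50% duty
    print(buck_output_v(24.0, 0.5))    # 24V bucked at 50% duty
    print(boost_duty_for(12.0, 24.0))  # duty needed to double the voltage
```

In firmware the duty cycle is just the PWM compare value divided by the timer period, so the same half bridge changes role purely by which terminal is treated as the input.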

Why would you first boost a voltage, only to buck it down again?  Well if it was the varying output voltage of a solar panel which drifted depending on clouds - it would be handy to boost it so as to properly charge a battery pack at the correct charge voltage.  Then you might want a stable 12V or 5V supply for powering some equipment - in which case you would buck the voltage down again.  

This ability to transform dc power up and down, with high efficiency, or match the varying dc output of a pv panel - so as to capture peak power, is very important - and it can all be done with the H-bridge controlled by an 8-bit Arduino. 

The Tasks of the Arduino.

In the last 10 years, Arduino has become a familiar and accessible microcontroller platform.  Even though it is only an 8-bit, 16MHz device, it can still be used to great effect in power electronic applications.

We built up "breadboard Arduinos" which closely follow Cefn Hoile's Shrimp design. Essentially an ATmega328 IC with crystal, reset and FTDI cable header.

Shrimp - a minimal breadboard "Arduino" - by Cefn Hoile

The Arduino has to generate complementary PWM in order to drive the H bridge. In some applications independent PWM channels may be needed to control each side of the H-bridge separately.

In addition to pwm generation, the Arduino should also monitor current, voltage and load regulation.

By making use of "Fast PWM" and "Fast ADC" on the Arduino, the ATmega328 can achieve quite a lot of control whilst generating the sinusoidal pwm.

Our first task on the Arduino is to generate a sinusoidal signal using "Fast PWM".  

For those who are eager to experiment, I have created a Github Gist containing the sinusoid pwm generation sketch.  This produces complementary pwm on digital pins 3 and 11 - just what you need for driving H-bridges.
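For anyone who wants to see what an 8-bit sine table looks like before loading the sketch, here is a rough Python illustration. It is not the code from the Gist: the table length of 256, the amplitude of 127 and the mid-point of 128 are my assumptions, and the complementary channel is modelled simply as 255 minus the duty value.

```python
import math


def sine_table(n=256, amplitude=127, offset=128):
    """One full sine cycle as n unsigned 8-bit PWM duty values."""
    return [offset + round(amplitude * math.sin(2 * math.pi * i / n))
            for i in range(n)]


def complement(duty):
    """Duty value for the opposite half bridge in complementary PWM."""
    return 255 - duty


if __name__ == "__main__":
    table = sine_table()
    print(table[:8])                      # first few duty values
    print(min(table), max(table))         # range stays inside 0..255
    print([complement(d) for d in table[:8]])
```

A table like this can be pasted into a sketch as a constant array, with the firmware stepping through it at a rate that gives 50 complete cycles per second.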

To test this routine, make a low pass filter from a 10K resistor and 100nF capacitor and attach to either digital Pin 3 or 11. This will reconstruct the sine wave from the digital pwm waveform - and give you a scope trace similar to that at the start of this post.
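It is worth checking where that RC filter cuts off. The sketch below uses the usual first-order formula, f = 1 / (2 * pi * R * C); the function name is made up for illustration.

```python
import math


def rc_cutoff_hz(resistance_ohm, capacitance_f):
    """-3dB corner frequency of a first-order RC low pass filter."""
    return 1.0 / (2 * math.pi * resistance_ohm * capacitance_f)


if __name__ == "__main__":
    fc = rc_cutoff_hz(10e3, 100e-9)   # 10K and 100nF
    print(f"cutoff: {fc:.1f} Hz")
```

The corner comes out at roughly 159Hz, which passes the 50Hz fundamental while heavily attenuating a PWM carrier in the kHz range.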

In the next part - I will have the schematics for the FET version of the inverter for eagleCAD. In the meantime I encourage readers to try and get the Arduino or Shrimp to produce a 50Hz sine waveform.


A Micro Solar Inverter - based on Arduino - Part 1

Bothy-Hack - A Micro-solar inverter based on Arduino

About once a year, I get the opportunity to spend some time with my friends from @openenergymon in North Wales.  This year, having attended the oshCamp in Hebden Bridge last weekend, I took advantage of the glorious late September sunshine to cross over to southern Snowdonia, to a rural bothy outside the village of Llanfrothen, to spend a few days working on some new projects with Trystan Lea.

Trystan had expressed an interest in building from scratch, a low power inverter.  This would take the dc output from a low-wattage solar pv panel and create a stable 50Hz, 230Vac mains - suitable for powering small items of equipment. So after a couple of beers and some tech discussion over a pub meal on Monday evening we set about beginning our micro-solar inverter project.

Open Source - Easily Built, Easily Repaired

There have been several inverter designs published on the web, but they are either crude, square wave or modified square wave and based on beefy bipolar transistors.

The intention was to make the design easily accessible to others, using familiar and easily sourced components - available to hobbyists everywhere. The project was going to be open sourced, hopefully with professional pcbs coming a bit later - so that others could follow our work.

We wanted to make a design that uses readily obtainable N-type FETs and an Arduino (more strictly an ATmega328P-PU on a breadboard) to generate the PWM signals and provide simple circuit protection, and load sensing.  With the PWM signals generated in firmware it can easily be modified for 50Hz or 60Hz operation, either 115V or 230V operation and a wide range of battery input voltages.

We imagined that the final design could consist of an Arduino, an "Inverter Shield"  containing FETs and driver ICs configured in a H-bridge and some voltage and current monitoring circuits.  To make the inverter, a 12V or 24V battery (or PV panel) and a 12V (or 24V) toroidal transformer would be added.

As we really only had 2 days to work on the design, we decided to make a simple proof of concept prototype, which could later be refined.

We are happy to receive suggestions from the wider community  - in the hope that the basic design will evolve into an efficient unit.


The steps of the primary project were planned as follows, the time available was about two and a half days:

1.  Use a breadboard "Arduino" to generate the 50Hz sinusoidal pwm waveforms needed to drive the FETs.

2.  Breadboard the FET driver ICs and the 40A  55V FETs for initial testing with a 4VA step up transformer.

3.  Build up the FETs on stripboard - with substantial current handling tracks and heatsinks.

4.  A series of tests with different ac loads, with both 12V battery and pv input power.

5. Documentation and blogposts.

A secondary project was to build a simple energy monitor - again using a "breadboard Arduino" which would measure the dc output of the pv panel, and allow us to perform efficiency tests on the micro-solar inverter.

As a fall-back position, I had brought along a dc motor driver board I have been developing at work that uses an ARM Cortex M4 processor and a 12A  24V H-bridge.  I wanted to have a go at repurposing this board to make a simple 50Hz inverter (and succeeded!).


Step 1 was fairly quick to achieve, because I already had some Arduino code to generate an 8-bit sine waveform, using "Fast-PWM" - which appears as complementary pwm outputs on Arduino Digital Pins 3 and 11.

Trystan had already built up a "breadboard Arduino" - so it was relatively simple to program this with a FTDI cable, and then test the outputs for frequency, using a low pass filter to reconstitute the sine waveform for the oscilloscope.

The next task was to build up the 4 FETs that form the H-bridge onto a breadboard, and wire them up to the IR2110 driver ICs.  These ICs produce a level shifted drive waveform, so that N-FETs can be used in the upper arms of the H-bridge - with reduced on resistance and therefore improved switching efficiency.  The driver ICs are designed to supply the high currents - both source and sink, required to turn the FETs on and off - quickly.

Breadboard construction is not ideal for building fast switching power electronics, and getting the driver ICs to work reliably was probably the biggest challenge of the project.

However, by 9:45pm on the first night, Trystan had the inverter running and lighting an ac powered LED bulb.

The LED lamp in Trystan's right hand was first signs of a working inverter!

The ac waveforms on the scope were not in great shape, so we added a couple of 0.33uF 250V capacitors, connected in series across the ac output - that cleaned things up a lot!
Scope waveform  - not great at first
Until we added 0.33uF capacitors across the mains output
Even with a 4VA step up transformer - the LED lamp was ultrabright!

So we retired at the end of Day One - with a working inverter, and the plan to characterise it and improve it on Day Two.

In the next part, I look in more detail at the design and performance.  Later there will be links to the schematics,  pcb layout and the Arduino code used to drive the FETs.

Micro Solar - The Future is Bright

Micro-Solar - A New Approach to Increasing the Installed PV Capacity


Recent changes in the solar grant scheme have had a very negative effect on the UK's fledgling solar pv industry. The rug has been pulled out from under the feet of all those that set up installation businesses. The FITs have been decimated - and so now there is virtually no reason why anyone would make a large, long term investment in a permanent pv installation.

Feed in tariffs (FITs) have been reduced to the point where there is no longer an incentive to invest in a photovoltaic solar installation.  Our current Government seem to be more interested in kickstarting the UK fracking industry and selling off large sections of our critical electricity generation infrastructure to the Chinese.

The reduction in FITs to just 1.63p/kWh in January 2016 is going beyond miserly, and at such levels completely removes any economic reason to export to the grid.

Without the grid export option, the pv output can only be used locally, and if used efficiently will help reduce the amount of power consumed from the grid.

In this post, I propose a new way to look at solar pv, in a way that could appeal to a much greater customer base, and at a price that is much more affordable.

A single 250 W pv panel could reduce your electricity bills by up to 10%, and if repeated by millions, rather than tens of thousands of consumers around the country, it would significantly increase the installed solar capacity.

There are many potential customers for micro-solar: those who can afford an expenditure of up to £1000,  but not the £6K - £10K for a full system.  With the FITs gone, the market for larger systems will have evaporated.

Sources of PV Panels

A recent web-search revealed small solar systems being made and marketed in China for about £750, complete with inverter, battery and controller.

If you choose to shop around on Taobao (Chinese ebay) you can find 250W panels for about £60 each. Even if these prices doubled by the time you had shipped them to the UK, that would still be only about £0.50 per peak watt.

If you want to search for your own panels - the Chinese term is 250W 太阳能板  - Happy Hunting!


A unit from the grid (Southern Electric October 2015) is currently 14.0385p (incl VAT). My annual consumption is approximately 2400 kWh.

A small system consisting of 4 x 250Wp panels would yield between 850 and 900kWh per annum, displacing about 35% of my grid consumption.

If we think that 800kWh of this can be used in the home, it could reduce the incoming electricity bill by
£112 per year.

Importing 250W panels from China at £0.50 per peak watt, means that the system could recover its costs in 5 years - without a grant or FITs, nor the additional expense of a grid-tied inverter and professional MCS installation.
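The payback sum can be laid out in a few lines of Python. This sketch counts only the cost of the panels at £0.50 per peak watt; mounting hardware, wiring, any dc/dc converter or battery, and changes in the unit price are all left out, so treat it as a best case.

```python
UNIT_PRICE_GBP = 14.0385 / 100   # price per kWh, incl VAT
PANEL_COST_PER_WP_GBP = 0.50     # imported panel cost per peak watt


def annual_saving_gbp(self_used_kwh, unit_price_gbp=UNIT_PRICE_GBP):
    """Yearly bill reduction from solar energy used in the home."""
    return self_used_kwh * unit_price_gbp


def payback_years(system_wp, self_used_kwh):
    """Simple payback: panel cost divided by the yearly saving."""
    cost = system_wp * PANEL_COST_PER_WP_GBP
    return cost / annual_saving_gbp(self_used_kwh)


if __name__ == "__main__":
    print(f"saving: £{annual_saving_gbp(800):.2f} per year")
    print(f"payback: {payback_years(4 * 250, 800):.1f} years")
```

On those assumptions the four panels cost £500 and pay for themselves in about four and a half years, which is in line with the round figure of 5 years once a few extras are allowed for.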

The Case for Microsolar

I will define a Micro-Solar system as an installed system of 1000Wp or less.  With currently available panels this would consist of up to 4, 250Wp panels.  These panels are typically 1.6m x 1m and weigh about 21kg.

The emphasis is that microsolar should be small, cheap and portable.  Portable in the sense that it's not a permanent installation, can be deployed on a wall mounted bracket, especially if roof access is not available, and can be moved from property to property as required - a benefit for young adults in the rented sector. As it is a small system, typically it would be a DIY installation, not requiring specialist tools or equipment. One attractive use would be on a south facing balcony, with the panel securely clamped to the balcony rail.

As the system is small, it is essential to get the best use from it, and the best conversion efficiency. This will entail the use of high efficiency dc/dc converters and LiFePO4 batteries used for energy storage.

Whilst we are all familiar with ac mains electricity, and almost all of our appliances and consumer electronics products are intended for ac plug in use - for some products the ac is an inconvenience, and a huge source of inefficiency. With the rise of portable computing products, smart phones and other mobile gadgets - these increasingly require a 5V charge - from a standard "USB" style charger.

If you are interested in USB chargers - this excellent post covers them in great detail, in terms of efficiency, power quality and safety.

The underlying message is that small ac powered USB chargers are at best only 65 to 80% efficient, and this is a figure that has a lot of room for improvement.  Direct dc/dc conversion could improve this considerably.  Starting with a 12V input, this can be converted with 92% efficiency to 5V using a synchronous switching converter - such as this one from ON Semiconductor.

With the increase in electric bike technology - LiPo battery packs are available with high capacities and at low cost.  The real benefit of Lithium battery chemistry is that it has a very high charge efficiency. Packs of welded lithium cells are typically available in 350Wh to 1kWh capacities.

What can you power with 1kWh per day?

Several years ago, when I was working from home, I mused on the idea of a home-office workspace that ran on just 200W. At the end of that post, I speculated that perhaps even 100W was possible. In the 10 years since that post we have had the benefit of  more power efficient laptops and netbooks, LCD monitors, LiPo batteries and LED lighting. I now believe that I can run the same work environment on an average power budget of  just 100W - and that puts home working well within the reach of a microsolar installation.

Whilst thinking about the e-bike battery packs, it occurred to me that a lot more people are cycling these days, particularly within our cities.  An e-bike typically consumes about 20Wh per mile, and so a 36V 10Ah pack (360Wh) could offer a range of about 15 to 18 miles between charges - depending on terrain and how much you pedal.  By using several e-bike battery packs as the basis of the modular battery store, it is possible to have a freshly charged pack every morning. Three or four interchangeable packs would form the system, so on any day there is always about 1kWh of storage.
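For anyone wanting to try other pack sizes, the range estimate is just two integer calculations. The sketch below assumes the pack is rated at 36V and 10Ah (so 360Wh) and uses the 20Wh per mile figure quoted above; the function names are hypothetical.

```c
/* Pack energy in watt-hours from nominal voltage and amp-hour rating. */
unsigned pack_energy_wh(unsigned volts, unsigned amp_hours)
{
    return volts * amp_hours;
}

/* Range in whole miles for a given consumption in Wh per mile.
   Pedalling and terrain are not modelled. */
unsigned range_miles(unsigned energy_wh, unsigned wh_per_mile)
{
    return energy_wh / wh_per_mile;
}
```

A 36V 10Ah pack works out at 360Wh, or 18 miles at 20Wh per mile, and three such packs hold 1080Wh, which is where the figure of about 1kWh of storage comes from.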

Having arrived back home after a day at the office, the battery is fully charged and ready for the evening's use.  This would involve recharging of portable computing devices and smart phones, and running LED lighting in the evening.

A quick check on the specs of a 43" Samsung smart tv - showed a 51W consumption - so 6 hours of TV gaming or web-browsing in the evening is going to be well within the capability of a microsolar system.

What about the Winter?

Microsolar systems will run at much reduced capacity during the Winter months.

The graph below shows monthly output, for south facing panels, in southern England, averaged over 3 years (2012-2014) - scaled to reflect the output of a single 250 Wp panel.
Normalised output for a 250Wp panel
It shows that useful output is available from March until September - approaching 1kWh per day, but for 4 months of the year, you are getting only about a third of the summer peak.  If you are reliant on the solar contribution to charge your bike pack, then you will need to find an efficient means to recharge your pack from the ac mains during the winter months.

Fortunately, the designers of switched mode power supplies have made significant advances over the years in improving power supply efficiency. This is particularly important in server farm and telecoms applications where efficiency and reliability are paramount.  These efficiency improvements have filtered down to PC power supplies - and you can now buy a desktop PC supply with better than 95% efficiency.

Combining a high efficiency psu with a LiPo battery bank, means that you can top up your store when required, at maximum efficiency. It also helps maintain your output on cloudy days.

In summary, all the components for a high efficiency microsolar system are available and affordable.

In the next post I will go into some more detail of the proposed system.

Thursday, October 01, 2015

A Glorious Week in September

The First Rays of October Sunshine Bathe The Llanfrothen Bothy

This week I have been off work and attending various events around the country.

The weather has been exceptional - I am so lucky to have picked a week off work with such fine sunny autumnal weather.

Last Thursday I went to an ARM Cortex M7 programming course near Cambridge.

On Friday, I hung out with Andrew Back and Omer Killic in Hebden Bridge and discussed plans for a new video frame store and imaging system for his 1985 Cambridge Instruments Scanning Electron Microscope.

Saturday was oshCamp 2015 (Open Source Hardware Camp)  - 2 days of tech talks and workshop sessions at Hebden Bridge Town Hall. This was the first event of the week long @wutheringbytes Digital Festival held in Hebden Bridge.

On Monday I attended the morning session of "Open  for Business" and then spent the early afternoon teaching a 75 year old pensioner the basics of programming Arduino.

In the late afternoon I drove to North Wales to a remote rural bothy near the village of Llanfrothen.  I then spent nearly 3 days developing a micro solar inverter with good friend Trystan Lea of @openenergymon.

I have returned to Redhill, Surrey and expect to return to work tomorrow for a well deserved rest.

More posts about the above to follow.

Monday, September 21, 2015

How SIMPL Can You Get?

In the last post, I explained how I had slimmed down the kernel of SIMPL - at the same time removing much of the Arduino specific code so that it would fit into 2Kbytes of Flash plus a lower RAM requirement.  This also makes it highly portable to other microcontrollers.

My intention was to be able to put an image of SIMPL onto any microcontroller target system that I happened to be working on at the time - and give myself a friendly, predictable environment with which to exercise the hardware. In some cases, SIMPL could even be loaded into the bootloader space of a processor - so that it was always accessible.

SIMPL fundamentally allows interaction with the microcontroller, because of its interpreted nature. The interpreter is flexible enough to form the basis of a series of simple tools, such as cross assemblers, debuggers and simulators. It is whatever you want it to be - you have absolute control over what action the cpu performs in response to your key strokes.

Kernel Functionality 

SIMPL communicates with a PC using a serial UART interface.  It can be driven from any terminal application.

It really only needs getchar() and putchar() routines that interface with the on-chip UART.

These, together with a printnum() function which prints out a 16 bit unsigned number, are all that is needed to communicate with the PC in a meaningful manner.  It's old-school, but it works - and is easy to set up on almost any microcontroller or SoC device.
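To give a feel for how little is needed, here is a minimal sketch of a printnum() built on nothing but putchar(). It is not the routine from the SIMPL source; the helper u16_to_decimal() is a name I have made up so that the digit conversion can be tested without a UART.

```c
#include <stdio.h>
#include <stdint.h>

/* Convert a 16 bit unsigned value to decimal text in out[] (6 bytes).
   Returns the number of digits written, excluding the terminating NUL. */
int u16_to_decimal(uint16_t n, char out[6])
{
    char tmp[5];
    int len = 0;
    int i;

    do {                            /* peel off digits, least significant first */
        tmp[len++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);

    for (i = 0; i < len; i++)       /* reverse into the caller's buffer */
        out[i] = tmp[len - 1 - i];
    out[len] = '\0';
    return len;
}

/* printnum(): send the decimal digits out through putchar(). */
void printnum(uint16_t n)
{
    char buf[6];
    int i;
    int len = u16_to_decimal(n, buf);

    for (i = 0; i < len; i++)
        putchar(buf[i]);
}
```

On a bare microcontroller putchar() would be the routine that writes one character to the on-chip UART, so nothing from the rest of the C library is pulled in.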

SIMPL is a low overhead program - a kind of interactive tiny OS, that only takes a few Kbytes, yet provides all the means of accessing and controlling the micro.

A brief list of functionality.

The digital I/O is limited to writing to or reading from a single I/O pin. In most cases this will be one that supports an LED.  The I/O functions can be extended to whatever is needed by the application - for example, in one application, an LED chaser display, I needed to write a 13 bit number to an array of LEDs, each connected to an output pin of the Arduino.

Analogue input (ADC) and PWM output functions may be enabled if required - but these will add approximately a further 600 bytes to the kernel code.

The kernel uses the delay() and delayMicroseconds() functions to allow accurate timing of I/O operations. With these the microcontroller can generate pulse sequences (up to 100kHz on Arduino), generate musical tones, sound effects or animate LED displays.

As well as the functions that interact with the hardware peripherals, SIMPL also has a range of arithmetic and bitwise logic operators to allow simple integer maths and logical decision making.

There is a simple looping function which permits a block of code to be repeated - up to 32K times.

Recently added functions allow the printing of strings and the manipulation of ascii characters.


On top of the 2K kernel core is some further code which allows the user to define their own functions and store them in RAM.  Up to 26 user functions can be defined under the current system.  It's not exactly Forth - but borrows a whole lot of ideas from that remarkable language.

The system could be extended to include SPI or I2C functions to exercise specific peripheral chips or access a microSD card for program storage.

One of my designs "WiNode" is an Arduino compatible target board but with 433MHz wireless module, external SRAM,  RTC, motor driver/ speaker driver, and microSD card. SIMPL may be used to exercise and interact with all of these peripherals.

32bit Math Extensions

This was remarkably easy to implement. By retyping the x and y variables as long - it forced all of the arithmetic routines to 32 bit.  Whilst this pushed the code size up by about 900 bytes - some of this was offset by rewriting the printnum() function as a 32 bit printlong() and deleting the original printnum().  The code now stands at 4826 bytes and can be found on this GitHub Gist.

This means that SIMPL can do 32 bit integer maths - straight from the can.

Whatever Next?

SIMPL has been an ongoing project for over 2 years, and as it has developed - so have my C coding skills.  As the code becomes larger, things become easier  - as the switch from 16 bit to 32 bit integer maths has proven - it was literally a 10 minute hack.

I am very aware that SIMPL is not yet a fully fledged language - it can flap its wings and tweet a bit - but is not ready to leap out of the nest and fly. Perhaps I am a bad mother bird - too eager to experiment with new ideas rather than concentrate on the basics. Time will tell.

I have ported SIMPL to STM32Fxxx ARM microcontrollers and seen a 25X increase in speed.  Now there are ARMs that run at 240 and 300MHz  (Atmel SAM E7) that will give even more of a performance boost.

The final intention is to create a SIMPL virtual machine (SVM) that can be hosted on nearly any micro - including FPGA soft core stack processors - such as James Bowman's J1b. With these we hope to see a large leap in performance.

In the meanwhile, I still have an Arduino plugged into my laptop - as my preferred development platform - if it will run on Arduino - it will run on everything else a whole lot better!

Next Time - more uses for the SIMPL Interpreter.

A Closer Look at the SIMPL Interpreter

Keeping it SIMPL

Since May 2013, I have been slowly developing a tiny interpreted language that can be used to initialise and exercise hardware when developing with a new processor.

SIMPL is primarily intended to be a very low overhead language, requiring only a serial uart  (or bit banged serial) for communication to a PC hosted terminal program.

Commands are in plain, human readable ascii text - with an emphasis on being easy to remember.

SIMPL is based on Ward Cunningham's Txtzyme interpreter - originally for Arduino - but ported onto several other microcontrollers - as it is written mainly in C.

The kernel or SIMPL interpreter needs only a few resources:

2K bytes of program memory (Flash)
35 bytes of RAM
UART  getchar and putchar functions
microsecond delay
millisecond delay

On the Arduino these delays are provided by the delay() and delayMicroseconds() functions but can be provided with simple delay loops.

Once you have this 2K of code on-board, you can then start to add more functionality - tailored to your particular application.

Slimming Down the Interpreter Kernel.

As originally written, Ward Cunningham's Txtzyme compiles to 5032 bytes of flash and 209 bytes of RAM. (The exact number of compiled bytes may vary depending on which version of the Arduino IDE you are using.)

As it made use of several of the high level functions available in Arduino - such as Serial.print, digitalWrite etc,  -  it was certainly not optimised for codesize.

I rewrote and enhanced the interpreter - so that now it fits into just short of 2048 bytes, and is written in more generic standard C for easier porting to other processors.

I have also added more functions including arithmetic, bitwise logic and memory operations.

I am sure that if the routines were handcoded in AVR assembly language, that further reductions in codesize could be achieved. However, I wanted a useful kernel that would fit in 2K and was easy to understand.

I have placed the SIMPL kernel here as a Github Gist.

Growing the Kernel

It has long been my intention to make SIMPL an extensible language, and so for this approach I have chosen to use some of the ideas used in Forth.

The kernel can easily be extended from some 30 basic functions to about 85, just by extending the switch/case statement that forms the basic subroutine calling mechanism at the heart of the kernel.

In keeping with Charles Moore's philosophy of "Problem Oriented Languages"  the kernel of SIMPL may be extended in whatever way is needed for solving the problem, and should as such be considered to be a minimum common starting point - for any cpu.

Once the 2K core of the kernel was established, it was time to add in the extra functionality that allows users to add their own functions.  This is done in the spirit of Forth - but with certain limitations to keep the code size down.  However, with the added functionality - the code grew from 2Kbytes  to 3982 bytes.  The main difference is in the amount of RAM that is used - the extra code allocates a User RAM array of 1248 bytes.

If you would like to look at the code and try it out on an Arduino - I have created a Github Gist here.

If you are using a standard Arduino with the LED on Pin 13 change line 64 to:

int d = 13;          // d is used to denote the digital port pin for LED operation

As this is a work in progress - more details will emerge in a later post.

Monday, September 07, 2015

A Simple Assembler for the J1 Forth Processor


I am currently exploring the use of soft core processors implemented in FPGAs, with a view to developing an image capture, process and display unit for a friend's scanning electron microscope.

Earlier this year I bought a Papilio Duo FPGA board and a computing shield from The Gadget Factory.  Some early experimentation with the ZPUino soft processor - programmed using the Arduino language, gave a taste for what could be achieved using soft core cpus and accompanying VGA hardware defined within the FPGA.

Very recently, James Bowman has released his J1b Forth processor implemented on a Papilio Duo and computing shield.  It is this processor core that ultimately I want to use.

However, there is quite a steep learning curve: I need to learn not only FPGA programming in Verilog and VHDL, but also the J1 instruction set and the Forth language - to the point where I can program my intended application.

This is quite an ambitious journey for me - an opportunity to stretch my skill set and programming abilities. As with any long journey, it starts with the smallest steps.

This week I am looking at how the J1 instruction set works with the help of a simulator and an assembler, both written in C and ported onto either an Arduino or STM32F407.

With these simple tools, I hope to learn enough about how the J1 executes its native machine language, to the point where I can implement a small Forth-like language.


Inspired by Frank Carver's blog Raspberry Alpha Omega, I decided to dig up some of the work I did earlier in the year - to create a tiny Forth-like language that would run on virtually any processor.

My ambition is to have a language nucleus that resides in about 2K of memory which provides a means to debug and bootstrap an application with limited tools or resources. I imagine it as a common core, which can be accessed via a serial UART, and can be used right from the start of a project and form a foundation onto which an extendable application can be built.

Whilst the usual image of code development is typing into a text editor or IDE and then compiling before flashing the machine code into the microcontroller, my plan is to take a huge chapter out of Charles Moore's book and make my language interpreted and have the means to compile and edit code right there on the microcontroller itself.  Indeed very Forthlike.

However over the years and through the various ANSI standardisation processes, Forth has become large and bloated - and that was never what Chuck Moore intended or wanted.  So I am going to pick and choose from the characteristics of Forth, and come up with something very much simpler.

The plan is to have a compact language kernel which resides on the microcontroller - regardless of whether it is an AVR or ARM  or a specialist stack processor burned into an FPGA.  In each case, it will present me the same user interface and experience - for low level hacking or code development.

From a hardware developer's perspective, every microcontroller I work on needs to have the means to print to a terminal and waggle a port pin - right from the get-go.

However, this language need not just be for human interaction. As the commands are very compact, they lend themselves to being packetised, and sent from machine to machine by whatever appropriate communications channel - be it wireless, BLE, TCP/IP or 140 characters at a time via SMS or Twitter. It also allows a microcontroller such as that on the Raspberry Pi, to communicate with other task specific hardware - solely using a UART connection - the speed of control and interaction is not restricted to the speed of a few characters a second that a human can type.

Creating a Virtual Machine

For all this to work, we need to establish a virtual machine on the chosen microcontroller. The virtual machine could initially be coded in C, to run on the target, but later it can be created as a specialist soft core processor on a FPGA.  On the Arduino, the virtual machine codes into about 2Kbytes  - or 3K when you add the Serial.begin() function for UART output.

Once the virtual machine has been installed it will happily execute its way through the memory on its own, until it crashes or is reset. The challenge now becomes writing the low level inner interpreter application code in the assembly language of the virtual machine. This step is something that I will put off until I have generated a means of creating and assembling the language.

Txtzyme Revisited

To create an assembler, I am going to use the tiny Txtzyme interpreter, written by  Ward Cunningham,  which was the original inspiration for this project.  It allows very basic parsing of a text buffer and then performs one of a series of function calls depending on the character typed, or read from the buffer. Numerical characters are converted into an integer and placed in a variable x.
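The parsing scheme just described can be sketched in a few lines of C. This is not Ward Cunningham's code: interpret(), out_vals[] and out_count are names invented for the illustration, and the only command handled is p, which here records x in an array instead of printing it so that the sketch runs without a UART.

```c
#include <stdint.h>

#define MAX_OUT 8

/* Values "printed" by the p command are collected here. */
static uint16_t out_vals[MAX_OUT];
static int out_count;

/* Walk a NUL terminated command buffer one character at a time.
   Digits are accumulated into x; any other character selects an action.
   Returns the final value of x. */
uint16_t interpret(const char *buf)
{
    uint16_t x = 0;
    char ch;

    out_count = 0;
    while ((ch = *buf++) != '\0') {
        if (ch >= '0' && ch <= '9') {
            x = 0;
            buf--;                          /* re-read the first digit */
            while (*buf >= '0' && *buf <= '9')
                x = (uint16_t)(x * 10 + (*buf++ - '0'));
            continue;
        }
        switch (ch) {
        case 'p':                           /* "print" the current x */
            if (out_count < MAX_OUT)
                out_vals[out_count++] = x;
            break;
        default:                            /* unknown characters are ignored */
            break;
        }
    }
    return x;
}
```

Feeding it the text 12p345p records 12 and then 345. Turning this into an assembler is then a matter of adding further cases to the switch, each of which builds an instruction word from x.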

For a simple implementation of an assembler using txtzyme, let's consider that the instruction word consists of 4 fields

Class                 Class Field

Literal               0x8000
Jump                 0x0000
Jump if zero      0x2000
Call                   0x4000
ALU                  0x6000

For the first four of these - the assembler just needs to OR the class field with the literal number or the target address.  It may be worth ensuring that the literal is constrained to 15 bits and the target address is constrained to 13 bits.
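As a minimal sketch of that OR-and-mask step (the function and macro names are my own, not taken from the assembler source), the four non-ALU classes could be encoded like this:

```c
#include <stdint.h>

/* Class fields from the table above. */
#define CLASS_LIT   0x8000u
#define CLASS_JMP   0x0000u
#define CLASS_JPZ   0x2000u
#define CLASS_CALL  0x4000u

/* Literal: value constrained to 15 bits. */
uint16_t asm_literal(uint16_t value)
{
    return (uint16_t)(CLASS_LIT | (value & 0x7FFFu));
}

/* Jump, conditional jump and call: target constrained to 13 bits. */
uint16_t asm_jump(uint16_t target)
{
    return (uint16_t)(CLASS_JMP | (target & 0x1FFFu));
}

uint16_t asm_jump_if_zero(uint16_t target)
{
    return (uint16_t)(CLASS_JPZ | (target & 0x1FFFu));
}

uint16_t asm_call(uint16_t target)
{
    return (uint16_t)(CLASS_CALL | (target & 0x1FFFu));
}
```

For example asm_literal(0x20) gives 0x8020 and asm_call(0x1234) gives 0x5234. Masking silently truncates an out-of-range operand; a friendlier assembler might report an error instead.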

We will use the following characters to define the instruction class

#  literal
j   unconditional jump
z  conditional jump when T=0
:   call

For the ALU instruction there are 4 more sub-fields to populate depending on the nature of the instruction.

1.   ALU op-code
2.   Transfer field
3.    Pointer field
4.    Return field

ALU op-code

The 2nd nibble of the instruction word controls the ALU.  Its 16 instructions are decoded thus:

0       t          NOP
1       n         COPY   (T=N)
2       +        ADD
3       &       AND
4       |         OR
5       ^        XOR
6       ~        INV
7       =        T = !(T == N)  Sets T to status of EQ flag
8       >        T= !(N < T)    Sets T to status of GT flag  
9       /        RShift
A      d        DEC     (T= T-1)
B      r      r-fetch
C      @       fetch
D      *        LShift
E      d        depth (shows dsp +1)
F      u        U<

So we have about 20 instructions - that fall into one of 5 categories

LIT   - load the included 15 bit literal onto the top of the stack
CALL  - call the subroutine at the enclosed 13 bit address
JMP     - non-conditional jump to the enclosed 13 bit address
JPZ      - conditional jump - only if the top=0
ALU    - ALU and stack operations

Literal Instructions take the form 8xxx   (in hex)
Jumps                                           0xxx or 1xxx
JPZ                                               2xxx or 3xxx
Calls                                             4xxx  or 5xxx
ALU                                             6xxx  or 7xxx if you include the "return"

So the plan is to adapt the txtzyme interpreter to convert text input into machine language in the form of the 16 bit instruction words.
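Going the other way, the hex patterns listed above are enough to sort any 16 bit word back into its class. The following is a small sketch with hypothetical names, handy for checking the assembler's output:

```c
#include <stdint.h>

typedef enum { INSN_LIT, INSN_JMP, INSN_JPZ, INSN_CALL, INSN_ALU } insn_class;

/* Classify an instruction word from its top three bits. */
insn_class classify(uint16_t insn)
{
    if (insn & 0x8000u)
        return INSN_LIT;            /* bit 15 set: 15 bit literal */

    switch (insn & 0x6000u) {       /* bits 14..13 select the class */
    case 0x0000u: return INSN_JMP;  /* 0xxx or 1xxx */
    case 0x2000u: return INSN_JPZ;  /* 2xxx or 3xxx */
    case 0x4000u: return INSN_CALL; /* 4xxx or 5xxx */
    default:      return INSN_ALU;  /* 6xxx or 7xxx */
    }
}
```

Bit 12 does not take part in the classification, which is why each class covers two leading hex digits.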

The 3rd nibble of the instruction word insn controls the data flow from the stack to memory

N     Insn[7]  Top transfers to Next (2nd)
R    Insn[6]  Top transfers to Return
@   Insn[5]  Next transfers to address pointed by Top
_     Insn[4]  Not used
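Putting the op-code nibble and the transfer bits together, an ALU word could be built as sketched below. The names are made up for the illustration, and the stack pointer fields in the lower nibble are deliberately left at zero.

```c
#include <stdint.h>

#define ALU_CLASS      0x6000u
#define XFER_T_TO_N    (1u << 7)    /* N : Top transfers to Next */
#define XFER_T_TO_R    (1u << 6)    /* R : Top transfers to Return */
#define XFER_N_TO_MEM  (1u << 5)    /* @ : Next transfers to address in Top */

/* Build an ALU instruction from a 4 bit op-code and the transfer bits.
   Assumption: the lower nibble (stack pointer fields) is left as zero. */
uint16_t asm_alu(unsigned op, unsigned transfer_bits)
{
    return (uint16_t)(ALU_CLASS | ((op & 0xFu) << 8) | (transfer_bits & 0xE0u));
}
```

As a cross-check, asm_alu(0xC, 0) gives 0x6C00, asm_alu(0x2, 0) gives 0x6200 and asm_alu(0x0, XFER_N_TO_MEM) gives 0x6020, which are the Fetch, ADD and Store words used in the J1 counter program on this blog.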

The lower nibble of the instruction word is used to control the incrementing or decrementing  of the data and return stack pointers dsp and  rsp.  Pushes to the stack involve incrementing the dsp, whilst popping from the stack means that the dsp needs to be decremented.  Some actions are stack neutral, and involve no net gain or loss in stack items.

The parentheses can be conveniently used to represent push and pop operations -  memorable that you start with a push  (left bracket) and end with a pop (right bracket)

(    push ds
)    pop ds

[   push rs
]   pop rs

ds field

1    dsp++
2    dsp--

rs field

1   rsp++
2   rsp--

The basic core of the assembler, which accepts the text input and generates instructions as 16-bit hex words, fits into under 200 lines of C.

Assembler Instruction set Summary

Implemented so far:


t       NOP
n       Copy

+       ADD
&       AND
|       OR
^       XOR
~       INV

r       Right Shift
l       Left Shift

d       T - 1  (Decrement)

@       Fetch
!       Store

Data Transfer

N         T-> N
R         T -> R
A         T-> A

Stack Ops

(       Push Data Stack
)       Pop Data Stack
[       Push Return Stack
]       Pop Return Stack


Sunday, September 06, 2015

Emulating a J1 Forth Processor on an Arduino


Emulation is a useful technique - especially when you don't actually have the processor that you are writing code for.

In the Spring of 1975, 19 year old William Gates III did not possess an 8080 microprocessor, but he and friend Paul Allen had committed to writing a BASIC interpreter for the company supplying the new 8080 based Altair microcomputer.

Fortunately Paul Allen had written an emulator for the similar 8008, which ran on a PDP-10 mainframe at Harvard, and working nights for several weeks on the PDP-10, they managed to produce the first Microsoft BASIC product - and the rest is history.

Back in April, when I had a little spare time, I started to work on a program to emulate James Bowman's J1 Forth CPU - and it was the subject of my post "One Song to the Tune of Another".
At that time I had it running on a FPGA soft core - the ZPUino, and it was complete with a VGA display. I am now taking a step back to just isolate the J1 simulation part of that project, so that I can build it into a set of simple tools that I am developing.

Now as my thought processes are starting to converge, I thought I'd dust off the code and start to see how it will fit into my grand scheme for a stand alone code development system based on a J1 running on a FPGA.

James has put a lot of effort into writing his "swapforth" for the J1, but I am treating this as a learning exercise, so rather than use James's swapforth, I am setting about writing my own tiny language - it's the journey, not the destination I am interested in at the moment.

Not being as ambitious (precocious) as Bill Gates, I set my sights a little lower and in just 200 lines of code, I have a J1 emulator that runs on an Arduino.  The code I am using has been adapted for the Arduino from Samawati's J1 simulator on  GitHub.

Slow Forth

Not renowned for high speed or vast resources, the Arduino munches through the J1 code at a pedestrian  63,000 instructions per second. That's about  1600 times slower than an actual J1.
Slow, but nevertheless useful. I can now write snippets of assembly language to run on my "J1" and test them out.

The J1 machine code is stored in an array of 16 bit integers m[xxx ] set up in memory. As the ATmega328 only has 2K bytes of RAM, I kept the array size down to  768 words.

Here is the first J1 program - a simple counter

// Load up a simple count program into first 7 locations of the memory array m[ ]

   m[0] = 0x8020;      // LIT 0x20  (0x20 is the address of the count variable)
   m[1] = 0x6C00;      // Fetch   [0020]
   m[2] = 0x8001;      // LIT 1   We are going to add 1
   m[3] = 0x6200;      // ADD
   m[4] = 0x8020;      // LIT 0x20
   m[5] = 0x6020;      // Store
   m[6] = 0x0000;      // JMP 0000

Translating these 7 instructions into Forth  we get

32 @ 1 + 32 !  followed by a jump back to the first instruction

Forth is clearly a little easier than assembly language, but note how the J1 instructions translate on a one to one basis into Forth, so validating the idea from yesterday's post about using the SIMPL interpreter to create assembly language - this is the next step.
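To show how the counter actually executes, here is a cut-down sketch of an emulator. It is not the code adapted from Samawati's simulator: it models only literals, the unconditional jump and the handful of ALU operations the counter needs, it assumes word addressed memory, and it ignores the stack pointer fields (which are zero in the seven words above). All names are my own.

```c
#include <stdint.h>

#define MEM_WORDS   64      /* small memory, enough for the demo */
#define STACK_DEPTH 32

static uint16_t m[MEM_WORDS];       /* program and data memory */
static uint16_t ds[STACK_DEPTH];    /* data stack below T (circular) */
static uint16_t t;                  /* top of data stack */
static uint16_t pc;                 /* program counter */
static unsigned dsp;                /* data stack pointer */

/* Execute one instruction. */
void j1_step(void)
{
    uint16_t insn = m[pc % MEM_WORDS];
    uint16_t n = ds[dsp % STACK_DEPTH];         /* N: next on stack */

    pc++;
    if (insn & 0x8000u) {                       /* literal: push 15 bits */
        dsp++;
        ds[dsp % STACK_DEPTH] = t;
        t = insn & 0x7FFFu;
    } else if ((insn & 0xE000u) == 0x0000u) {   /* unconditional jump */
        pc = insn & 0x1FFFu;
    } else if ((insn & 0xE000u) == 0x6000u) {   /* ALU */
        uint16_t result = t;
        switch ((insn >> 8) & 0xFu) {
        case 0x0: result = t; break;                    /* NOP */
        case 0x1: result = n; break;                    /* COPY */
        case 0x2: result = (uint16_t)(t + n); break;    /* ADD */
        case 0xC: result = m[t % MEM_WORDS]; break;     /* fetch */
        default:  break;                                /* not modelled */
        }
        if (insn & 0x0020u)                     /* N -> [T] (store) */
            m[t % MEM_WORDS] = n;
        t = result;
    }
    /* conditional jump and call are left out of this sketch */
}

/* Load the seven word counter, run it for a number of passes and
   return the count variable held at address 0x20. */
uint16_t run_counter(unsigned passes)
{
    unsigned i;

    for (i = 0; i < MEM_WORDS; i++)
        m[i] = 0;
    m[0] = 0x8020; m[1] = 0x6C00; m[2] = 0x8001; m[3] = 0x6200;
    m[4] = 0x8020; m[5] = 0x6020; m[6] = 0x0000;
    t = 0; pc = 0; dsp = 0;

    for (i = 0; i < 7 * passes; i++)
        j1_step();
    return m[0x20];
}
```

Each pass through the seven instructions adds one to the word at 0x20, so run_counter(3) returns 3. Because the words as listed never adjust the data stack pointer, the literals are not popped; treating the stack as circular keeps that harmless in this sketch.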

Slightly Quicker Forth

Further experiments with an STM32F407 Discovery board - an ARM Cortex M4 clocked at 168MHz - showed that the emulator would run at approximately 700,000 J1 instructions per second - about 1% of the speed of the proposed hardware.

Exploring Forth for Low Level Hacking


Forth is an interactive, low level language which shares a lot in common with machine code. It allows low level access to the processor and its resources and can therefore be quick and powerful - in the right hands.
It allows a degree of interaction which has now been lost in the higher level compiled languages, but for the right applications it provides all the flexibility needed.

The following videos illustrate some aspects of Forth when used for controlling hardware.

Open Firmware

It Was Twenty Years Ago Today.......

In the mid-1990s Chuck Moore, Jeff Fox and others worked towards Forth computing engines that would achieve burst speeds of around 500MIPS.

Chuck Moore developed custom VLSI devices - a series of  processors where the machine language instructions were essentially Forth primitives.  These processors all used a minimal instruction set - and were known as MISC processors.

Dr C.H. Ting had also shown with his eForth model that a working Forth could be composed from just 31 Forth primitives, and that all other definitions could be assembled from this core set. Thus a processor with a 5 bit instruction length could potentially be used for Forth execution.  Dr. Ting explored this further with a series of chip designs - where 5-bit instructions were packed into 16 bit or 32 bit words - allowing 3 or 6 instructions to be fetched from memory at a time, which better suited the slower RAM access.
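The packing idea is easy to picture in C. The bit layout below is purely an assumption for illustration (first instruction in the highest bits, bit 15 unused) and is not taken from any of Dr. Ting's designs; the function names are hypothetical.

```c
#include <stdint.h>

/* Pack three 5 bit op-codes into one 16 bit word.
   Assumed layout: slot 0 in bits 14..10, slot 1 in bits 9..5,
   slot 2 in bits 4..0, bit 15 unused. */
uint16_t pack3(unsigned op0, unsigned op1, unsigned op2)
{
    return (uint16_t)(((op0 & 0x1Fu) << 10) |
                      ((op1 & 0x1Fu) << 5)  |
                       (op2 & 0x1Fu));
}

/* Extract the op-code held in slot 0, 1 or 2 of a packed word. */
unsigned unpack_slot(uint16_t word, unsigned slot)
{
    return (word >> (10u - 5u * slot)) & 0x1Fu;
}
```

One memory fetch then delivers three instructions, which the processor can work through while the next word is being read from slow RAM.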

Keeping the speed up...

When Forth is implemented on a register based load-store architecture - such as the ARM - the overheads of running the Forth inner interpreter, in particular NEXT, mean that around 10 machine instructions need to be executed in order to execute a Forth primitive. This suggests that an ARM clocked at 100MHz will only achieve around 10 Forth MIPS.

Forth requires the right architecture in the processor in order to be able to execute the Forth primitives efficiently - preferably as single cycle instructions.  The processor should have a stack-based architecture, and the machine instructions should map directly onto the Forth primitives for efficient execution. Using this approach allows a simple Forth processor to be designed as a soft-core cpu for a FPGA - and maintain a performance of around 50 to 100 million Forth instructions per second (Forth MIPS).

Affordable FPGAs

Whilst much of this work was done about 20 years ago using custom VLSI chips, progressive improvements to FPGAs, falling memory prices and greater access to sophisticated design and simulation tools have brought the creation of FPGA soft-core microcontrollers within the reach of the hobbyist.  Low cost FPGA dev-boards are available in the $50 to $80 price range.

There have however been a number of stack machine cpu designs developed over recent years, several of which have been implemented on a low cost FPGA.  Notably ZPUino - by Alvaro Lopez, and J1 - by James Bowman, although several others exist.

James Bowman's J1 design is of interest because it is close in architectural design to Chuck Moore's 1985 Novix NC4000, but much simpler because the data and return stacks are implemented in on-chip RAM. This gives it the potential for 100 Forth MIPS - when implemented in a Xilinx Spartan 3E - and described in under 200 (160) lines of Verilog code.

J1 is incorporated into the Gameduino Shield - a gaming -  graphics and sound generator for Arduino. Versions are also available from Olimex - which include PS2 keyboard connector and additional 32MB SDRAM for extended resolution - although Olimex leave you high and dry when it comes to implementing firmware to make full use of the extra 32Mb!


The J1 Processor Model

The J1 processor is simple enough that it may be modelled in about 100 lines of C code. I used this model available from ddb's Github Repository 

The J1 model is created from James Bowman's original documentation "J1: a small Forth CPU Core for FPGAs" and is very similar to the verilog code that defines the J1 implementation in hardware.

More documentation at James's J1 site .  As can be seen, the J1 has been used in a variety of projects including the Gameduino shield - which is a graphics engine in the form of an Arduino shield.

The J1 has just 5 categories of instruction coded up into a 16 bit instruction word:

Literal                           a 15 bit literal pushed onto the data stack
Jump                            Jump to a 13 bit target address
Conditional Jump          Jump if T is zero to a 13 bit target address
Call                              Call a subroutine at a 13 bit target address
ALU                               ALU and stack operations

The ALU uses a 4 bit field to determine its action, and there are additional bit fields to control access to the stacks and memory.

T -> N        Copy T to Next
R -> PC      Put the return stack into the PC to get a free Return
N -> [T]     Store Next at the location addressed by T
T  -> R       Copy T to Return stack

Additionally there are two, 2-bit, bit fields that allow for the increment and decrement of the data stack pointer, and the return stack pointer - this enables items placed further down the stack to be accessed.


The instruction set of any proposed processor may be simulated in software. Once a model of the various stacks, registers and memory has been devised, it becomes a relatively straightforward task to create a C program, with text output, that simulates the operation of the cpu and instruction execution. Whilst the output of the simulator is either text or graphics, the process can be further developed to the point where any processor can emulate the instruction set of another - but with a vast speed penalty.

Fortunately the relatively simple J1 processor may be quite easily simulated in C - even using an Arduino.

The model consists of a 512 word memory (as an Arduino Uno only has 2K of on-chip RAM).

Snippets of machine language are loaded into the RAM during the setup() function - for example

 m[0] = 0x6000;       // NOP
 m[1] = 0x8020;      // LIT 20
 m[2] = 0x8010;      // LIT 10
 m[3] = 0x6400;      // ADD
 m[4] = 0x6700;      // NEG
 m[5] = 0x6000;      // NOP
 m[6] = 0x0001;      // JMP 0001

In this trivial example two literals are loaded onto the stack, added together, negated and the whole process is repeated as an endless loop - by the unconditional jump back to address 1, near the beginning. This is definitely not a particularly good example to illustrate Forth, but it's a good test case to show that the processor model is correctly fetching, decoding and executing code, and that the "alu" and pc are working properly together.

The machine instructions contain the following information:
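The fetch, decode and execute cycle that this test program exercises can be sketched in plain C as follows. This is not the actual simulator code, just a minimal model under stated assumptions: the ALU opcode numbers 0, 4 and 7 for NOP, ADD and NEG are taken from the comments in the listing above (they may not match the encoding in Bowman's paper), the stack depth of 32 is arbitrary, the stacks simply wrap around, and all function names are my own.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS   512           /* the 512 word memory of the model */
#define STACK_WORDS 32            /* assumed depth; stacks wrap round */

static uint16_t m[MEM_WORDS];     /* program and data memory          */
static uint16_t dstack[STACK_WORDS], rstack[STACK_WORDS];
static unsigned dsp, rsp;         /* pushes minus pops, used modulo   */
static uint16_t pc;

static void     push(uint16_t v) { dstack[dsp++ % STACK_WORDS] = v; }
static uint16_t pop(void)        { return dstack[--dsp % STACK_WORDS]; }
static uint16_t top(void)        { return dstack[(dsp - 1u) % STACK_WORDS]; }

/* Only the three operations used by the test program are modelled. */
static void alu(unsigned op)
{
    uint16_t a, b;
    switch (op) {
    case 0x0: break;                                        /* NOP */
    case 0x4: b = pop(); a = pop();
              push((uint16_t)(a + b)); break;               /* ADD */
    case 0x7: push((uint16_t)(0u - pop())); break;          /* NEG */
    default:  break;                       /* not modelled here */
    }
}

/* Fetch one instruction, decode its category and execute it. */
static void step(void)
{
    uint16_t insn, target;
    assert(pc < MEM_WORDS);        /* trap a runaway program counter */
    insn = m[pc];
    pc = (uint16_t)(pc + 1);
    if (insn & 0x8000u) { push(insn & 0x7FFFu); return; }   /* literal */
    target = insn & 0x1FFFu;
    switch (insn >> 13) {
    case 0:  pc = target; break;                            /* jump    */
    case 1:  if (pop() == 0) pc = target; break;            /* cond.   */
    case 2:  rstack[rsp++ % STACK_WORDS] = pc;
             pc = target; break;                            /* call    */
    default: alu((insn >> 8) & 0xFu); break;                /* ALU     */
    }
}

/* Load the test program from the listing above and reset the model. */
static void load_example(void)
{
    m[0] = 0x6000; m[1] = 0x8020; m[2] = 0x8010; m[3] = 0x6400;
    m[4] = 0x6700; m[5] = 0x6000; m[6] = 0x0001;
    pc = 0; dsp = 0; rsp = 0;
}
```

After load_example() and four calls to step(), the top of the data stack holds 0x0030. One more step negates it to 0xFFD0, and two further steps bring the pc back to address 1. On an Arduino the calls to step() would sit in loop(), with the program loaded in setup().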

Numerical constants or literals.  These are 15 bits packaged into a word that has bit 15 set - i.e. 0x8xxx in hexadecimal.

Target Addresses - these are 13 bit addresses, which are used to make the processor call a subroutine at a new address or jump to a new address. The jump is unconditional, but the branch is conditional - the top of the stack needs to equal zero for the branch to be taken. A 13 bit target gives the processor an address range of 8192 words.

ALU Instructions.

The ALU has 16 possible instructions as controlled by a 4-bit field.  Instructions of the type 0x6X00 are ALU instructions - where the X is the 4 bit instruction.

Code is Code

It might be worth stating that the entire operation of the processor is controlled by the various fields coded within the instruction.  This is what makes machine language very powerful, and yet makes it very easy to make mistakes.  A single mistake in a field might send your processor off into an unintended area of RAM, where it can misinterpret your stored data as a program, and then start indiscriminately writing to RAM.  Invariably this ends up as a system crash.

As writing in machine code has always been a thankless task and prone to mistakes, it is best to spend time writing an assembler to help "assemble" programs from the processor's instruction set.

Assemblers use human readable mnemonics such as ADD, OR, JMP and allow numbers to be entered in decimal or hex. The assembler will use a text file which contains the source code, and which can be edited using a text editor. This can then be processed by the assembler to produce a binary or hex file that may then be loaded into the RAM of the processor.
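As an illustration of the idea, here is a very small sketch of the core of such an assembler in C: it turns one line of source text into one machine word. It only knows the five mnemonics used in the test program earlier, and uses the same encodings as that listing. The function name assemble_line is my own, and reading the operand as hex (to match the comments in the listing) is a simplification; a real assembler would also handle decimal, labels and comments.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assemble one source line into one machine word.
   Returns 1 on success, 0 if the line is not recognised. */
static int assemble_line(const char *line, uint16_t *out)
{
    char     mnem[8];
    unsigned arg = 0;
    int      n;

    assert(line != NULL && out != NULL);
    n = sscanf(line, "%7s %x", mnem, &arg);   /* operand read as hex */
    if (n < 1)
        return 0;                             /* blank line */

    /* Encodings follow the test program listing, not Bowman's paper. */
    if (strcmp(mnem, "NOP") == 0) { *out = 0x6000; return 1; }
    if (strcmp(mnem, "ADD") == 0) { *out = 0x6400; return 1; }
    if (strcmp(mnem, "NEG") == 0) { *out = 0x6700; return 1; }
    if (n == 2 && strcmp(mnem, "LIT") == 0 && arg <= 0x7FFFu) {
        *out = (uint16_t)(0x8000u | arg);     /* bit 15 marks a literal */
        return 1;
    }
    if (n == 2 && strcmp(mnem, "JMP") == 0 && arg <= 0x1FFFu) {
        *out = (uint16_t)arg;                 /* top bits 000 = jump */
        return 1;
    }
    return 0;                                 /* unknown mnemonic */
}
```

Calling assemble_line("LIT 20", &w) stores 0x8020 in w, and assemble_line("JMP 0001", &w) stores 0x0001, so feeding the seven lines of the test program through it reproduces the hand-coded values.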

Assembly language is the first layer of abstraction above the processor's own machine language.  As a tool it makes programming simpler, faster and less prone to mistakes.

These tools first started becoming widely available to home users in the early 1980s. Early 8-bit home-micros often had an assembler/disassembler available as part of their toolkit.

David Salomon has written an excellent reference book on assemblers.

Forth as an Assembly Language

In the early 1960s, Charles Moore - the creator of Forth - realised that there may be a better way of writing programs than the traditional assembler or high level language compiler method.

He knew that any program consisted of small snippets of code, each performing some small function within the program.  These functions and routines would be stitched together with calls and jumps to form the structure of the program.

He came up with the concept of the Forth word, where the word is the name of such a function - for example SQUARE.

Running on the processor was a small interpreter program, which could take the text input and compile it into executable machine code.

The word SQUARE could be written at the keyboard, or typed into a text file, and every time it was encountered it would perform the function of calculating the square of a number.

For this to work, SQUARE had to be created using the colon definition method of defining new words - which is written like this:

: SQUARE DUP  *  ;

: This colon is the word that tells the interpreter that this is a new definition
SQUARE is the name of our new word, which will be put into the dictionary
DUP is a Forth word that duplicates the top item on the stack
* multiplies the top two entries on the stack, leaving the product on the stack
;  Semi-colon - this denotes the end of the definition and a return to the inner interpreter

For a much fuller explanation of how this works - have a read of Brad Rodriguez's excellent article "Moving Forth".
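For readers more at home in C, the stack behaviour of SQUARE can be mimicked with a small explicit stack. This is only an analogy to show what DUP and * do to the stack, not how a Forth system compiles the definition. The stack depth and the function names are assumptions of mine.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16            /* assumed depth for this sketch */

static int16_t stack[STACK_WORDS];
static int     sp;                /* number of items on the stack  */

static void    push(int16_t v) { assert(sp < STACK_WORDS); stack[sp++] = v; }
static int16_t pop(void)       { assert(sp > 0); return stack[--sp]; }

/* DUP ( n -- n n ) duplicate the top item */
static void forth_dup(void)
{
    int16_t n = pop();
    push(n);
    push(n);
}

/* * ( a b -- a*b ) multiply the top two items, leave the product */
static void forth_mul(void)
{
    int16_t b = pop();
    int16_t a = pop();
    push((int16_t)(a * b));
}

/* : SQUARE DUP * ; */
static void square(void)
{
    forth_dup();
    forth_mul();
}
```

Pushing 7 and calling square() leaves 49 as the only item on the stack, just as typing 7 SQUARE would in Forth.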

Suffice to say that the Forth system provides the assembly, compilation and run-time execution environment needed for a self-contained system, and it does so in an interactive manner.

This video shows a typical Forth work session.

N.I.G.E Machine

In another post, I describe my project to combine a simulation of the J1 Processor with a set of simple graphical tools to allow assembly, disassembly and memory viewing.

A Graphical User Interface for Low Level Hacking

Disassembler Window
Over the last couple of days - spare time permitting - I have written a simple application to assist in the development of code for an FPGA soft core processor.

So far, this consists of a memory view, a register view, stacks and a disassembler window.  The windows into memory are animated such that the actions of the instruction set on memory and registers may be viewed whilst single-stepping through the code.

The novel thing about this simple application is that it has been written in Arduino C++ code, and once compiled, it runs on a ZPUino softcore processor hosted on an FPGA.  Additionally, the hardware which generates the 800x600 VGA display is also hosted on the FPGA. So we have a complete computer system consisting of cpu, memory and video generation hardware supplied on the Papilio Duo FPGA board.

The first part of this exercise was to get the graphical parts of the user interface working.  These consist of the hex memory dump window and the various stack, register and disassembler windows.
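The text behind a hex memory dump window is simple to generate. The sketch below formats one row of the dump into a character buffer, which could then be handed to whatever text drawing routine the display uses. The row width of 8 words and the function name format_dump_row are assumptions for illustration, and no graphics library calls are shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define WORDS_PER_ROW 8           /* assumed layout of one dump row */

/* Format one row as "AAAA: WWWW WWWW ..." into buf.
   Returns the number of characters written, excluding the NUL. */
static int format_dump_row(const uint16_t *mem, uint16_t addr,
                           char *buf, size_t len)
{
    int used, i;

    /* 5 chars for "AAAA:", 5 per word, plus the terminating NUL */
    assert(mem != NULL && buf != NULL && len >= 6 + 5 * WORDS_PER_ROW);

    used = snprintf(buf, len, "%04X:", (unsigned)addr);
    for (i = 0; i < WORDS_PER_ROW; i++)
        used += snprintf(buf + used, len - (size_t)used, " %04X",
                         (unsigned)mem[addr + i]);
    return used;
}
```

Calling this once per visible row, stepping addr by WORDS_PER_ROW each time, yields the lines of the dump. Redrawing only the rows whose contents changed after a single step would keep the animation cheap.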

Now, whilst the ZPUino is itself a stack based processor, and these tools will eventually be used to examine its operation, it was decided that initially I would use the ZPUino to emulate an even simpler stack processor.  The candidate is James Bowman's J1 Forth Processor (also available as a softcore for FPGA use) - which has the advantage of a very small instruction set, and a processor behaviour that is easily modelled in C code.

This may appear a somewhat round-about route but was chosen for the following reasons:

1.  The ZPUino can be programmed in "Arduino code" using DesignLab - the Papilio Duo IDE
2.  The ZPUino interfaces in hardware to the 800x600 VGA engine
3.  Adafruit's GFX graphics library has been ported to ZPUino
4.  A compact C model, and sufficient documentation exist for the J1 processor
5.  I wanted to understand how the J1 works, and what its limitations are
6.  This is a programming project that matches my elementary coding skills

So my approach is to make use of the tools available.  James Bowman is working on an implementation of the J1 to run specifically on the Papilio Duo board, and make use of its 2 Mbyte of SRAM. Whether he will develop it to the point where a VGA engine is supported is unknown - so for the moment I have to be content emulating the J1 with the ZPUino in C, with the heavy burden of the GFX library calls.

If we can develop sufficient momentum, then there might be a strong case to put a fast Forth soft core on an FPGA with VGA. This however is beyond my coding skills - but on my wish list for the future.

Disassembler Window

This does a very simple disassembly of the instructions in memory. The jump, branch, call and ALU instructions are decoded to their mnemonics for easier reading. The animated display shows the instructions highlighted in cyan as they are executed by the processor emulator.
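A disassembler of this kind needs little more than a test on the top bits of each word. The following is a minimal sketch in C rather than the code behind the window shown here. The mnemonics LIT, JMP, JZ and CALL, the choice to print the ALU operation as a bare hex digit, and the function name disassemble are all my own for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write a one line disassembly of insn into buf (len bytes). */
static void disassemble(uint16_t insn, char *buf, size_t len)
{
    unsigned target = insn & 0x1FFFu;      /* 13 bit target address */

    assert(buf != NULL && len > 0);
    if (insn & 0x8000u) {                  /* bit 15 set: literal   */
        snprintf(buf, len, "LIT %04X", (unsigned)(insn & 0x7FFFu));
        return;
    }
    switch (insn >> 13) {
    case 0:  snprintf(buf, len, "JMP %04X", target);  break;
    case 1:  snprintf(buf, len, "JZ %04X", target);   break;  /* branch if T is zero */
    case 2:  snprintf(buf, len, "CALL %04X", target); break;
    default: snprintf(buf, len, "ALU %X",
                      (unsigned)((insn >> 8) & 0xFu)); break; /* 4 bit ALU field */
    }
}
```

Given the test program from earlier, 0x8020 disassembles to "LIT 0020", 0x0001 to "JMP 0001" and 0x6400 to "ALU 4". A table of sixteen names indexed by the ALU field would turn the last of these into a proper mnemonic.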

More Interaction

So far only the graphical code has been prototyped - just enough to see the animation of the J1 processor emulation.  Complete user interaction will require more code, in particular code to support the keyboard, the mouse and a text editor window.

I have ordered a Classic Computing shield for the Papilio Duo from Gadget Factory in Denver. This includes sockets for a PS/2 keyboard and mouse, VGA output, a microSD card and a pair of Atari style joystick connectors. This will allow keyboard and mouse interaction to be developed, plus program and data storage on the microSD card.  The thought had occurred to me that the Atari ports might be useful to accept switch presses from some form of custom keypad - a bit like the one Chuck Moore uses with his OKAD and colorForth environment.

Text Editor

The existing graphical layout allows for a text window of about 90 character columns by 75 rows.  This should be sufficient for 80 column mode plus a few clickable buttons.  The mouse will be used extensively for click and drag type operations - so a routine that links mouse position to the position of objects on the screen will be central to the user interaction.
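The simplest case of linking mouse position to screen objects is working out which character cell the pointer is over. Here is a sketch in C, assuming an 8 x 8 pixel character cell (which is what 75 rows on a 600 line display implies, though the actual font may differ) and a text window whose top left corner is passed in by the caller. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CELL_W    8               /* assumed character cell width  */
#define CELL_H    8               /* assumed character cell height */
#define TEXT_COLS 90
#define TEXT_ROWS 75

typedef struct { int col, row; } text_cell;

/* Map a mouse position in pixels to a character cell of the text
   window whose top left corner is at (win_x, win_y).
   Returns 1 and fills *out if the pointer is inside the window,
   otherwise returns 0 and leaves *out untouched. */
static int pixel_to_cell(int x, int y, int win_x, int win_y,
                         text_cell *out)
{
    int dx = x - win_x;
    int dy = y - win_y;

    assert(out != NULL);
    if (dx < 0 || dy < 0)
        return 0;                            /* left of or above window  */
    if (dx >= TEXT_COLS * CELL_W || dy >= TEXT_ROWS * CELL_H)
        return 0;                            /* right of or below window */
    out->col = dx / CELL_W;
    out->row = dy / CELL_H;
    return 1;
}
```

The same test, applied to a list of button rectangles instead of a grid of cells, would cover the clickable buttons, and remembering the cell where the mouse button went down gives the start point for a click and drag selection.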

As most coding languages are text based - the efficient manipulation of text leads to high productivity whilst programming. The use of colour text and highlighting of selected areas will enhance the user experience. The text window will additionally be used for serial output and command line input, and the use of the microSD card will allow source code to be saved and retrieved from "disk".

Assembler and Compiler

The J1 is a "Forth" processor, inasmuch as it is stack based, and almost all of its instructions are Forth primitives. This allows it to execute the Forth language efficiently.  However, a modern Forth consists of about 200 definitions, and these have to be encoded in the native instruction set of the processor.  Fortunately, this is something that Bowman and others have already done.