Sunday, July 23, 2017

When Microprocessors are a commodity – How do you choose?

Slightly over 20 years ago I had the need for a microprocessor to use in a project. I needed to keep the budget low. The 8 bit microprocessors of the time cost $15.00 in single quantities, that wasn’t an issue, but the tool chain cost was. I didn’t want to spend the thousands on a traditional tool chain that many of the processors of the day used. I bought a hobby programmer for $69.00. The required UV EEPROM Eraser cost me $29.00, the C compiler was $99.00. The tool chain was the most expensive part and that really drove the Microprocessor vendor choice. Which in this case was a Microchip PIC16C71 with all of 1k program memory and 36 bytes of RAM, running at a blazing 1 MIPS! The final project worked fine.

Thanks to Mr. Moore and his ‘Law’: Now we have 32 bit processors with built in floating point processors that have at least 512MB of flash and 256k of RAM all running with a 200 MHz system clock and it costs less than $10.00 in single quantities.

The tool chains are free and based on the GCC compiler. Similarly the basic programmer / debuggers are less than $20.00.

I have experience with both Microchip PIC32MZ and ST Micro STM32F7 products. So when I had a new project come by, how would I choose the ‘best’ device?

One processor is MIPS core based and the other is an ARM design. For 99.9% of the code I needed to write, the C compiler hides the underlying processor details so I don’t need to know anything about the underlying differences between a MIPS and ARM core. GCC has optimizations for both types and does a fine job of making efficient code.

The client may care about what processor to use, but if they say this: What they really care about is the tool chain, and I agree. You simply can’t run a successful operation if every project requires a different tool chain. So if you are a “ARM” processor house you are really is a: ST Micro, NXP, Atmel, et. al. tool chain house that just happens to target ARM Core processors from some manufacturer.

The choice as to what to pick comes down to slight features or preferences in tool chains or processor features. Everything else is pretty equal. Believe it or not, 32 bit microprocessors are a commodity.

My Current Project

The project at hand was a dual channel data acquisition and computation instrument that needed to drive two fast 16 bit ADC’s, buffer the samples in a large amount of on board RAM, then process the results with FFT’s and drive an Analog Output with some computed results. The design also needed a USB connection to a PC for setup and monitoring.

Normally this is done with at least three chips: An FPGA for ADC and Memory interfacing and a 32 Bit Processor acting as the DSP number cruncher and communication processor. For this project I needed to keep the chip count to one, the 32 Bit Processor. So speed and memory was a primary consideration.

Speed was the first consideration: At least 200 MHz system clock was required, just so I would not fail on the DSP computation part. I had enough bench-marking experience with some previous projects that I knew that a 32 Bit processor running at least 200 MHz would give me the desired number crunching performance.

The next constraint was that the ADC’s chosen were parallel output devices. I needed to be able to get a full 16 bits read in to the processor in a single chunk and ping-pong between the ADC’s to get both read in at a 2 MSPS rate. This is where demo boards came in. Both processors had (in a 64 pin package) at least one fully pinned out 16 bit port and writing some test bit-banging code I verified that both would also be able to manipulate the ADC’s and get the data to RAM fast enough. Both processors passed this constraint.

ADC interfacing: The ADC’s had a 3.3V IO voltage levels and both processors also have 3.3V IO pin voltage levels. Both processors also use a single voltage for the core and IO pins, this makes the overall design simpler to only have to supply one voltage to the processor. So no advantage to either processor.

Next was RAM – I wanted the biggest amount of RAM possible, just to be safe. The initial design was to run 2 x 8k FFT’s with another 2 x 4k buffers for a continuous averaging the results. Both chips had many times this RAM, but you can never have too much RAM can you?

The PIC32MZ had a slight advantage here as that part has a 512k or RAM versus the STM32F7 parts 256k.

Program Memory: I always buy the biggest memory part available for prototyping – after all you want to get the design going fast, not save a few bucks and end up sending days trying to figure out how to make the program fit. The PIC32MZ also has a slight advantage here as it has an unbelievable 2048k or Flash program memory! The STM32F7 topped out at 512k – Although to be honest I would never end up using all that Flash from either part for this application.

Note: If you are building a Web interface and will be serving up Web pages, all that RAM and ROM starts to look pretty small, pretty fast!

Speaking of FFT’s – Life is too short to be writing your own FFT and DSP routines and both chips passed this test by supplying a very complete and functional DSP library at no cost!

Speaking of Math – Both processors have Floating Point Units (FPU’s). The PIC32 however can do both single and double precision floating point, whereas the STM32F is a single precision unit only. The thought was in the code that I would do integer FFT’s and averaging, then convert the single FFT bin of interest to floating point to do the control math then convert this to back to integer for output to the DAC. Using floating point when the processor has a FPU has almost no speed penalty and just makes the code easier to understand (which my clients like). This gives the nod to the PIC32 for never leaving me out in the cold if I needed more precision than the single precision STM32F FPU provides.

Processor Package Size: The ideal was to use a 64 pin LQFP – Check on that as both families have that package available.

Tool chain: Both are usable, free, GCC based tool chains with very low cost programmer / debuggers. The STM32F7 tool chain is a little more common as it provides a set of tools to initialize and configure the peripherals through a HAL (Hardware Abstraction Layer) Library. The PIC32MZ uses what Microchip calls their Harmony configuration program. Harmony abstracts all the various PIC32 chips to a single very high level HAL

Documentation wise: The PIC32 XC-32 Compiler and Standard Library documentation is very good and customized for the PIC32 GCC extensions. Likewise the online help in the IDE, while not being perfect or complete is more than just auto-generated listings of function calls, so it is quite useful.

With the STM32 you are left with the standard GCC documentation that can be found on the web. The STM32 HAL library documentation is basically just auto generated listing of the functions and their parameters with no other useful information.

Nod to the PIC32 for better compiler documentation.

Both programming IDE’s are easy to use and have the same amount of annoying little issues (nothing works perfectly, does it?), so that’s a dead heat. At least neither ever crashed on my computers. They just have the annoying bugs like the dreaded “red squiggle underline” under perfectly fine code and not syntax highlighting correctly all the time.

Processor Interrupts – The firmware design was such that the sampling clock would drive the ADC’s convert pin directly and an external processor interrupt pin to initiate the processors data collection function. Another external interrupt would be needed for an external trigger circuit. Both processors have fast interrupts available on all IO pins, so no advantage to either here.

Communication – The plan was to use a trusty FTDI USB to Serial converter to get the USB communication to the Control PC. Hence I wanted a USART that could run at 3 Million Baud (The maximum rate for a FT232R chip). Both processors have multiple USART’s and can support the 3 MBAUD rate. A tie here. The PIC32MZ has a slight toolchin advantage as the stdio functions like printf() are already wired up to USART 2 and don’t require any further setup. On the STM32F parts there is some user code required to route the standard out to the proper USART. Not a big deal, but it has to be done.

Sampling Clock – The design was to use the 96 MHz system reference oscillator divided down to 24 MHz as the processor system clock input, this would be further divided to get an adjustable ADC sampling rate clock. The divider needed to be completely in hardware so that it would have no extra uncertainty jitter (An interrupt based timer/divider would have too much jitter and would not work).

The PIC32MZ has four independent Reference Clock Dividers that can be clocked from a variety of sources and can generate a divide ratio of 2-32768. This fit perfectly for my needs. The STM32F7 does not have such a divider. This is the biggest difference between the parts. It did take two hours to figure out how to program this on the fly as the core has to be unlocked, to change the divide ratio, etc. This was well covered n the documentation I just had to find it.

Analog Output – The result of all this sampling, FFT’s and math was a single number that could be output to a DAC. 12 bits was the minimum precision and 16 bits would be preferred. The initial thought was to use an external DAC on a SPI bus for this. As the update rate only had to be at 100 Hz, this would be easy to implement. The STM32 may have an advantage here – as it has two 12 bit DAC’s built-in whereas the PIC32 only has a low performance, 5 bit voltage divider built in. The worry here is that the internal DAC would be corrupted with noise from the processor core, but at 100 Hz output update rate I could have easily filtered off any noise. Slight advantage to the STM32 here.

Core Features – The MIPS core has an independent counter on the system clock. This counter runs at one half the system clock rate (100 MHz in this case) and can be easily used to make very precise delays and timing measurements.

The STM32F7 has a DWT Cycle counter on the core clock and a SysTick counter, but it is not straightforward to start and use and there are typos in both the ARM and STM32 documentation. Also the STM32 configuration program does not configure the DWT counter for you it must be done with bare metal code. That’s a negative for the STM32 toolchain, as the PIC32 Harmony configurator exposes every part of it’s chip for configuration.

Help – Both the PIC32MZ and STM32F7 processors have active and helpful forums on the interwebs, so no advantage to either part here.

Resume – Having ARM Core experience on your resume looks better than MIPS core experience, so points to the STM32F7 here. ARM cores are just more popular, go figure...

And the Winner is,

The choice was simple, even though I slightly prefer the STM32 tool chain over the PIC32 and if all other things were equal that would have been the deciding factor, the fact that the PIC32 has one single little hardware peripheral: “The Reference Clock Divider” ultimately drove the decision.  The use of an external DAC for the PIC32 implementation was not as big a deal as the reference clock, which would have required much more extensive circuitry than a simple DAC to implement externally. The use of an external DAC also alleviated all fear of having processor noise on the output.

The PIC32’s more ROM and especially RAM memory was simply icing on the cake.

Appendix – Why use an external USB chip when your processor implements USB?

A FTDI FT232RL costs around $4.50 in single units, so why put one in front of a microprocessor that has a built in USB interface?

Several reasons, actually.

#1 – Driver stability: Using a microprocessors built in USB interface usually means using the Windows CDC class driver. The CDC driver up until Windows 10 has been reported to have many quirky issues. The FTDI driver on the other hand is bullet proof. Unless cost is of paramount important, I always do my customers (and myself) a favor and use the most bullet proof solution possible. I just have never had any issue with the FTDI drivers on any PC. They work very well.

#2 – Program development: When developing programs frequent processor resets are the norm. Everytime you program new code the processor restarts. A processor restart resets the processors built in USB peripheral also, this breaks any connection that you have with the PC at the time, forcing you to restart the PC control program also.

When you use a FTDI chip, resetting the processor does nothing to the USB to PC connection, it does not reset and any PC programs keep running as if nothing happened. You might get a few garbage characters, but you won’t have to restart any PC program. This saves tremendous time and frustration during code development.

Even if your final design is going to use the processors built in USB, at least design your board to use a FTDI dongle for development purposes, then switch to the built in USB peripheral when the design is more stable.

#3 – Speed: USB is USB. The speed of the transfer is dependent on: How many bits are to be sent, the latency times and buffer size of the USB driver. The built in USB peripheral can’t be any faster than the FT232 chip. To maximize speed for any given application these two parameters need to be modifiable [1]. The FTDI driver exposes these two parameters in their excellent DLL so that they can be changed on the fly. It is a control panel / registry hack to accomplish this with the Windows CDC driver and I’m not sure that the values can change on the fly without disconnecting and reconnecting the device.

#4 – Ease of programming: With the FT232 chip the interface is through the processors USART and can directly use functions like printf(). If you use the built in USB peripheral you will be on your own to build packets and stuff them down the USB pipe optimally. This takes time to benchmark and analyze, the speed gains will also be marginal with un-optimized driver settings.

Do yourself and your customer a favor and give them the reliability of a FT232 USB connection, you won’t be sorry and in the long run everyone will save money.

[1] For a very interesting overview of USB latency and buffer size and how it effects total transfer speed see: “AN232B-04 Data Throughput, Latency and Handshaking” published by FTDI Inc.

Copyright notice: MPIS, ARM, PIC32, STM32, Microchip, ST Micro, NXP, Atmel and FTDI logos and names are copyrighted by their respective owners. 

Article By: Steve Hageman     

We design custom: Analog, RF and Embedded systems for a wide variety of industrial and commercial clients. Please feel free to contact us if we can help on your next project.  

This Blog does not use cookies (other than the edible ones). 

No comments:

Post a Comment