Sunday, June 19, 2016

DSPLib – An Open Source FFT Library for .NET 4

There are many Open Source and Commercial implementations around the web and in textbooks for computing Fourier Transforms. Unfortunately most are flawed in a number of ways:
  1. They produce an un-calibrated result that changes depending on the number of points transformed.
  2. They include no built in methods to scale for Windowing of the input data.
  3. They have no proper way to measure noise accurately.
  4. They don't size the returned spectrum to have just the real part of the spectrum.
  5. They implement their own Complex number type, ignoring .NET 4's built in Complex data type.
  6. They aren't complete. You have to add a bunch of helper routines every time.
  7. They have restrictive Open Source Licenses.

All of these things take hours of tweaking to get a usable FFT or DFT running from even the best of the currently available libraries.

I decided to solve this problem for myself and my clients by making a pretty complete Fourier Transform Library that implements:
  1. Properly Scaled Fast Fourier Transforms.
  2. Properly Scaled Discrete Fourier Transforms. 
  3. Properly Scaled Data Windowing.
  4. Proper functions to scale for noise and signals in the correct manner.
  5. Signal Generation for Testing.
  6. Useful Array Math routines.

This library is the culmination of about 5 years' worth of work using, revising and tweaking other libraries and implementations, both open source and commercial, before I wrote my own.

DSPLib is the first Fourier Transform Library that can take any time domain signal input, like from an ADC, apply one of the 27 built in window types and produce a correctly scaled Spectrum Output for either signal or noise analysis with no code tweaking required at all.

The library is released under the very non restrictive MIT License and is essentially royalty free for any use, even commercial.

The complete write up is at – take a look and enjoy never needing to spend hours tweaking Fourier Transform code again.


Article By: Steve Hageman 

We design custom: Analog, RF and Embedded systems for a wide variety of industrial and commercial clients. Please feel free to contact us if we can help on your next project. 

This Blog does not use cookies (other than the edible ones). 

Monday, April 25, 2016

FFT's meet 200MHz, PIC32MZ Microprocessors


Lately I have been working on a series of Analytical Instruments that are roughly “Hybridized” Lock In Amplifiers[1]. These designs are Hybrids because the purely Analog Lock In Amplifier block diagram has been augmented to incorporate Digital Signal Processing functions by digitizing the waveforms then applying DSP Techniques (including FFT's) for noise reduction, reference signal phase comparisons and maintaining the quadrature phase lock of the synchronous detector function, which is also a digital processing block. In addition, sophisticated noise filtering can be accomplished all in software at very respectable update rates.

This is all made possible and simple because of the latest crop of single chip microprocessors that are capable of running at 200 MHz core frequencies and have specialized DSP functions like: DMA and DSP Processing instructions that allow them to beat even most specialized DSP Processors at their own game. The latest Microchip PIC32MZ processors even include free and quite capable DSP libraries so it really is one stop shopping now. All this power for less than the price of a good lunch, as these chips cost less than $10 each.

PIC32MZ FFT Performance

The MIPS core based PIC32MZ processors running at 200 MHz have the potential to do a really fast and big FFT. To find out just how well they do, I fired up my PIC32MZ EF / Connectivity Evaluation board from Microchip Technology (DM320007) to answer a few questions.

The first question is always: “How fast is the FFT?” There is, after all, no point in asking any other questions if a 1024 point FFT takes all day.

If the answer to the first question was reasonable, the next question should be: “How big of an FFT can I do?”, followed by: “What type of dynamic range can I expect?” and finally, perhaps more towards the implementation side is the question: “How is the FFT scaled, so I can get a calibrated response?”.

For the FFT implementation I used the DSP library included with version 1.07.01 (March 2016) of Microchip's Harmony Software suite. This library works on the well known Q15 or Q31 bit fixed point schemes. If you are unfamiliar with Q15 and Q31, just think of them like signed 16 and 32 bit integers [2].
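To make the Q15 and Q31 idea concrete, here is a minimal Python sketch of the conversion. The function names are hypothetical and are not part of the Harmony library. It only illustrates the usual convention that a Q15 or Q31 value is a signed integer interpreted as a fraction in the range -1.0 to just under +1.0.

```python
def float_to_q(value, frac_bits):
    """Convert a float in [-1.0, 1.0) to a signed fixed point integer.

    frac_bits is 15 for Q15 (16 bit storage) or 31 for Q31 (32 bit storage).
    """
    scale = 1 << frac_bits
    raw = int(round(value * scale))
    # Saturate instead of wrapping: +1.0 itself is not representable.
    return max(-scale, min(scale - 1, raw))


def q_to_float(raw, frac_bits):
    """Convert a signed fixed point integer back to a float."""
    return raw / (1 << frac_bits)


print(float_to_q(0.5, 15))   # 16384
print(float_to_q(1.0, 15))   # 32767 (saturated)
print(float_to_q(-1.0, 31))  # -2147483648
```

A sine wave computed in floating point would be passed through a conversion like this, sample by sample, before being sent down to the processor.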

The Harmony libraries are easy to use and well documented. Only two functions need to be called to accomplish a FFT. An initialization routine must be called first (and only once) to load up the twiddle factors for the specific length of FFT. Then the FFT itself can be called to actually perform the FFT. To get more information on the libraries, open the Harmony Help and search for “DSP”.

The libraries contain both a Q15 and a Q31 bit version, so I bench-marked both. About the only guidance that Microchip gives in the documentation is that the Q15 version will probably be faster due to optimization, and my testing proved that to be true.

I used my PC and a quick C# program I wrote that uses a USB based Serial connection to load the test waveform down to the PIC's RAM and then to read the results back to the PC for final processing and display. The test waveform was a sine wave that was properly scaled to either 16 or 32 bits. Since the sine wave was calculated with .NET doubles, it was more accurate than the final cast to 16 or 32 bit integers could represent. This means that the sine wave distortion was not the limiting factor in dynamic range in any test case.

For timing I used the MIPS Core Counter. This counter runs at ½ the core frequency or in this case 100 MHz. The Core timer was read at the start of the routine and after the routine finished, then the counts were subtracted to get the counts between calling the routine and its return. This count was converted to the equivalent time in microseconds for the timings.
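The count to time conversion can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code that ran on the PIC. The function name is hypothetical, and the 32 bit mask assumes the counter is 32 bits wide and wraps at most once during the measurement.

```python
def core_ticks_to_us(start, stop, core_hz=200_000_000):
    """Convert two core timer readings into elapsed microseconds."""
    timer_hz = core_hz // 2                # the counter runs at half the core clock
    ticks = (stop - start) & 0xFFFFFFFF    # tolerate a single 32 bit wrap
    return ticks * 1_000_000 / timer_hz


print(core_ticks_to_us(0, 100))            # 1.0 (100 ticks at 100 MHz)
```

At 100 MHz each count is 10 ns, so the timing resolution is far finer than any of the FFT times of interest.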

No interrupts were enabled in the program and the PIC program was running in a big loop, where the FFT functions were called in a blocking fashion. I allocated the data arrays at compile time and didn't use anything like malloc() when the program was running. I also used the free version of Microchip's XC32 compiler with the Harmony Framework and no optimizations turned on [3].

The Evaluation board is fitted with the largest PIC32MZ available, a PIC32MZ2048EFH144. This device has a whopping 2MB of Program Flash and 512kB of SRAM on board. This large amount of SRAM allows for some rather large FFT's to be run.

The whole program even with the command interpreter I built took less than 1% of the program flash memory.

Now - On with the show

The first test is naturally to find the FFT speed versus size of the FFT for both the Q15 and Q31 formats.

Table 1 – The FFT speed in micro-seconds was measured for each FFT size for both the Q15 and Q31 formats. The Compiler could only allocate enough memory for a 16k Q15 FFT and 8k Q31 FFT.

The Harmony documentation states that the FFT Initialization function is written in C code and as such, is slower. This is OK, as the initialization function only needs to be called once or if the FFT size changes.

Table 2 – The Q15 and Q31 Initialization function does take longer than the FFT itself, but this function only needs to be called once or if the FFT size changes. The table values are in micro-seconds.

Figure 1 – The Q15 and Q31 FFT times are plotted for a graphical comparison.

Determining the FFT Speed for various FFT sizes actually answered two questions. Eventually, as I increased the FFT size, I ran into a point where the program execution crashed when I tried to run the FFT. I was able to coax a 16k Q15 and an 8k Q31 FFT out of the processor before it would not run anymore.

If your application needs to apply a window to the time data, that is another vector array that would need to be maintained in RAM. Vector averaging and display buffering also require vector arrays and these can quickly eat up all available RAM so that even these FFT sizes may not be achievable in a full featured application as RAM gets gobbled up very quickly as the FFT sizes increase.
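A rough RAM budget shows how quickly this adds up. The sketch below assumes, purely for illustration, that the FFT needs four N point complex arrays (input, output, twiddle factors and scratch) and that each extra vector such as a window is one real word per point. These buffer counts are assumptions, not figures taken from the Harmony documentation.

```python
def fft_ram_bytes(n_points, bytes_per_word, extra_real_vectors=0):
    """Estimate RAM use for an N point fixed point FFT plus helper vectors."""
    complex_bytes = 2 * bytes_per_word                  # real + imaginary
    fft_buffers = 4 * n_points * complex_bytes          # assumed: in, out, twiddle, scratch
    extras = extra_real_vectors * n_points * bytes_per_word
    return fft_buffers + extras


print(fft_ram_bytes(16 * 1024, 2) // 1024)      # 256 (kB, 16k point Q15)
print(fft_ram_bytes(8 * 1024, 4) // 1024)       # 256 (kB, 8k point Q31)
print(fft_ram_bytes(16 * 1024, 2, 1) // 1024)   # 288 (kB, with one window vector)
```

Under these assumptions the next size up in either format would need all 512kB of SRAM for the FFT arrays alone, which is in line with the largest sizes that would run.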

Dynamic Range – Or how many bits is that?

Dynamic range in the FFT processing is an important factor in determining whether a Q15 or Q31 format FFT is needed.

For instance a 12 bit ADC can have around 72 dB of dynamic range with no averaging or processing gain applied and a 24 Bit ADC can have a basic 144 dB of dynamic range.
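These figures follow from the rule of thumb of roughly 6 dB per bit. A minimal Python sketch of that arithmetic, using the ratio of full scale to one LSB and ignoring real converter noise and distortion:

```python
import math


def ideal_dynamic_range_db(bits):
    """Ratio of full scale to one LSB, in dB, for an ideal N bit converter."""
    return 20.0 * math.log10(2 ** bits)


for bits in (12, 16, 24):
    print(bits, round(ideal_dynamic_range_db(bits), 1))
# 12 72.2
# 16 96.3
# 24 144.5
```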

It wouldn't do you very much good to attempt to use a 24 Bit ADC with a 12 Bit dynamic range signal processing chain. You'd just be in the numerical noise with no way of averaging your way out of it.

To determine the FFT's dynamic range I generated a perfect sine wave with my PC and then scaled it to either Q15 or Q31 full scale format. Then I sent this waveform to the PIC and had the PIC do a FFT on the data. I returned the complex FFT result to the PC where I converted it to magnitude dB format for display.

Below are the results for a large FFT with a full scale signal for both the Q15 and Q31 format FFT's.

Figure 2 – The Q15 FFT has about a 74 dB full scale to numeric noise dynamic range.

Figure 3 – The Q31 format has a more 'Spurious' noise look to it but the full scale range is just around 140 dB. Good enough for 16, 18 and probably 24 bits depending on the application's exact needs. Note that the 'spurious looking' noise floor is real and not some internal limiting problem. I backed off the input signal amplitude to make sure that some calculation was not saturating and it had no effect on the amplitude of the noise spurs.

Anybody have a ruler?

As I mentioned at the start – only if an affirmative answer is given to the speed, size and dynamic range questions should you start worrying about the “How is the FFT Scaled?” question.

Actually the Harmony FFT implementation is quite easy to work with. Most FFT implementations that you will find scale the amplitude with the size of the FFT or 'N'. This means that you have to apply a scale factor proportional to 1/N to the result to get a constant amplitude regardless of the FFT size. Not here, the proportional to 1/N scaling has already been applied, meaning that you will get the same amplitude output for any size FFT. Points scored for whoever wrote this FFT!

The amplitude that you get back depends on the format that you used in the first place. If you use a Q15 format FFT, for a full scale peak to peak sine wave input, you will get out a peak magnitude of (2^15)/2 or 16,384. Similarly the output of a Q31 format FFT will be (2^31)/2 = 1,073,741,824.

The ½ effect is due to the fact that the actual output of a FFT is a mirror of positive and negative frequencies with ½ the total power in each spectrum. Since we are only normally concerned with the positive frequency side, we observe that the power is effectively reduced by ½.
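The size independent amplitude and the ½ factor can both be reproduced with a small floating point model. The sketch below is a plain DFT divided by N, written in Python for illustration. It is not the Harmony fixed point routine, and it idealizes full scale as exactly 2^15, ignoring quantization.

```python
import cmath
import math


def scaled_dft_magnitude(samples):
    """Magnitude of the positive frequency bins of a DFT scaled by 1/N."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def peak_for_full_scale_sine(n, full_scale=2 ** 15, cycles=4):
    """Largest bin magnitude for a full scale sine with a whole number of cycles."""
    sine = [full_scale * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]
    return max(scaled_dft_magnitude(sine))


print(round(peak_for_full_scale_sine(32)))   # 16384
print(round(peak_for_full_scale_sine(64)))   # 16384
```

The peak is the same for both sizes and equals half the sine wave's peak amplitude, because the other half sits in the mirrored negative frequency bin.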

You can easily apply a constant multiplier to get any proper scale factor that you need to the FFT result. Remember that the input signal, if it is a sine wave will probably be measured as a peak to peak value, whereas the FFT Magnitude display shows the RMS value of the input signal at each specific frequency bin.

Also remember that any windowing applied to the time series, or zero padding, will affect the amplitude of the resulting FFT output and this must be accounted for by proper scaling. You can check out some previous articles for more about how to do that [4] [5].
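As a small illustration of what that scaling involves, the Python sketch below computes the two usual correction factors for a Hann window: the coherent gain, which corrects signal amplitudes, and the equivalent noise bandwidth, which corrects noise measurements. The function names are hypothetical, and the periodic form of the Hann window is assumed.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def coherent_gain(window):
    """Average window value. Divide signal amplitudes by this."""
    return sum(window) / len(window)


def noise_bandwidth_bins(window):
    """Equivalent noise bandwidth in FFT bins. Used to correct noise power."""
    n = len(window)
    return n * sum(w * w for w in window) / sum(window) ** 2


w = hann(1024)
print(round(coherent_gain(w), 3))          # 0.5
print(round(noise_bandwidth_bins(w), 3))   # 1.5
```

For this window, a sine wave reads half its true amplitude until it is divided by the coherent gain, and noise appears as if each bin were 1.5 bins wide.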


It is simply unbelievable what we can do now with a single chip micro-controller when they cost less than lunch and run at 200 MHz. It wasn't too long ago that I had a 33 MHz, 386 Desktop PC and this little 32 Bit PIC can do a FFT faster than that PC could!

Article References

[1] Lock-In-Amplifiers, Wikipedia

[2] Q / Fixed Point Notation formats, Wikipedia

[3] Most of the DSP library that really needs speed is provided in object file format or coded in assembly already. Using the 'Pro' level compiler will probably not increase the speeds reported here.

[4] Hageman, Steve, EDN Online, June 19, 2012, “Understanding DFT & FFT Implementations”

[5] Hageman, Steve, EDN Online, August 6, 2015, “Real Spectrum Analysis with Octave & MATLAB”

Article By: Steve Hageman

Friday, February 5, 2016

Decoupling RF Circuits - Part 1

Several weeks ago I saw an RF circuit that scared me from the “Measurement Repeatability” standpoint. The circuit in question had RF at one end and DC at the other. That is simple enough, and any number of modern devices contain such circuits, whether it be a Bluetooth module, a PLL Synthesizer IC, a simple RF switch or, as in this case, one of those very useful little RF Power to DC converter IC's.

The design intent here was to make a decent RF power measurement at the input to the circuit and then digitize the output and pass that result on to a controller. The design was straightforward enough. The board was laid out with a reasonable SMA edge launch connector and passable RF trace design to the IC. What caught my eye however was the DC portion of the circuit layout. This should have been the easy part, but as it turns out the physical implementation of the DC circuit had a very large impact on the circuit's RF performance.

Figure 1 – A simplified representation of the physical layout is shown above. It is straightforward enough, but as it turns out the DC portion of this particular circuit had a very large impact on its RF performance.

The DC output trace looked exactly like the RF input trace. Now the RF input trace would be properly terminated by whatever is connected to the SMA input on one end and by the IC + Matching network on the other end. If you remember your transmission line theory, a matched transmission line will have a low VSWR (also known as low reflection) and exhibit very little amplitude ripple due to mismatch and hence will have low measurement uncertainty.

An improperly matched transmission line will act like a resonant structure with the potential for an amazingly large Q. The DC output trace was clearly the same width as the RF input trace, and it was also longer than the RF input trace (about 0.6” long). Since the operating frequency range of the circuit was upwards of 10 GHz, this means that the supposedly DC output trace will likely resonate in the band of operation. Schematically the equivalent circuit could be modeled as in Figure 2.

Figure 2 – Above is a schematic representation of the circuit and below is the simplified transmission line equivalent. Estimating 0.1 to 0.2 pF of coupling across a small IC die like this seems entirely likely, and at high frequencies this will certainly be a significant amount of coupling. The 3mm QFN also adds effective length to the circuit, so it has to be taken into account as well if one wants to simulate the resonance peaks at the proper frequencies.
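To see why a fraction of a picofarad matters, it helps to compute its reactance. The Python sketch below uses the standard formula for an ideal capacitor, with no parasitics included.

```python
import math


def capacitive_reactance_ohms(c_farads, f_hz):
    """Magnitude of the reactance of an ideal capacitor."""
    return 1.0 / (2 * math.pi * f_hz * c_farads)


print(round(capacitive_reactance_ohms(0.1e-12, 10e9), 1))   # 159.2
print(round(capacitive_reactance_ohms(0.2e-12, 10e9), 1))   # 79.6
```

At 10 GHz, 0.2 pF is only about 80 ohms, which is the same order as the 50 ohm system impedance, so a meaningful fraction of the RF can reach the DC side.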

I hooked the circuit board up to my network analyzer and measured the S11 and found the expected result. The input match was not well controlled and did indeed show all the characteristics of coupling to a high Q structure (Figure 3).

Figure 3 – The actual measured S11 (Input Match) to the circuit showed the expected and undesired coupling. The measured return loss was under 3 dB in a couple of spots. That poor of a match can add quite a bit of amplitude uncertainty to any measurement.
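Return loss figures like the 3 dB and 17 dB quoted in this article can be turned into more intuitive numbers with the standard relations below. This is a generic Python sketch with hypothetical function names. It treats return loss as a positive number of dB.

```python
import math


def reflection_magnitude(return_loss_db):
    """Magnitude of the reflection coefficient for a given return loss."""
    return 10 ** (-return_loss_db / 20.0)


def vswr(return_loss_db):
    """Voltage standing wave ratio for a given return loss."""
    g = reflection_magnitude(return_loss_db)
    return (1 + g) / (1 - g)


def mismatch_loss_db(return_loss_db):
    """Power lost to reflection, in dB."""
    g = reflection_magnitude(return_loss_db)
    return -10.0 * math.log10(1 - g * g)


for rl in (3.0, 17.0):
    print(rl, round(vswr(rl), 2), round(mismatch_loss_db(rl), 2))
# 3.0 5.85 3.02
# 17.0 1.33 0.09
```

A 3 dB return loss means about half the incident power is reflected, while at 17 dB the loss to reflection is under a tenth of a dB.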

To make sure that the coupling was indeed caused by the DC Output trace and not some other problem, I removed the QFN and replaced it with two 100 ohm 0402 resistors connected in parallel to the ground plane on either side of the RF input trace, directly at the QFN Pads. This gives a very low inductance and capacitance 50 ohm load for analyzing just the RF input trace. I also modified the input matching to account for this. The results of just measuring the RF input trace are shown in Figure 4.

Figure 4 – Repeating the S11 measurement with the IC replaced by two 100 Ohm 0402 resistors in parallel directly at the QFN input pins shows that the RF portion of the circuit was quite well behaved and actually had an outstanding match all the way up to 10 GHz.

Repeating the measurement a third time with just a 0.2 pF capacitor bridged across the RF trace to the DC output showed essentially the same results as figure 3, thus confirming the cause of the bad match as the DC output trace resonating. A simulation of the transmission line equivalent circuit of Figure 2B also showed about the same response as the actual measurements as is shown in Figure 5.

Figure 5 – Simulating the equivalent circuit of Figure 2B gives essentially the same result as the measured response. The first peak is at about 4 GHz, as was measured. The continued sharp resonances in the simulation are because this simple linear simulation does not include real PCB and circuit losses, hence the simulated Q's are higher than actual.

Decoupling Is The Solution

The basic solution to this problem is to decouple the DC output trace properly.

Step #1 When designing the PCB in the first place, narrow the trace to a more reasonable width for a DC signal. Narrowing this trace from 20 mils to 6 mils changes its impedance from 50 ohms to around 95 ohms. There was simply no reason to have the DC trace the width of a 50 Ohm line other than that it: “Looked Pretty”. By making the trace narrower it will have more loss at high frequencies, and when it is a higher impedance, it will be easier to decouple.

Step #2 Is to keep the RF off the trace as best we can in the first place. This can be accomplished by decoupling the DC output from the trace. This solution can also be hacked on an already built PCB as a band-aid.

Decoupling essentially puts some sort of low pass filter between the IC and the output trace to keep the RF off the trace in the first place. This can be accomplished many ways. Here I will present one of my favorites, simple RC filtering.

By adding a simple and low cost RC Low Pass Filter as close as possible to the IC (Figure 6), RF can be effectively kept off the output trace, greatly reducing coupling.

Figure 6 – Modifying the original circuit by adding R2, R3, C1 & C2 keeps the RF energy at the IC from getting on the long output trace and causing resonances. This is one of my favorite techniques. I size the resistors and capacitors based on the technology that the majority of the rest of the design is using. Normally this means either 0603 or 0402 sized parts.

Figure 7 shows the measurement results of this simple addition when it was hacked on to the test PCB.

In circuits like the one in Figure 6, I usually start with 200 to 511 Ohm resistors, as they are relatively flat over a large frequency range such as this, and the largest C0G capacitor I can find in whatever size the design is using. In this case it was a 1000 pF / 0603 sized part. C0G temperature coefficient parts typically have better RF performance because of lower losses.

The circuit of Figure 6 also adds a voltage divider that must be taken into account in the overall final design, in this case the A/D's gain was adjusted to compensate for the extra drop across the two 511 ohm resistors. Simulation also showed that the resistors can be made about 200 ohms each with nearly the same decoupling result.

Of note: Normally we would like to add a capacitor directly at every DC pin on the IC, but this particular pin is an OPAMP output and we don't want to make the OPAMP unstable with the addition of a capacitor directly at its output.

Finally to test the decoupling circuit itself to make sure that no unruly resonances are occurring there, I connected SMA cables to both sides of the DC output trace and did a S21 through line measurement. The results shown in figure 8 show a nicely damped response with no funny resonances.

Variations on a Theme

There are naturally as many possible variations on making a suitable decoupling network as there are designers and there are many different circuit requirements too. For instance, the RC decoupling circuit of Figure 6 would not work very well for a DC bias line as it would produce a lot of voltage drop when the current gets into the milliamp range. Likewise the 3dB bandwidth of Figure 6 is just a little over 100 kHz, so using this decoupling network on an I2C or SPI line won't work very well either unless the clock frequency is very low. So there are many requirements to consider before selecting the actual circuit values.
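Both of those limits can be checked with a few lines of Python. The sketch below assumes that Figure 6 is a simple R, C, R, C ladder with two equal sections, driven from a low impedance and lightly loaded. That topology is an assumption made for illustration. For such a ladder the response is 1 / (1 + 3sRC + (sRC)^2).

```python
import math


def two_stage_rc_gain(f_hz, r_ohms, c_farads):
    """Voltage gain magnitude of two identical cascaded RC sections."""
    x = 2 * math.pi * f_hz * r_ohms * c_farads
    return 1.0 / math.hypot(1 - x * x, 3 * x)


def two_stage_rc_f3db(r_ohms, c_farads):
    """The -3 dB frequency, from solving x^4 + 7x^2 - 1 = 0 for x = 2*pi*f*R*C."""
    x = math.sqrt((math.sqrt(53) - 7) / 2)
    return x / (2 * math.pi * r_ohms * c_farads)


def series_drop_volts(current_amps, total_r_ohms):
    """DC voltage dropped across the series resistors."""
    return current_amps * total_r_ohms


f3 = two_stage_rc_f3db(511, 1000e-12)
print(round(f3 / 1e3))                              # 117 (kHz)
print(round(series_drop_volts(1e-3, 2 * 511), 3))   # 1.022 (V at 1 mA)
```

With 511 ohms and 1000 pF this gives a bandwidth a little over 100 kHz, and 1 mA through the two resistors already drops about a volt.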

Figure 7 – Adding Figure 6's decoupling to the original circuit tamed the resonance on that long and wide DC trace to a very reasonable level. There is still coupling, but the input match is much better behaved (better than 17 dB everywhere). All this without even touching the actual input trace or match at all.

Figure 8 – S21 Measurement of the through response from one side of the long DC trace to the other shows a nicely damped response with no troublesome resonances anywhere. The reason the response is rising instead of falling is because at high frequencies the 1000 pF decoupling capacitors are actually inductive because of the package parasitics. 0402 Sized parts have less package inductance and for higher frequency circuits may be more appropriate.
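The frequency at which a capacitor turns inductive can be estimated from its series inductance. The Python sketch below uses a hypothetical 0.7 nH for the inductance of an 0603 part and its mounting. That value is an assumption for illustration, not a measurement from this board, and series resistance is ignored.

```python
import math


def self_resonant_hz(c_farads, esl_henries):
    """Series self resonant frequency of a capacitor with parasitic inductance."""
    return 1.0 / (2 * math.pi * math.sqrt(esl_henries * c_farads))


def net_reactance_ohms(f_hz, c_farads, esl_henries):
    """Positive means the part looks inductive, negative means capacitive."""
    w = 2 * math.pi * f_hz
    return w * esl_henries - 1.0 / (w * c_farads)


ASSUMED_ESL = 0.7e-9   # hypothetical 0603 series inductance, in henries

print(round(self_resonant_hz(1000e-12, ASSUMED_ESL) / 1e6))   # 190 (MHz)
print(net_reactance_ohms(4e9, 1000e-12, ASSUMED_ESL) > 0)     # True
```

Under this assumption the 1000 pF part is past its self resonance well below 1 GHz, so over most of the sweep it behaves as a small inductor whose impedance rises with frequency.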

Appendix: Sometimes finding the culprit resonance is not so easy

Sometimes it is not clear on a complex PCB what is causing the problem resonances. In these cases a bit of detective work is required. Remember that a poorly terminated transmission line will resonate at the frequency where its length is a quarter wavelength, and at every quarter wavelength after that it will resonate with the opposite impedance.

Start off by measuring the resonant peak frequency or, if there are multiple peaks, the delta frequency between peaks. Then calculate the transmission line length that will give these frequency peaks and start looking for trace lengths on the PCB that match the calculation. Pretty rapidly you will find the culprit. Sometimes you can also use the "Move your finger around touching things" method.

If your frequency sweep is from a low frequency and your circuit is fine, then suddenly there is a resonance, you are probably seeing a first quarter wavelength effect (Also see Reference [1]).

On the other hand if you are sweeping across some very high frequency band and you see some repeating ripple pattern then you are probably seeing a one half wavelength effect. At every half wavelength the impedance of "whatever" that is resonating will circle around to the same impedance again, hence giving rise to a repeating pattern.
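The length calculation for both cases can be sketched in Python as shown below. The effective dielectric constant of 3.3 is an assumed typical value for 50 ohm microstrip on FR4, not a figure from this article. A cable or a different stackup needs its own value.

```python
import math

C_INCHES_PER_NS = 11.8029   # speed of light in free space, inches per nanosecond


def length_from_first_resonance_in(f_ghz, eps_eff=3.3):
    """Line length whose first (quarter wave) resonance falls at f_ghz."""
    return C_INCHES_PER_NS / (4 * f_ghz * math.sqrt(eps_eff))


def length_from_ripple_spacing_in(delta_f_ghz, eps_eff=3.3):
    """Line length that gives a ripple pattern repeating every delta_f_ghz."""
    return C_INCHES_PER_NS / (2 * delta_f_ghz * math.sqrt(eps_eff))


print(round(length_from_first_resonance_in(4.0), 2))   # 0.41 (inches)
print(round(length_from_ripple_spacing_in(0.5), 1))    # 6.5 (inches)
```

The result is an electrical length, so pads, vias and the package itself count toward it. A trace that looks somewhat shorter than the calculated value can still be the culprit.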

In practice I have found problems ranging from a 7 inch cable that ran off the PCB and then back on again, causing a resonance, to traces like the ones on this example PCB that were a few tenths of an inch long.

Some common approximate frequencies and quarter wavelength for Microstrip on FR4 are given in the table below,

  The Table can be written as a formula like,

Where the Microstrip quarter wavelength on FR4 is in inches and F is the frequency in GHz.

Figure A.1 – An example of the kind of repeating pattern that can be seen on a transmission line. Here an open circuited line was simulated. The Y axis here is Impedance Magnitude in Ohms and the X Axis is frequency.

The blue arrows on the bottom mark off ¼ wavelengths. As can be seen, an open circuit transmission line resonates every ¼ wavelength but the impedance at each ¼ wavelength goes from high impedance to low impedance and back again.

The red arrows at the top mark off ½ wavelength segments. On this open circuited line the ½ wavelength repeating pattern is at a very high impedance and there is also a repeating ½ wavelength pattern at low impedances (the peaks on the bottom of the graph, not marked).

As a note, a short circuited line has exactly the same pattern, it is just reversed in impedance. That is, instead of starting at a high impedance at low frequencies, it starts at a low impedance and the first ¼ resonant peak is at a high impedance, not the low impedance shown here. Then the pattern repeats the same as for this simulation.

The simulated response shown in Figure A.1 is for some randomly selected line length. Note that a longer line would produce peaks closer together while a shorter line would produce peaks that are farther apart.


[1] A resonance effect on a RF circuit can also be caused by any shielding or box that the circuit is put into. You can read more about that here,

By: Steve Hageman