MCUs Team with FPGAs to Boost Embedded Design Performance

By Lee H. Goldberg

Contributed by Electronic Products

Dramatic improvements in FPGA cost, performance, and power consumption are winning them a place alongside microcontrollers in a growing number of embedded systems. Designers on tight schedules (and even tighter budgets) are discovering that the recent crop of value-priced, medium-density FPGAs provides a cost-effective way to augment embedded MCUs’ capabilities. Their mix of programmable logic and high-performance interface circuits is used to add custom I/O, network connections, and smart peripherals in industrial, commercial, and medical systems (Figure 1). Meanwhile, their larger, higher-powered brethren have found nearly as many applications as outboard processors or algorithmic accelerators in high-performance imaging, networking, and wireless systems. In these applications, smart designers are also using programmable logic to gain strategic advantages in managing their product lines’ life cycles.

Image of Altera medical imaging system block diagram

Figure 1: This block diagram of a medical imaging system illustrates how FPGAs can augment an embedded system’s host processors with cost-effective, high-performance I/O expansion, signal processing, and algorithmic acceleration capabilities. (Courtesy of Altera Corp.)

In this article, we will explore how MCUs can team with FPGAs to change how embedded products are designed at all levels of the embedded ecosystem.


Programmable logic devices (PLDs) and their denser, faster cousins, field-programmable gate arrays (FPGAs), have played important roles in embedded systems design since before the term “embedded system” entered the engineering lexicon. Back in the days when 16-bit processors and 64 Kbyte memories sat at the top of the silicon food chain, designers relied on them as so-called “glue logic” to stitch together the timing and control signals between the CPU and its associated components. PLDs and even EPROMs (remember them?) were also used to implement custom decoder and lookup functions. Until recently, however, large FPGAs were relatively power-hungry, expensive beasts, used primarily in prototypes and high-ticket, low-volume products, or as a quick way to offload well-defined, compute-intensive operations from an overburdened processor.

This changed as FPGA makers moved production to increasingly fine sub-micron processes. As the devices’ logic density grew and power consumption decreased, it became increasingly cost-effective to use programmable solutions in higher-volume applications (Figure 2). Although a function implemented in an FPGA remains larger, slower, and more power-hungry than an equivalent dedicated cell-based design at the same process node, FPGA manufacturers’ rapid adoption of aggressive sub-micron process nodes keeps their devices a step or two ahead of most ASICs and ASSPs. Each generation’s lower power consumption and cost per logic element allow FPGAs to win new applications and larger markets.

Image of Xilinx FPGAs

Figure 2: As FPGAs crossed the sub-micron threshold, their higher logic density and reduced power began to make them increasingly practical for mid- to high-volume applications. (Courtesy of Xilinx.)

FPGAs as co-processors

Embedded designers are using FPGAs to augment an MCU’s processing capabilities by offloading work to the FPGA’s logic fabric, which can host DSP elements, hardware acceleration logic for encryption and error correction, and other application-specific cores. Even powerful ARM-based MCUs such as Freescale’s Kinetis series or NXP’s LPC2/3 family require either an outboard DSP or an FPGA-based accelerator to parallelize tasks like matrix processing, image/video compression, or encryption/decryption using AES and other crypto schemes. Thanks to their price/performance ratio and reconfigurability, FPGAs are now commonly used for up/down conversion, mod/demod, envelope control, and other functions in software-defined radios.
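From the MCU's side, an offload of this kind often comes down to writing a few memory-mapped registers and waiting for a completion flag. The C sketch below illustrates that pattern under stated assumptions: the register layout, bit positions, and the names `accel_regs_t` and `accel_run` are hypothetical and do not describe any particular vendor's core. The register block is passed in as a pointer so the same routine can be exercised against a plain structure in RAM on a development host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block exposed by an FPGA-hosted accelerator.
 * The layout and bit assignments are illustrative assumptions only. */
typedef struct {
    volatile uint32_t ctrl;      /* write START to launch a job       */
    volatile uint32_t status;    /* DONE / ERR flags set by the FPGA  */
    volatile uint32_t src_addr;  /* address of the input buffer       */
    volatile uint32_t dst_addr;  /* address of the output buffer      */
    volatile uint32_t length;    /* job size in bytes                 */
} accel_regs_t;

#define ACCEL_CTRL_START   (1u << 0)
#define ACCEL_STATUS_DONE  (1u << 0)
#define ACCEL_STATUS_ERR   (1u << 1)

typedef enum {
    ACCEL_OK      =  0,
    ACCEL_ERROR   = -1,
    ACCEL_TIMEOUT = -2
} accel_result_t;

/* Program one job, start it, and poll for completion.
 * max_polls bounds the wait so a stalled accelerator cannot hang the MCU. */
accel_result_t accel_run(accel_regs_t *regs, uint32_t src, uint32_t dst,
                         uint32_t len, uint32_t max_polls)
{
    assert(regs != NULL);

    regs->src_addr = src;
    regs->dst_addr = dst;
    regs->length   = len;
    regs->ctrl     = ACCEL_CTRL_START;

    for (uint32_t i = 0; i < max_polls; ++i) {
        uint32_t s = regs->status;
        if (s & ACCEL_STATUS_ERR) {
            return ACCEL_ERROR;
        }
        if (s & ACCEL_STATUS_DONE) {
            return ACCEL_OK;
        }
    }
    return ACCEL_TIMEOUT;
}
```

On real hardware, `regs` would point at the accelerator's base address in the MCU's memory map, and a production design would more likely wait on an interrupt than spin in a polling loop.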

Many embedded automotive systems are putting FPGAs to work in infotainment and in safety applications such as lane departure warning systems. In a vehicle’s combined entertainment and information systems, programmable logic serves as a companion to the host processor, providing media processing, graphics acceleration, and vehicle networking functions (Figure 3). FPGAs also perform key roles in lane departure sensing. By providing the high-speed image processing and radar signal processing needed to extract features (such as lane markings and street signs), the device is able to pinpoint the vehicle’s location on the highway. Similar algorithms are used to determine the position and relative velocity of other vehicles, which are then used in driver assist functions (e.g., braking assist, accelerator management, and steering suggestions). FPGAs enable quick-turn development and rollout of these advanced systems, which are too new and evolving too rapidly to be economically implemented using ASICs or ASSPs.
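To give a feel for the feature-extraction work described above, the sketch below shows a 3x3 Sobel gradient filter in C. Sobel is used here only as a familiar example of an edge-detection step; the article does not say which algorithms production lane departure systems use. In an FPGA, each output pixel could be computed in parallel pipeline stages, whereas this software model visits pixels one at a time. The function names and the row-major 8-bit image layout are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Gradient magnitude at an interior pixel (x, y) of a row-major 8-bit image,
 * using the common |Gx| + |Gy| approximation. The caller must ensure that
 * 1 <= x <= width - 2 and 1 <= y <= height - 2. */
uint16_t sobel_magnitude(const uint8_t *img, int width, int x, int y)
{
    const uint8_t *r0 = img + (y - 1) * width + (x - 1); /* row above */
    const uint8_t *r1 = r0 + width;                      /* this row  */
    const uint8_t *r2 = r1 + width;                      /* row below */

    /* Horizontal gradient: right column minus left column. */
    int gx = (r0[2] + 2 * r1[2] + r2[2]) - (r0[0] + 2 * r1[0] + r2[0]);
    /* Vertical gradient: bottom row minus top row. */
    int gy = (r2[0] + 2 * r2[1] + r2[2]) - (r0[0] + 2 * r0[1] + r0[2]);

    return (uint16_t)(abs(gx) + abs(gy));
}

/* Apply the filter to a whole frame. Border pixels are set to 0, and
 * results are clamped to the 8-bit output range. */
void sobel_filter(const uint8_t *in, uint8_t *out, int width, int height)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                out[y * width + x] = 0;
            } else {
                uint16_t m = sobel_magnitude(in, width, x, y);
                out[y * width + x] = (uint8_t)(m > 255 ? 255 : m);
            }
        }
    }
}
```

A strong response along a roughly vertical line in the output frame is the kind of raw feature a later stage could fit to a lane marking.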

Image of Xilinx Spartan-6 FPGA

Figure 3: Serving as a companion to the host processor, a single Spartan-6 FPGA supports audio/video acceleration, graphics subsystem, and vehicle networking functions. (Courtesy of Xilinx.)

FPGAs have a long history in medical applications. The largest devices, such as Altera’s Stratix III and Xilinx’s Virtex-4/5, have been used to process the vast quantities of raw data produced by high-resolution CAT, MRI, and PET imagers. In medical applications like these, FPGAs are also used as front ends for high-speed A/D and D/A converters that interface directly to sensors and transducers, processing the data and transmitting it to the host system across a wide, fast PCIe Gen2 bus.

However, smaller, less costly devices are becoming an integral part of low-cost portable imagers (CT, endoscope, and ultrasound) that can deliver advanced services to rural and urban clinics, as well as health care centers in emerging economies. These applications run scaled-down versions of the algorithms used in high-end equipment, which can be implemented using mid-range, value-oriented FPGAs that include substantial amounts of DSP fabric, such as Altera’s Arria and Xilinx’s Kintex families.

I/O expansion has always been one of the most common applications for FPGAs in embedded systems. More recently, the addition of specialized hardware elements has made it even easier for embedded designers to use their host processor’s PCIe bus to add a custom mix of I/O and networking functionality to their design. The FPGA’s hardware functional blocks and programmable logic can be fashioned to implement the PCIe host interface, as well as connections that add Gigabit Ethernet, SATA, Fibre Channel, or InfiniBand networking capabilities to a Texas Instruments Stellaris MCU (Figure 4). Other FPGA resources can be used to build commonly used I/O connections such as USB, FireWire, HDMI, and DisplayPort, or specialized high-performance system interconnects such as CPRI/OBSAI.

Image of Altera PCIe bus

Figure 4: A high-performance processor’s PCIe bus serves as the interface to an FPGA configured as an I/O companion/expansion device. (Courtesy of Altera Corp.)

FPGA-based I/O and memory expansion is becoming increasingly popular for expanding the capabilities and extending the service life of older designs whose host processors need additional processing power or interface capabilities to address new requirements. In addition to the I/O elements just discussed, most FPGA families offer variants equipped with interfaces that can be configured for most common DRAM/SRAM connections, as well as independent DDR/QDR memory controller elements. Together, they allow the FPGA to buffer and manage the torrents of data flowing through its I/O connections. This technique can also reduce BOM costs for a “legacy system” built around an older or lower-cost processor, because the design can use whatever DDR2/3 RAM is most abundant on the market, or take advantage of the higher performance available from DDR/QDR memories.
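The buffering role described above is essentially that of a FIFO sitting between a fast I/O port and a slower consumer. The C sketch below models a small fixed-depth FIFO in software to show the bookkeeping involved. In an actual design this logic would live in the FPGA fabric, backed by block RAM or external DDR, and the depth and names used here (`FIFO_DEPTH`, `fifo_t`, and so on) are illustrative assumptions rather than any vendor's IP interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative depth; kept a power of two so the running counters
 * below still index correctly after they wrap around. */
#define FIFO_DEPTH 8u

typedef struct {
    uint32_t data[FIFO_DEPTH];
    size_t   head;  /* total number of words written so far */
    size_t   tail;  /* total number of words read so far    */
} fifo_t;

/* Number of words currently held in the buffer. */
size_t fifo_count(const fifo_t *f)
{
    return f->head - f->tail;
}

/* Store one word; returns false (and drops nothing) if the FIFO is full,
 * which is the point where real hardware would assert back-pressure. */
bool fifo_push(fifo_t *f, uint32_t word)
{
    if (fifo_count(f) == FIFO_DEPTH) {
        return false;
    }
    f->data[f->head % FIFO_DEPTH] = word;
    f->head++;
    return true;
}

/* Retrieve the oldest word; returns false if the FIFO is empty. */
bool fifo_pop(fifo_t *f, uint32_t *word)
{
    if (fifo_count(f) == 0) {
        return false;
    }
    *word = f->data[f->tail % FIFO_DEPTH];
    f->tail++;
    return true;
}
```

The full and empty conditions are the interesting part: a full FIFO tells the producer to stall, and an empty one tells the host that there is nothing to fetch yet.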

The next step: MCUs on board

As FPGAs compete against ASSPs and ASICs for socket space in a growing number of mid-to-high-volume markets, manufacturers have rolled out new families of devices whose hardwired processor cores deliver higher performance and even lower total cost of ownership (TCO). In many cases, these so-called SoC FPGAs are available with application-specific combinations of logic elements, processors, memories, and related hardware cores. For example, Microsemi’s SmartFusion devices pair their non-volatile mixed-signal FPGA technologies with an ARM Cortex-M3 32-bit RISC processor (Figure 5). The precision analog current, voltage, and temperature sensing elements, PWM engines, and other real-world interfaces in the SmartFusion A2F200 can be configured to provide multi-axis motor control in industrial, automotive, and aerospace systems.

Image of Microsemi’s SmartFusion series

Figure 5: Microsemi’s SmartFusion series adds the ARM Cortex-M3 processor to Fusion’s mixed-signal FPGA architecture. (Courtesy of Microsemi SoC Products Group.)

Further up the performance spectrum, Xilinx’s recently introduced Zynq-7000 Extensible Processing Platform is equipped with the ARM Cortex-A9 high-performance RISC processor (Figure 6). The Cortex-A9 features a high-efficiency, dynamic-length, multi-issue superscalar, out-of-order, speculative 8-stage pipeline. It is available as either a single-core or configurable multi-core element whose processing power can be tailored to a wide variety of intelligent video, communication, and control applications. Should additional processors be required, soft cores of the 8-bit PicoBlaze or 32-bit MicroBlaze processor can be constructed from the Zynq’s programmable logic.

Image of Xilinx’s Zynq-7000 series

Figure 6: Xilinx’s Zynq-7000 series of FPGA SoCs pairs the powerful Cortex-A9 processor with a heady mix of high-performance programmable logic and dedicated peripheral cores. (Courtesy of Xilinx.)

Altera’s latest FPGA SoCs are also based around the Cortex-A9 processor in several product families, including the value-oriented Cyclone V GX series. Targeted at industrial and automotive applications where production volume is moderate and I/O density is high, Cyclone devices equipped with a single-core A9, 25k programmable logic elements, and an assortment of peripheral cores are available with volume pricing as low as $15. Since Cyclone SoC products also integrate Ethernet, a CAN controller, DSP (for motor control), and other I/O, they are very competitive with conventional multi-chip solutions.

Altera has also created several members of its performance-oriented Arria V series that are equipped with single- and multi-core A9 processors and are aimed at reducing the cost of existing designs in automotive, factory automation, and video processing applications. Since most wireless systems are already based on the ARM Cortex-A9, both Altera’s and Xilinx’s A9-equipped FPGA SoCs are helping drive down the cost and power of infrastructure products as designers recruit them for service in baseband processing, remote up/down conversion, and digital pre-distortion/envelope control.

Disclaimer: The opinions, beliefs, and viewpoints expressed by the authors or forum participants on this website do not necessarily reflect the opinions, beliefs, and viewpoints of Digi-Key Electronics or the official policies of Digi-Key Electronics.

About this author

Lee H. Goldberg

About this publisher

Electronic Products

Electronic Products magazine serves engineers and engineering managers responsible for designing electronic equipment and systems.