Implementing Advanced Video Graphics Capabilities with ARM-Based MCUs

By Warren Miller

Contributed By Hearst Electronic Products

Systems that require advanced video graphics features have been a challenge to implement. Traditionally, these systems required a complex combination of hardware features and software functions to create the higher-level video and graphics needed by the application. The current crop of ARM-based MCUs now offers a comprehensive set of advanced hardware features that typically implement common video and graphics standards as stand-alone blocks, minimizing the need to “hard code” these hardware functions. Additionally, the extensive ARM-based ecosystem provides a host of software support for implementing common application-level functions (for example, to easily build Graphical User Interfaces, or GUIs) with a minimal amount of low-level coding. Extensive reference designs and hardware platforms allow designers to leverage manufacturer-developed routines to simplify the creation of their own custom applications. Let’s look at some of the new video hardware and software capabilities now available in ARM-based MCU families.

Three key elements: input, processing, and display

It is convenient to separate video graphics functions into three key elements: input, processing, and display. The input functions typically allow the device to connect to a variety of sensors and cameras over a wide range of standards. Processing functions support translation of one standard to another, the creation of multi-layer graphics objects for display (for example, as an element in a GUI), and even more complex functions for creating objects with depth and shading attributes. Finally, the display functions convert the graphics elements stored in internal memory into data that can be transmitted to a flat-panel display or a TV. Let’s now look at each of these elements in more detail, using example ARM-based MCUs.

Video graphic input

Most modern ARM-based MCUs include a video-input interface to common image sensors. These interfaces typically store the video input in internal memory using a standard format (such as RGB or YCbCr) that can be processed and then displayed on a standard LCD panel. The Atmel SAMA5D3 series of ARM-based MCUs is a good example that illustrates some desirable video-input features. The main video-input block is the Image Sensor Interface (ISI), which connects to an image sensor using one of two popular methods: hardware synchronization with vertical and horizontal synchronization signals, or synchronization defined by International Telecommunication Union Radiocommunication Sector (ITU-R) Recommendations (such as BT.601/656). The use of BT.601/656 reduces the pin count since vertical and horizontal signals are not needed, but is less flexible than the generic vertical and horizontal hardware synchronization mode. Image sensor data consists of up to 12 bits of parallel data, and resolutions up to 2048 x 2048 are supported. The full input data path is shown in Figure 1 below.

Figure 1: Video input path for Atmel SAMA5D3 MCU (Courtesy of Atmel). 

The timing signals and image sensor input data are shown on the left side of the diagram. Two paths are shown at the middle bottom of the figure: one for the preview path and one for the codec path. The preview path can be configured to convert the input video and display it directly on an output LCD panel (perhaps in RGB format). The 2D image scaler and pixel formatter can be used to implement simple video scaling and clipping if required. The complete path can be supported via interrupts and DMA accesses without getting the CPU involved at all. If video processing is required, the input data can be converted (for example, from RGB to YCbCr) and stored in memory in a packed format using the codec path shown at the bottom of Figure 1. FIFO buffers and an AHB master with scatter-gather capabilities complement the data input capabilities.
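To make the RGB-to-YCbCr conversion mentioned above concrete, here is a minimal software sketch in C. On the SAMA5D3 this work is done by the ISI hardware; the code only illustrates the arithmetic. It assumes full-range 8-bit values and the BT.601 luma weights (0.299, 0.587, 0.114) in 16.16 fixed point. The function names and the Y0 Cb Y1 Cr packing order are assumptions chosen for illustration, not taken from the Atmel documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate result to the 8-bit range 0..255. */
uint8_t clamp_u8(int32_t v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/* Convert one full-range RGB pixel to YCbCr using BT.601 luma weights.
 * Coefficients are scaled by 65536; the +128 chroma offset is folded in
 * before the shift so every intermediate sum stays non-negative. */
void rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b,
                  uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    assert(y != NULL && cb != NULL && cr != NULL);
    const int32_t half   = 1 << 15;    /* rounding term      */
    const int32_t offset = 128 << 16;  /* chroma mid-point   */

    int32_t yy = ( 19595 * r + 38470 * g +  7471 * b + half) >> 16;
    int32_t bb = (-11059 * r - 21709 * g + 32768 * b + offset + half) >> 16;
    int32_t rr = ( 32768 * r - 27439 * g -  5329 * b + offset + half) >> 16;

    *y  = clamp_u8(yy);
    *cb = clamp_u8(bb);
    *cr = clamp_u8(rr);
}

/* Pack two neighbouring RGB pixels (6 bytes: R0 G0 B0 R1 G1 B1) into one
 * 4:2:2 group. The byte order Y0 Cb Y1 Cr is an assumed layout; chroma is
 * the rounded average of the two pixels. */
void rgb_pair_to_ycbcr422(const uint8_t rgb[6], uint8_t out[4])
{
    assert(rgb != NULL && out != NULL);
    uint8_t y0, cb0, cr0, y1, cb1, cr1;

    rgb_to_ycbcr(rgb[0], rgb[1], rgb[2], &y0, &cb0, &cr0);
    rgb_to_ycbcr(rgb[3], rgb[4], rgb[5], &y1, &cb1, &cr1);

    out[0] = y0;
    out[1] = (uint8_t)((cb0 + cb1 + 1) / 2);
    out[2] = y1;
    out[3] = (uint8_t)((cr0 + cr1 + 1) / 2);
}
```

Packing two pixels into four bytes is what makes 4:2:2 storage cheaper than three bytes per pixel: each pixel keeps its own luma sample while the pair shares one Cb and one Cr sample.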

The ability to manage the data path using interrupts and DMA transfers can be critical for improving efficiency in video applications. This allows the ARM CPU to focus on implementing the management functions and providing customized processing for feature differentiation. The Atmel SAMA5D3 is a good example of a device with significant autonomous operations, something you should look for when processing efficiency is critical.
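The pin-count saving of the BT.656 mode mentioned earlier comes from carrying synchronization inside the data stream as four-byte timing reference codes (FF 00 00 XY). The following C sketch shows how such a code could be located and checked in software. It is only an illustration: the ISI performs this detection in hardware, and the type and function names here are hypothetical. The sketch rejects codes whose protection bits do not match and does not attempt error correction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flags carried by the fourth (XY) byte of a timing reference code. */
typedef struct {
    bool second_field; /* F: 0 = first field, 1 = second field        */
    bool vblank;       /* V: 1 = line is in the vertical blanking     */
    bool end_of_line;  /* H: 0 = SAV (start of active video), 1 = EAV */
} bt656_flags;

/* Decode the XY byte. Returns false if the fixed top bit or the four
 * protection bits (P3 = V^H, P2 = F^H, P1 = F^V, P0 = F^V^H) are wrong. */
bool bt656_decode_xy(uint8_t xy, bt656_flags *out)
{
    assert(out != NULL);
    unsigned f = (xy >> 6) & 1u;
    unsigned v = (xy >> 5) & 1u;
    unsigned h = (xy >> 4) & 1u;

    unsigned expected = 0x80u | (f << 6) | (v << 5) | (h << 4)
                      | ((v ^ h) << 3) | ((f ^ h) << 2)
                      | ((f ^ v) << 1) | (f ^ v ^ h);
    if (xy != expected) {
        return false;
    }
    out->second_field = (f != 0);
    out->vblank       = (v != 0);
    out->end_of_line  = (h != 0);
    return true;
}

/* Search buf[start..len) for the FF 00 00 preamble. Returns the index of
 * the XY byte that follows it, or -1 if no complete code is present. */
long bt656_find_code(const uint8_t *buf, size_t len, size_t start)
{
    assert(buf != NULL);
    for (size_t i = start; i + 3 < len; i++) {
        if (buf[i] == 0xFF && buf[i + 1] == 0x00 && buf[i + 2] == 0x00) {
            return (long)(i + 3);
        }
    }
    return -1;
}
```

A receiver would call bt656_find_code repeatedly: pixel data for an active line lies between a valid SAV code and the next EAV code, and the V flag indicates which lines belong to the blanking interval.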

In many cases, the video data arrives already compressed and must be captured in that format rather than as raw pixels. Security cameras, for example, often provide video data in a compressed JPEG format to reduce the bandwidth required for transmitting video data. This can reduce power consumption at the receiving end and eases data bandwidth requirements. Encryption is often used to secure the video data from network intrusion. Let’s consider an example security camera design using the STMicroelectronics STM32F207 MCU as our target platform.
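A quick calculation shows why compression matters for bandwidth. The C sketch below compares an uncompressed stream with a compressed one against a link budget. All of the figures used in the usage note (640 x 480 resolution, 16 bits per pixel, 30 frames per second, a 10:1 compression ratio, and a 100 Mbit/s link) are assumptions chosen for illustration, and the function names are hypothetical. Protocol overhead is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits per second needed to move uncompressed video. */
uint64_t raw_video_bps(uint32_t width, uint32_t height,
                       uint32_t bits_per_pixel, uint32_t fps)
{
    return (uint64_t)width * height * bits_per_pixel * fps;
}

/* Bits per second after compression by an assumed average ratio
 * (for example, ratio = 10 means 10:1). */
uint64_t compressed_video_bps(uint64_t raw_bps, uint32_t ratio)
{
    assert(ratio > 0);
    return raw_bps / ratio;
}

/* True if the stream fits within the raw link rate. */
bool fits_link(uint64_t stream_bps, uint64_t link_bps)
{
    return stream_bps <= link_bps;
}
```

With the assumed figures, raw_video_bps(640, 480, 16, 30) is 147,456,000 bits per second, which exceeds a 100 Mbit/s link, while the same stream at an assumed 10:1 ratio needs about 14.7 Mbit/s and fits comfortably. Real JPEG ratios vary with image content and quality settings.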

The STM32F207 digital camera input is a synchronous parallel interface able to capture high-speed image data from an external CMOS camera module. It supports common formats such as YCbCr4:2:2, RGB565, 8/10/12/14-bit progressive video, and JPEG. JPEG is useful in low-power applications since it is compressed and thus requires less bandwidth to transmit images. The STM32F207 receives JPEG video within the Horizontal Sync portion of a video stream as illustrated in Figure 2. Note that the HSYNC signal varies in width depending on the amount of data required for each JPEG packet. The beginning and end of the stream are bounded by the VSYNC signal for easy identification.

Figure 2: JPEG capture with camera input on STM32F207 (Courtesy of STMicroelectronics). 
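Because the amount of JPEG data per frame varies, software that receives a captured buffer usually needs to determine where the compressed image actually ends. One common approach, sketched below in C, is to look for the standard JPEG start-of-image (FF D8) and end-of-image (FF D9) markers. This is an illustrative sketch rather than STMicroelectronics code; the function name is hypothetical, and it assumes that any bytes after the image in the buffer are padding that does not itself contain FF D9.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Locate one JPEG image inside a capture buffer.
 * On success, *start is the offset of the SOI marker and *length is the
 * number of bytes from SOI through EOI inclusive. The EOI search runs
 * backward from the end so that an EOI belonging to an embedded
 * thumbnail is not mistaken for the end of the main image. */
bool jpeg_find_bounds(const uint8_t *buf, size_t len,
                      size_t *start, size_t *length)
{
    assert(buf != NULL && start != NULL && length != NULL);

    size_t soi = 0;
    bool found = false;
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == 0xFF && buf[i + 1] == 0xD8) {
            soi = i;
            found = true;
            break;
        }
    }
    if (!found) {
        return false;
    }

    /* j indexes the second byte of a candidate EOI marker; the marker
     * must begin after the two SOI bytes, so j must be at least soi + 3. */
    for (size_t j = len - 1; j >= soi + 3; j--) {
        if (buf[j - 1] == 0xFF && buf[j] == 0xD9) {
            *start  = soi;
            *length = j - soi + 1;
            return true;
        }
    }
    return false; /* frame was truncated: no EOI after SOI */
}
```

The returned length is the number of bytes that need to be encrypted and transmitted, which is typically much smaller than the size of the capture buffer.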

Once the JPEG video data is captured, it is stored in memory using a DMA controller to reduce CPU overhead. Prior to transmission, the video data is encrypted to make it difficult for the video to be captured or tampered with by network intruders (how often have you seen security video data “hacked” in caper films?). The STM32F207 has an on-chip cryptographic processor that supports popular cryptographic standards like Triple-DES and AES-256. The cryptographic processor is accessed as a 32-bit AHB peripheral and supports DMA transfers. A random number generator and a secure hash processor are also available to add authentication capabilities so that commands to the security camera and data transmitted from the camera can be proven to be from the expected source.

Once data is fully processed and ready to send, the STM32F207 has an Ethernet controller that supports 10/100 data transfers using an IEEE 802.3-compliant MII to connect to an external PHY. The controller also supports LAN wake-up frames so that low-power modes can be used while waiting for activity on the Ethernet port. Dual 2 KB FIFOs (one for transmit and one for receive) provide sufficient buffer storage to keep efficiency high and thus reduce power consumption. DMA is also used for transferring Ethernet traffic, further reducing CPU overhead. The pervasive use of DMA to move data from input, through processing, to transmission is a common technique in low-power video implementations, and should be central to any design where power efficiency is a key consideration.

Video graphic processing

Video graphic processing is one of the more complex capabilities advanced ARM MCUs and MPUs have added in recent years. MPUs in particular have added hardware acceleration for video processing targeted at creating innovative User Interfaces that can include real-time video. The ability to transform video data from one format to another, and to crop, scale, and color correct for the properties of the target display or for varying lighting conditions, can dramatically enhance the user’s experience. The Freescale i.MX53xx has an on-chip 2D Graphics Processing Unit (GPU) and a separate 3D GPU. The 2D GPU implements a variety of graphics and video processing functions targeting the OpenVG 1.0.1 graphics API and feature set. The 3D GPU is targeted at DirectX9 shading and texturing for constructing advanced graphics images.

The 2D GPU is particularly useful in constructing visually compelling UIs and can use graphic elements created by the 3D GPU. The 2D GPU supports bitmap graphics operations, such as BitBlt, fill, and rasterizer operations on a frame buffer up to 2048 x 2048 in a full range of source and destination bitmap formats (for example, from ARGB4444 to ARGB8888). Three separate source bitmaps for mask, pattern, and alpha layers are provided to simplify the implementation of complex graphics constructs. A vector graphics engine provides polygon and geometry operations in conjunction with the 2D unit. A block diagram of the 2D GPU is shown in Figure 3 below.
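To illustrate what two of these operations involve, the C sketch below shows a software version of a bitmap format conversion (ARGB4444 to ARGB8888) and a per-pixel alpha blend. The 2D GPU performs work of this kind in hardware; the code is not the i.MX53 API, the function names are hypothetical, and the blend assumes straight (non-premultiplied) alpha over an opaque destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand a 16-bit ARGB4444 pixel to 32-bit ARGB8888. Multiplying a 4-bit
 * value by 17 replicates the nibble (0xA -> 0xAA), so 0x0 maps to 0x00
 * and 0xF maps to 0xFF. */
uint32_t argb4444_to_argb8888(uint16_t p)
{
    uint32_t a = (p >> 12) & 0xFu;
    uint32_t r = (p >> 8)  & 0xFu;
    uint32_t g = (p >> 4)  & 0xFu;
    uint32_t b =  p        & 0xFu;
    return ((a * 17u) << 24) | ((r * 17u) << 16) | ((g * 17u) << 8) | (b * 17u);
}

/* Blend one 8-bit channel: result = s*a + d*(1 - a), with a in 0..255
 * and rounding to nearest. */
static uint8_t blend_channel(uint32_t s, uint32_t d, uint32_t a)
{
    return (uint8_t)((s * a + d * (255u - a) + 127u) / 255u);
}

/* Source-over blend of an ARGB8888 source pixel onto an opaque
 * destination pixel. The result is always fully opaque. */
uint32_t blend_over_opaque(uint32_t src, uint32_t dst)
{
    uint32_t a = (src >> 24) & 0xFFu;
    uint32_t r = blend_channel((src >> 16) & 0xFFu, (dst >> 16) & 0xFFu, a);
    uint32_t g = blend_channel((src >> 8)  & 0xFFu, (dst >> 8)  & 0xFFu, a);
    uint32_t b = blend_channel( src        & 0xFFu,  dst        & 0xFFu, a);
    return 0xFF000000u | (r << 16) | (g << 8) | b;
}

/* Blend a row of ARGB4444 source pixels onto an ARGB8888 frame-buffer
 * row, converting the format on the fly. */
void blend_row_4444_over_8888(uint32_t *dst, const uint16_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i] = blend_over_opaque(argb4444_to_argb8888(src[i]), dst[i]);
    }
}
```

Doing this per pixel on the CPU for a full frame is expensive, which is why offloading it to a dedicated block is attractive for UIs that composite several layers.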

Figure 3: Freescale i.MX53xx MPU embedded 2D GPU block diagram (Courtesy of Freescale). 

The 2D GPU accepts a command stream, seen at the top of Figure 3. Commands are separated and sent either to the 2D unit or the vector graphics unit. The 2D unit operates on pixels, gradients, textures, and colors on graphics memory via the memory arbiter. The vector unit generates and operates on geometric shapes also using the memory arbiter to access graphics memory. The arbiter maximizes bandwidth efficiency by prioritizing and combining memory operations.

The architecture of the i.MX53 video graphic system, with separate 2D, vector, and 3D units, is not unusual. Having separate blocks that support different standards makes it easier to create function-specific drivers, middleware, and more complex high-level Application Program Interfaces (APIs), allowing designers to focus on their key differentiators instead of investing time in implementing low-level “housekeeping” functions. Many times these blocks also support low-power modes where an unused element can be powered down to significantly reduce power dissipation. Look for these types of capabilities when video graphic processing is a significant requirement in your designs.

Video graphic output

In some applications, a direct video-input capability is not required; however, it can be critical to have a video-output function. For example, handheld test devices might need both a graphical user interface and the ability to generate video output showing real-time test results graphically. Common examples of these types of functions might be found in ultrasound medical equipment, materials inspection, or telecommunications frequency testing. The NXP LPC4350 dual ARM core MCU is a good example of the type of device well suited for these applications. The LCD panel output controller for the NXP LPC4350 is shown in Figure 4 below.

Figure 4: LCD output on the NXP LPC4350 dual ARM core MCU (Courtesy of NXP). 

The system interface to the frame buffer is via the AHB master shown on the middle left of Figure 4. The controller translates pixel-coded data into the formats required by a variety of possible display devices, including single- or dual-panel Super Twisted Nematic (STN) displays and Thin Film Transistor (TFT) color panels. Display resolutions from 320 x 240 to 1024 x 768 are supported, with up to 24 bits per pixel of true-color, non-palettized color on TFT displays. A RAM palette of 256 entries by 16 bits is available for palettized implementations. Note that separate DMA FIFOs can be used when dual panels are present or can be combined when only one panel is present. This helps improve transfer efficiency and reduces CPU overhead. A hardware cursor is included to simplify Graphical User Interface (GUI) implementations.
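Two pieces of arithmetic sit behind these numbers: how much memory a frame buffer needs at a given resolution and color depth, and how a palettized pixel becomes a color. The C sketch below illustrates both in software. It is not NXP driver code: the function names are hypothetical, the frame buffer is assumed to be tightly packed with no row padding, and the palette is assumed to hold one 16-bit entry for each possible 8-bit index (256 entries), with the bit layout of each entry left to the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for one frame, assuming tightly packed rows that are
 * rounded up to a whole number of bytes. */
uint64_t framebuffer_bytes(uint32_t width, uint32_t height,
                           uint32_t bits_per_pixel)
{
    uint64_t row_bits  = (uint64_t)width * bits_per_pixel;
    uint64_t row_bytes = (row_bits + 7u) / 8u;
    return row_bytes * height;
}

/* Expand a row of 8-bit palette indices into 16-bit color values using a
 * 256-entry lookup table, as a palettized display path does per pixel. */
void expand_palette_row(const uint8_t *indices, size_t count,
                        const uint16_t palette[256], uint16_t *out)
{
    assert(indices != NULL && palette != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = palette[indices[i]];
    }
}
```

Under these assumptions, a 320 x 240 frame at 16 bits per pixel needs 153,600 bytes, while a 1024 x 768 frame needs 786,432 bytes at 8 bits per pixel and 2,359,296 bytes at a packed 24 bits per pixel. The threefold difference is the main reason palettized modes remain useful when memory is limited.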

The dual ARM cores available on the NXP LPC4350 are of particular interest for applications with large or complex GUI functions. One of the ARM cores can be dedicated to managing the user interface and the real-time display of test data. This helps segregate the time-critical functions from the less-critical management or processing functions. For example, a short delay for data processing is much less noticeable than a delay during data display (creating a “ragged” and distracting rollout of test results). Dedicating a CPU to key display functions can also be helpful in optimizing power since it is possible to put the display processor in a low-power mode when the display is not active. You just wake the CPU when the display needs to be updated.

ARM-based ecosystem support

The pervasive use of the ARM CPU has created a very robust ecosystem of support for video-graphic-related functions. Software support includes video codecs, tools for building GUI functions, and even advanced video processing functions. One popular User Interface (UI) builder is Inflexion UI from Mentor Graphics. It enables drag-and-drop creation of compelling UIs that can be targeted for applications running an RTOS and can use a target device’s OpenGL hardware graphics engine for 2D, 2.5D, or full 3D effects.

Another example is emWin from Segger. The emWin middleware system provides an efficient GUI for a graphical LCD. It is provided as “C” source code for easy implementation on ARM-based devices. The system includes a widget library, a window manager, rendering support (with a graphic library, basic fonts, and touch/mouse support), as well as output and memory device drivers. These are just two examples of the many graphic and GUI-oriented elements of the ARM software ecosystem.

Hardware support is abundant with vendors and third parties supplying development platforms on which video-based applications can be developed. The Freescale i.MX53 device has a complete development platform with targeted reference designs for even complex designs, such as wireless-camera interfaces, videophone platforms, or more general-purpose development platforms. Even a simple starter kit for the i.MX53, like the MCIMX53-START-R-ND, is useful in evaluating some of the advanced video graphic features of the i.MX53, including booting Linux, interfacing to a camera, and driving an LCD panel.


Systems that require advanced video graphic features now benefit from the use of modern ARM-based MCUs that implement common video and graphics standards as stand-alone blocks, minimizing the need to “hard code” these hardware functions. Additionally, the extensive ARM-based ecosystem provides a host of software support for implementing common application-level functions and offers hardware platforms to simplify the creation of custom applications. For more information on the MCUs and the video and graphic features discussed here, use the links provided to access product information pages and training modules available on the Digi-Key website.

Disclaimer: The opinions, beliefs, and viewpoints expressed by the various authors and/or forum participants on this website do not necessarily reflect the opinions, beliefs, and viewpoints of Digi-Key Corporation or official policies of Digi-Key Corporation.