Use an Efficient Multicore Processor to Build Smarter Voice-Enabled Products

By Stephen Evanczuk

Contributed By Digi-Key's North American Editors

Demand for more intelligent products has become pervasive in nearly every application area as users grow familiar with, and come to trust, virtual digital assistants such as Amazon Alexa, Google Assistant, Apple Siri, and Microsoft Cortana, among others. Besides offering convenience, these assistants play a growing role in enhancing safety and security in a broad array of products for industrial systems and healthcare applications. For developers, however, the underlying designs for these products often bring conflicting demands for processors able to deliver enhanced performance, optimized cost and footprint, and efficient operation.

This article shows how developers can use a multicore processor—in this case the i.MX 8M Nano from NXP—to meet the broadly diverse processing and interface requirements of emerging smart products in application segments ranging from smart home and industrial automation to medical systems. In particular, this article shows how developers can use this processor to more easily implement next-generation voice-based solutions with advanced audio processing capabilities.

How smart products are evolving

The rapid rise of voice assistant technologies has left users looking for more functionality from smart products. Emerging products not only need to respond to voice commands but also need to embed more intelligence, using a greater variety of input data from sensors, cameras, and other products. It's not enough for smart light switches to turn lights on and off or for dishwashers to operate in response to voice commands. As applications grow in sophistication, their underlying devices will need to support more diverse combinations of sensors, enhanced processing capabilities using artificial intelligence (AI) methods, and 3D graphics displays.

The need for more intelligent products goes beyond a desire for greater convenience. In critical application areas such as industrial automation and healthcare, a device’s ability to proactively alert its users to hazards or pathological conditions can prove essential. A factory worker wearing a hard hat that is able to monitor the immediate surroundings can more readily avoid hazards; an at-risk patient wearing a tiny healthcare monitor that continuously monitors vital signs can receive needed intervention well before a crisis.

These and other smart products impose specific design requirements that are as varied as their target applications, but most share the need for high-performance processing, multimedia capabilities, and secure operation. For developers, these functional requirements combine with the fundamental need for solutions able to scale up to serve more robust applications, while physically scaling down to fit user expectations for reduced size, cost, and power consumption. Based on a heterogeneous multicore architecture, the NXP i.MX 8M Nano applications processor family meets the broad and diverse requirements of designs for emerging smart products.

High-performance cores

The NXP i.MX 8M Nano is the newest member of the NXP i.MX 8M processor family designed to provide a scalable multicore processing platform. For high-end video applications, NXP i.MX 8M flagship processors such as the MIMX8MQ5DVAJZAB provide up to 4K display resolution with hardware decoding of 4K high dynamic range (HDR) video. For 1080p video, NXP i.MX 8M Mini processors such as the MIMX8MM6CVTKZAA provide 1080p hardware decode support. Both the i.MX 8M and 8M Mini series combine up to four Arm® Cortex®-A53 application processor cores with an Arm Cortex-M4F microcontroller core.

In contrast, the NXP i.MX 8M Nano MIMX8MN6CVTIZAA processor combines four Arm Cortex-A53 cores with an Arm Cortex-M7 core, which delivers higher performance than the other cores in the Arm Cortex-M series, including the Cortex-M4F.

Besides their complement of processor cores, i.MX 8M Nano processors support a wide variety of external memory devices and provide a full range of external peripheral interfaces typically required in consumer and industrial applications (Figure 1).

Figure 1: The NXP i.MX 8M Nano processor combines up to four Arm Cortex-A53 application processors, an Arm Cortex-M7 microcontroller, specialized hardware subsystems, and comprehensive external peripheral interfaces typically used in consumer and industrial applications. (Image source: NXP)

The different available variants of the i.MX 8M Nano processor series enable developers to easily meet their specific requirements for cost and performance. For example, the high-performance members integrate an extensive array of specialized subsystems for security, 3D graphics displays, audio processing, and more. Other members of the i.MX 8M Nano series provide options with fewer Cortex-A53 cores, and "Lite" versions that feature reduced graphics capabilities.

All members of the i.MX 8M Nano processor series nevertheless offer the ability to deliver the combination of application performance and real-time capability required in emerging smart products.

Designed to provide high-performance execution of application software, each Arm Cortex-A53 application processor core can operate at clock frequencies up to 1.5 gigahertz (GHz) while working from dedicated level 1 (L1) 32 kilobyte (Kbyte) instruction cache (I-cache), 32 Kbyte data cache (D-cache), and a shared L2 512 Kbyte unified cache. Along with their integrated floating point unit (FPU), these cores support Arm's Neon technology for advanced single instruction multiple data (SIMD) operations used in digital signal processing and other advanced algorithms in data-intensive applications.

For embedded system requirements, the Arm Cortex-M7 microcontroller core runs at frequencies up to 750 megahertz (MHz) while providing high-performance execution of real-time processes that require low latency and deterministic operation. To further speed processing, the core includes an integrated FPU and 256 Kbytes of tightly coupled memory (TCM) used for instructions and data.

For complex real-time processing tasks, however, the ability to rapidly recognize separate interrupt sources can be as critical as raw processing horsepower. In i.MX 8M Nano processors, a generic interrupt controller (GIC) built into each Arm Cortex-A53 core and a nested vectored interrupt controller (NVIC) in the Arm Cortex-M7 core enable fine-grained interrupt handling from nearly 128 distinct interrupt request sources corresponding to core states, timers, peripheral interface events, direct memory access (DMA) operations, specialized hardware processes, and many more.

Heterogeneous multicore processing

Separately, each i.MX 8M Nano processor core provides a robust computing resource. Used together, the processor's multiple cores offer a powerful computing platform well able to manage the conflicting requirements for both real-time performance and application software execution that can confound the design of smart products. A smart product based on this processor can, for example, use the Cortex-M7 core to process audio streams in real time, while using algorithms running on one or more Cortex-A53 cores to analyze the resulting data and provide users with a 3D graphics display of the results.

To reliably perform this kind of coordinated heterogeneous multicore processing, however, a multicore system requires careful orchestration of the processing operations and data exchange between the various cores, specialized hardware blocks, and peripherals. In the i.MX 8M Nano processor this orchestration is built into hardware-based mechanisms for semaphores and messaging typically used by low-level software services in multiprocessing environments.
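The idea behind semaphore-guarded sharing and inter-core messaging can be illustrated with a small software model. The following is only a minimal sketch in Python, not NXP's hardware interface or driver API: the class names `HardwareSemaphoreGate` and `Mailbox`, the core labels "M7" and "A53", and the queue depth are all hypothetical and chosen for illustration.

```python
from collections import deque


class HardwareSemaphoreGate:
    """Toy model of a single semaphore gate shared by two cores (hypothetical)."""

    def __init__(self):
        self.owner = None  # None means the gate is currently unlocked

    def try_lock(self, core_id):
        # A lock attempt succeeds only if the gate is free or already ours.
        if self.owner is None:
            self.owner = core_id
        return self.owner == core_id

    def unlock(self, core_id):
        # Only the core that holds the gate may release it.
        if self.owner == core_id:
            self.owner = None
            return True
        return False


class Mailbox:
    """Toy one-directional message queue between two cores (hypothetical)."""

    def __init__(self, depth=4):
        self.depth = depth
        self.queue = deque()

    def send(self, word):
        # A full mailbox rejects the message; the sender must retry later.
        if len(self.queue) >= self.depth:
            return False
        self.queue.append(word)
        return True

    def receive(self):
        # Returns the oldest pending message, or None if nothing is waiting.
        return self.queue.popleft() if self.queue else None
```

In this model, the real-time core would take the gate before writing a shared audio buffer, release it, and then post a short message telling the application core that new data is ready. The point of doing this in hardware is that the lock and the notification do not depend on either core's software being well behaved.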

In embedded systems, this orchestrated execution also extends to hardware resources such as memory and peripherals. For this task, the processor integrates a dedicated resource domain controller (RDC) designed to ensure safe resource sharing where appropriate, or reliable isolation when needed. As a result, applications software and real-time code can each control resources dedicated to their domain while sharing a set of common resources (Figure 2).

Figure 2: Hardware-based mechanisms in the NXP i.MX 8M Nano processor ensure isolation of resources dedicated to the Cortex-A53 application domain or Cortex-M7 real-time domain, while enabling safe resource sharing where needed. (Image source: NXP)
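Conceptually, domain-based resource control amounts to a permission table that is consulted on every access. The sketch below models that idea in Python under stated assumptions: the class `ResourceDomainController`, the domain labels, and the particular resource assignments are hypothetical illustrations, not the RDC register interface or a recommended partitioning.

```python
class ResourceDomainController:
    """Toy permission table: resource name -> set of domains allowed to use it."""

    def __init__(self):
        self._allowed = {}

    def assign(self, resource, *domains):
        # Replace the access list for a resource with the given domains.
        self._allowed[resource] = set(domains)

    def check_access(self, resource, domain):
        # Unassigned resources are treated as inaccessible in this model.
        return domain in self._allowed.get(resource, set())


rdc = ResourceDomainController()
rdc.assign("MICFIL", "M7")               # hypothetical: dedicated to the real-time domain
rdc.assign("GPU", "A53")                 # hypothetical: dedicated to the application domain
rdc.assign("SHARED_SRAM", "A53", "M7")   # hypothetical: shared by both domains
```

With this table in place, an access to the microphone interface from the application domain would be refused, while both domains could reach the shared memory region.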

Specialized support for smart products

Using only the i.MX 8M Nano processor's multiple cores and resource sharing capabilities, developers can create sophisticated applications for emerging smart products built with voice assistants and 3D graphics. These applications gain a further performance boost while reducing their software footprint thanks to specialized hardware support for smart products built into i.MX 8M Nano processors.

For graphics, the processor's integrated graphics processing unit (GPU) provides 2D and 3D graphics acceleration, and supports standard graphics libraries including Vulkan, Open Computing Language (OpenCL), and Open Graphics Library (OpenGL). An integrated liquid crystal display interface (LCDIF) controller supports displays at 1080p60 (1080 progressive 60 frames per second).

While the on-chip GPU offloads display processing from the cores, another set of hardware subsystems offloads a variety of audio processing tasks that typically slow systems that are based on conventional processors. For processing microphone inputs, the processor's pulse density modulation (PDM) microphone interface (MICFIL) provides a multistage pipeline designed to generate filtered 16-bit pulse code modulation (PCM) data from the 1-bit input received from PDM microphones (Figure 3).

Figure 3: The NXP i.MX 8M Nano processor's interface subsystem for PDM microphone input combines separate hardware pipelines for audio signal processing and voice activity detection. (Image source: NXP)

For a typical voice-based application, designers need only connect a PDM microphone to one of the eight PDM channels supported by the processor. Within the PDM microphone interface subsystem, an input interface combines time-multiplexed PDM data from a pair of microphones to form a lane comprising left and right channels.

In the next stage for each channel, a dedicated programmable decimation filter provides different passbands depending on the desired output rate and one of six quality select (QSEL) settings including high, medium, and low quality, as well as three additional very low quality levels. For example, at a 48 kilohertz (kHz) output rate, the very low quality modes set the filter passband to 10.5 Hz to 11.25 kHz compared to a passband of 21 Hz to 22.5 kHz for the high, medium, and low quality modes. Finally, the results are made available in a separate first-in, first-out (FIFO) buffer for each channel, from which they can be retrieved using interrupts, DMA transfers, or bus access.
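To give a feel for what a PDM-to-PCM decimation stage does, the following Python sketch implements a generic cascaded integrator-comb (CIC) decimator, a filter type commonly used for this job. It is an illustration under stated assumptions, not the MICFIL pipeline itself: the function name `cic_decimate`, the three-stage structure, and the decimation ratio used in the example are all hypothetical choices.

```python
def cic_decimate(pdm_bits, ratio, stages=3):
    """Convert a 1-bit PDM stream into multi-bit samples with a CIC decimator.

    pdm_bits: iterable of 0/1 values at the oversampled rate.
    ratio:    decimation ratio (input samples per output sample).
    stages:   number of integrator/comb pairs (assumed value, not from NXP).
    Returns output samples normalized to the range -1.0 to 1.0.
    """
    integrators = [0] * stages
    comb_delays = [0] * stages
    gain = ratio ** stages  # DC gain of the filter, used for normalization
    out = []
    for n, bit in enumerate(pdm_bits, start=1):
        value = 1 if bit else -1  # map the 1-bit stream to +1/-1
        # Integrator section runs at the input (oversampled) rate.
        for i in range(stages):
            integrators[i] += value
            value = integrators[i]
        # Comb section runs only on every ratio-th sample (the decimated rate).
        if n % ratio == 0:
            for i in range(stages):
                value, comb_delays[i] = value - comb_delays[i], value
            out.append(value / gain)
    return out


# A stream that is mostly ones represents a positive signal level.
pcm = cic_decimate([1, 1, 1, 0] * 64, ratio=16)
```

A dense run of ones in the PDM stream produces output samples near +1.0, a dense run of zeros produces samples near -1.0, and an evenly alternating stream averages out to zero. A 16-bit PCM value can then be obtained by scaling and rounding each normalized sample.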

Hardware-based voice activity detection

In parallel with this audio signal conditioning pipeline, the PDM microphone interface provides a set of hardware voice activity detectors (HWVADs) that monitor the desired microphone input channels. (Note: Figure 3 suggests that each HWVAD is associated with a lane, but the documentation refers to channels, and the name of the VADCHSEL register field supports that reading.) To support HWVAD operation, the processor provides a rich set of device registers that enable developers to define the specific HWVAD configuration needed for their application (Table 1).

VADCICOSR - Voice Activity Detector CIC Oversampling Rate: This bitfield defines the oversampling rate of the CIC filter.
VADCHSEL - Voice Activity Detector Channel Selector: Selects the number of the channel with which the hardware voice activity detector will work.
VADINITT - Voice Activity Detector Initialization Time: Selects the number of frames to be used to initialize the voice detection. During this period the output of the voice activity detector is forced to inactive.
CICOSR - CIC Oversampling Rate: This bitfield defines the oversampling rate of the CIC filter.
CLKDIV - Clock Divider: The CLKDIV field sets the divisor for the MICFIL internal clock.
QSEL - Quality Select: This bitfield defines the actual quality mode of the decimation filter.
VADIE - Voice Activity Detector Interruption Enable: This bit enables interrupts in the PDM interface when a voice activity event has been detected by the hardware voice activity detector (HWVAD).
PDMIEN - PDM Interface Enable: The PDMIEN bit enables the operation of the filters in the module.
VADINITF - Voice Activity Detector Initialization Flag: This flag indicates when the HWVAD is being initialized.
VADIF - Voice Activity Detector Interrupt Flag: This bit indicates when voice activity has been detected by the HWVAD.
VADNDATA - Voice Activity Detector Noise Data: This bitfield is the noise energy or noise envelope calculated by the HWVAD. It can be used by software for further voice activity detection.

Table 1: NXP i.MX 8M Nano processor registers typically used to configure the hardware voice activity detectors integrated in the processor's PDM microphone interface (MICFIL). (Table source: Digi-Key Electronics, based on NXP data)

Based on these register settings, the HWVAD uses built-in voice-detection algorithms to identify voice activity. On detection, the HWVAD generates an interrupt to wake up a core, typically the Cortex-M7, for further processing (Figure 4).

Figure 4: Using configuration settings programmed by the developer, the NXP i.MX 8M Nano processor hardware voice activity detectors allow processor cores to sleep or perform other processing until a voice is detected and further voice processing is required. (Image source: NXP)
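The general principle of voice activity detection can be sketched in software as a comparison between frame energy and a running noise estimate. The Python example below is a generic energy-based detector offered only as an illustration; it is not the algorithm built into the HWVAD, and the function name `detect_voice` and its parameters (`init_frames`, `threshold_ratio`, `noise_alpha`) are hypothetical. The initialization period and noise estimate are loosely analogous in spirit to the VADINITT and VADNDATA fields in Table 1.

```python
def detect_voice(frames, init_frames=3, threshold_ratio=4.0, noise_alpha=0.1):
    """Return one boolean per frame: True when energy rises well above the noise floor.

    frames: list of non-empty lists of normalized audio samples.
    All parameter defaults are illustrative assumptions.
    """
    noise = 0.0
    flags = []
    for index, frame in enumerate(frames):
        energy = sum(s * s for s in frame) / len(frame)
        if index < init_frames:
            # Initialization period: learn the noise floor, force output inactive.
            noise += energy / init_frames
            flags.append(False)
            continue
        active = energy > threshold_ratio * noise
        if not active:
            # Track slow changes in background noise only while no voice is present.
            noise = (1 - noise_alpha) * noise + noise_alpha * energy
        flags.append(active)
    return flags


quiet = [0.01] * 160   # low-level background noise frame
loud = [0.5] * 160     # frame with strong signal energy
flags = detect_voice([quiet] * 5 + [loud] * 2)
```

In a system like the one described here, the transition of the detector output from inactive to active would correspond to the event that raises an interrupt and wakes a sleeping core.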

In a voice assistant application, the core would check the audio stream for the appropriate wake word. If the wake word is detected, the core would typically provide the audio stream to the cloud-based voice assistant services supported by the application.

Besides the PDM microphone interface subsystem, the i.MX 8M Nano processor also provides five synchronous audio interface (SAI) modules that support a number of standard audio formats including Inter-IC Sound (I2S), audio codec 97 (AC97), time division multiplexed (TDM) audio, and Direct Stream Digital (DSD), as well as codec or digital signal processing (DSP) data.

To meet their specific application requirements, developers are often left with the task of converting audio input samples to some other required sample rate and resolution. Rather than using processor cycles to accomplish this common task, the i.MX 8M Nano processor integrates a dedicated asynchronous sample rate converter (ASRC) subsystem.

Capable of simultaneously processing up to 32 audio channels, the ASRC automatically converts source samples to the desired sample rate (between 8 kHz and 384 kHz) and resolution (IEEE single-precision floating point or fixed point format at 16, 20, 24, or 32 bits per sample). In the process, the ASRC converts all input data to a 64-bit IEEE floating point format to ensure accurate up or down conversion of audio sample data as needed to achieve the desired result.
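The two steps involved, converting the sample rate in a floating-point domain and then quantizing to the target resolution, can be illustrated with a short Python sketch. This is a deliberately simplified model that uses linear interpolation; hardware sample rate converters typically use far better filters, and nothing here reflects the actual ASRC design. The function names `resample_linear` and `to_pcm16` are hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample normalized float samples by linear interpolation (simplified model)."""
    if not samples:
        return []
    count = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for n in range(count):
        pos = n * step
        i = min(int(pos), last)
        frac = pos - i
        nxt = samples[min(i + 1, last)]  # hold the final sample at the end
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out


def to_pcm16(samples):
    """Quantize normalized float samples to 16-bit signed integers with clipping."""
    return [max(-32768, min(32767, round(s * 32767))) for s in samples]


# Upsample a short ramp from 8 kHz to 16 kHz, then quantize to 16 bits.
upsampled = resample_linear([0.0, 1.0], 8000, 16000)
pcm16 = to_pcm16(upsampled)
```

Doing the intermediate arithmetic in floating point and quantizing only at the end keeps rounding error from accumulating during the rate conversion, which is the same motivation the text gives for the converter's internal floating-point format.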

Power management using a general power controller

With its extensive integration of processor cores and hardware subsystems, the NXP i.MX 8M Nano processor architecture combines a number of separate power domains and power modes built into the individual cores and subsystems. To manage power for this collection of cores and specialized blocks, the i.MX 8M Nano processor includes a sophisticated general power controller (GPC) designed to coordinate multiple power management features. Within the GPC, the system mode controller (SMC) manages each core's low power mode (LPM) and overall deep sleep mode (DSM), while a power gating time slot controller (PGTSC) manages the power gating features used to reduce system power by disabling power to inactive subsystems (Figure 5).

Figure 5: To enhance system-level power optimization, the NXP i.MX 8M Nano integrates a comprehensive power controller that manages power gating features and low-power modes built into processor cores. (Image source: NXP)

Under either software or hardware control, the GPC uses 20 different time slots in the PGTSC to power up or power down any of the multiple clock-gated power domains in the processor. Here, the time slot controller works sequentially through these time slots, activating any power up or power down requests before proceeding to the next slot. Besides meeting specific power sequencing requirements, this approach allows developers to reduce ramp-up current during system power on, or when waking the system from a low power or deep sleep mode.
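The sequential time slot behavior can be modeled in a few lines of Python. This sketch is only a conceptual illustration: the slot count of 20 comes from the description above, but the function name `run_time_slots`, the request format, and the domain names are hypothetical and do not correspond to actual GPC registers or power domain names.

```python
NUM_SLOTS = 20  # number of time slots described for the PGTSC


def run_time_slots(slot_requests, powered=()):
    """Step through the time slots in order, applying power requests.

    slot_requests: dict mapping slot index -> list of (domain, action) pairs,
                   where action is "up" or "down" (hypothetical format).
    powered:       domains that are already powered at the start.
    Returns the final set of powered domains and an ordered event log.
    """
    for slot in slot_requests:
        if slot not in range(NUM_SLOTS):
            raise ValueError(f"slot {slot} is outside 0..{NUM_SLOTS - 1}")
    state = set(powered)
    log = []
    for slot in range(NUM_SLOTS):
        # All requests in a slot are handled before moving to the next slot.
        for domain, action in slot_requests.get(slot, []):
            if action == "up":
                state.add(domain)
            elif action == "down":
                state.discard(domain)
            else:
                raise ValueError(f"unknown action: {action}")
            log.append((slot, domain, action))
    return state, log


# Hypothetical wake-up sequence that staggers three domains across slots.
state, log = run_time_slots({
    0: [("MEMORY", "up")],
    1: [("GPU", "up")],
    2: [("DISPLAY", "up")],
})
```

Spreading power-up requests across successive slots, rather than issuing them all at once, is what allows the staggered ramp described above.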

Supplying the i.MX 8M Nano processor's multiple power domains is straightforward. Designed specifically to support the NXP i.MX 8M Nano processor, the ROHM Semiconductor BD71850MWV power management integrated circuit (PMIC) provides all the power rails required by the processor as well as other system peripherals. In fact, the BD71850MWV PMIC integrates its own power sequencer, further simplifying safe execution of power up and power down, not only for the processor but also for external memory, sensors, or other devices in the system (Figure 6). For developers, incorporating the BD71850MWV PMIC in a design requires no additional components, beyond the usual decoupling capacitors (not shown).

Figure 6: The NXP i.MX 8M Nano processor's multiple cores and hardware subsystems drive the need for multiple supply rails, but the ROHM BD71850MWV power management integrated circuit (PMIC) provides a ready solution. (Image source: NXP)

Development support

Although the hardware interface requirements for i.MX 8M Nano-based designs are relatively simple, NXP spares developers even this design task when they are evaluating the processor or prototyping new smart products. Providing a fully implemented development kit and reference design for the i.MX 8M Nano processor, the 8MNANOD4-EVK evaluation kit combines the i.MX 8M Nano processor and BD71850MWV PMIC with the Murata Electronics LBEE5HY1MW Wi-Fi/Bluetooth transceiver module to provide a platform for immediate evaluation and prototype development. Along with multiple interface options and associated connectors, the evaluation kit includes a full set of external memory devices including synchronous dynamic RAM (SDRAM), NOR flash, and NAND flash. Using the evaluation kit, developers can explore different operating modes and configurations including boot from external flash or secure boot using signed boot images.

When developers are ready to proceed with their own custom software development, NXP provides drivers, board support packages (BSPs), and middleware designed to work with its own MCUXpresso integrated development environment (IDE), as well as third-party IDEs. For building applications designed to leverage machine-learning methods, developers can turn to the NXP eIQ machine learning software development environment and i.MX-optimized inference libraries such as eIQ for TensorFlow Lite for Cortex-M7-based inference, or NXP's port of the Arm neural network software development kit (NN SDK) for Cortex-A53-based inference.

Conclusion


Propelled by the rapid acceptance of voice assistant products, next-generation smart products face growing expectations for not only better audio support but also for greater performance, enhanced graphics, and power efficient operation. For developers, however, effective system design for these products requires a combination of high-performance applications software execution and low-latency real-time capabilities that has been difficult to achieve within their associated size, cost, and power constraints. With the availability of a scalable family of multicore processors from NXP, developers can readily meet design requirements for smart products in a broad array of application segments including consumer, industrial, and medical, among others.

About this author

Stephen Evanczuk

Stephen Evanczuk has more than 20 years of experience writing for and about the electronics industry on a wide range of topics including hardware, software, systems, and applications including the IoT. He received his Ph.D. in neuroscience on neuronal networks and worked in the aerospace industry on massively distributed secure systems and algorithm acceleration methods. Currently, when he's not writing articles on technology and engineering, he's working on applications of deep learning to recognition and recommendation systems.
