Why and How to Get Started with Multicore Microcontrollers for IoT Devices at the Edge

By Jacob Beningo

Contributed By Digi-Key's North American Editors

Developers of Internet of Things (IoT) devices at the edge are being asked to incorporate an increasingly diverse and processing-intensive range of functions, from communications and sampling sensors to executing machine learning (ML) inferences. At the same time, developers are being asked to maintain or reduce power consumption. What’s needed is a more flexible architectural approach to a core element of their design—the microcontroller—that will allow developers to add features while achieving the optimal balance of performance, functionality, and power consumption.

This architectural approach comes in the form of multicore microcontrollers. These have, as their name suggests, multiple processing cores built into a single package. However, just throwing more cores at the problem won’t solve the issues. Developers need to understand the differences between symmetric and asymmetric multicore processors, how to approach functional partitioning, and how to program them effectively.

This article will introduce the concept of multicore microcontrollers before discussing how developers can leverage multicore microcontrollers to balance performance and energy constraints. Several multicore microcontrollers from STMicroelectronics’ STM32H7 line will be introduced by way of example. The article will also examine several use cases where developers can leverage multicore processing and split the workload between multiple cores.

Introduction to multicore microcontrollers

As mentioned, multicore microcontrollers have more than one processing core. Two configurations are common: symmetric and asymmetric. A symmetric configuration contains two or more identical processing cores; for example, both might be Arm® Cortex®-M4 processors. An asymmetric configuration, on the other hand, pairs different cores, such as an Arm Cortex-M7 with an Arm Cortex-M4, or an Arm Cortex-M4 with an Arm Cortex-M0+. The combinations are many and depend upon application and design requirements.

IoT developers are interested in multicore microcontrollers because they allow them to separate their application into multiple execution domains. Separate execution domains allow precise control of the application’s performance, features, and power needs. For example, one core may be used to interact with a user through a high-resolution display and touch panel, while the second core is used to manage the real-time requirements of the system, such as controlling motors and relays and sampling sensors.

There are many ways that a developer can partition their application, but the two biggest paradigms are to separate the application into:

  • Feature rich/real-time
  • Real-time/secure

In the first paradigm, feature rich/real-time, the system is exactly like the one described in the paragraph above. Feature rich application components, such as the display, ML inferences, audio playback, and memory storage, among others, are all handled by one core. The second core then handles real-time functions such as motor control, sensing, and communication stacks (Figure 1).

Figure 1: One paradigm for application design with multicore microcontrollers is to place the feature rich application components in one core and the real-time components in the second core. (Image source: STMicroelectronics)
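To make the feature rich/real-time split concrete, the assignment of workloads to cores can be written down as a small lookup table. The following is only a sketch: the type names, the workload names, and the function core_for() are hypothetical and are not taken from any STMicroelectronics package. The assignments simply mirror the split described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the two execution domains. */
typedef enum { CORE_FEATURE_RICH, CORE_REAL_TIME } core_t;

typedef struct {
    const char *name;  /* workload name, for illustration only */
    core_t      core;  /* domain the workload is assigned to */
} workload_t;

/* Assignment mirroring the feature rich/real-time split in the text. */
static const workload_t partition[] = {
    { "display",        CORE_FEATURE_RICH },
    { "ml_inference",   CORE_FEATURE_RICH },
    { "audio_playback", CORE_FEATURE_RICH },
    { "storage",        CORE_FEATURE_RICH },
    { "motor_control",  CORE_REAL_TIME    },
    { "sensing",        CORE_REAL_TIME    },
    { "comm_stack",     CORE_REAL_TIME    },
};

#define PARTITION_LEN (sizeof partition / sizeof partition[0])

/* Look up which core owns a workload.
 * Returns 0 and writes *out on success, or -1 if the name is unknown. */
static int core_for(const char *name, core_t *out)
{
    assert(name != NULL && out != NULL);
    for (size_t i = 0; i < PARTITION_LEN; i++) {
        if (strcmp(partition[i].name, name) == 0) {
            *out = partition[i].core;
            return 0;
        }
    }
    return -1;
}
```

Keeping the assignment in one table makes it easy to review the partitioning decision early in the design, before any code is tied to a specific core.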

The second paradigm separates the application into real-time and secure functionality. The first core may handle things like the display, memory access, and real-time audio playback. The second core, on the other hand, may do nothing more than act as a security processor. As such, it would store critical data like device and network keys, and handle encryption, the secure bootloader, and any other features deemed to fall within the secure software category (Figure 2).

Figure 2: Another paradigm for application design with multicore microcontrollers is to place the real-time application components in one core and all the security components in a second core. (Image source: STMicroelectronics)

There are other potential ways to partition a multicore microcontroller’s application space, but these two paradigms seem to be the most popular among IoT developers.

Selecting a multicore microcontroller development board

While multicore microcontrollers are becoming very popular, they are still not quite mainstream and selecting one can be tricky. For a developer looking to work with multicore microcontrollers, it’s best to select a development board that has the following characteristics:

  • Includes an LCD for feature rich application exploration
  • Provides expansion I/O
  • Is low cost
  • Has a well-proven ecosystem behind it, including example code, community forums, and access to knowledgeable FAEs

Let’s look at several examples from STMicroelectronics, starting with the STM32H745I-DISCO (Figure 3). This board is based on the STM32H745ZIT6 dual core microcontroller that comprises an Arm Cortex-M7 core running at 480 megahertz (MHz) and a second Arm Cortex-M4 processor running at 240 MHz. The part includes a double-precision floating point unit and an L1 cache with 16 kilobytes (Kbytes) of data and 16 Kbytes of instruction cache. The discovery board is particularly interesting because it includes additional capabilities such as:

  • An SAI audio codec
  • A microelectromechanical systems (MEMS) microphone
  • On-board QUAD SPI flash
  • 4 gigabyte (Gbyte) eMMC
  • Daughterboard expansion
  • Ethernet
  • Headers for audio and headphones

The development board has a lot of built-in capabilities that make it extremely easy to start experimenting with multicore microcontrollers and really scale up an application.

Figure 3: The STM32H745I-DISCO board integrates a wide range of on-board sensors and memory capabilities that allow developers to test out the dual core microcontrollers running at 480 MHz and 240 MHz. (Image source: STMicroelectronics)

For developers who are looking for a development board that has additional capabilities and far more expansion I/O, the STM32H757I-EVAL may be a better fit (Figure 4). The STM32H757I-EVAL includes additional capabilities over the discovery board such as:

  • 8 M x 32-bit SRAM
  • 1 Gbit twin quad SPI NOR flash
  • Embedded trace macrocell (ETM) for instruction tracing
  • Potentiometer
  • LEDs
  • Buttons (tamper, joystick, wake-up)

These extra capabilities, especially the I/O expansion, can be extremely useful to developers looking to get started.

Figure 4: The STM32H757I-EVAL board provides developers with lots of expansion space, easy access to peripherals, and an LCD screen to get started with multicore applications. (Image source: STMicroelectronics)

Having looked at several development boards, the next step is to outline some recommendations for getting started with a multicore microcontroller application.

How to start that first multicore application

No matter which of the two STM32H7 development boards is selected, there are two main tools that are needed to get started. The first is STMicroelectronics’ STM32CubeIDE, a free integrated development environment (IDE) that lets developers compile their application code and deploy it to the development board. STM32CubeIDE also provides the resources necessary to step through and debug an application, and is available for major operating systems including Windows, Linux, and macOS.

The second tool is STMicroelectronics’ STM32H7 firmware package. This includes examples for the STM32H7 development boards for:

  • Multicore processing
  • Using FreeRTOS
  • Peripheral drivers
  • FatFS (file system)

Developers will want to download the firmware package and become familiar with the examples that are supported by the chosen development board. Specifically, there are two folders that developers will want to pay attention to. The first is the applications folder, which has two examples that show how to use OpenAMP (Figure 5). These examples show how to transmit data back and forth between the microcontroller cores: one core sends data to the other core, which then retransmits it back to the first core. The two examples do this differently. One runs bare metal, without an operating system, while the other uses FreeRTOS.

Figure 5: The STM32Cube_FW_H7 provides several examples that demonstrate how to get started with multicore processing using OpenAMP. (Image source: Beningo Embedded Group)
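The send-and-echo pattern that these examples demonstrate can be modeled on a host PC before touching the hardware. The sketch below does not use OpenAMP or any STMicroelectronics API. It assumes a hypothetical one-slot mailbox per direction (mailbox_t, mbox_send(), mbox_recv(), and echo_step() are all illustrative names) and steps the two "cores" sequentially in a single thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX 32u

/* One-slot mailbox standing in for a shared-memory channel (hypothetical). */
typedef struct {
    char   data[MSG_MAX];
    size_t len;
    bool   full;
} mailbox_t;

/* Post a message; fails if the slot is occupied or the payload is too large. */
static bool mbox_send(mailbox_t *mb, const void *buf, size_t len)
{
    assert(mb != NULL && buf != NULL);
    if (mb->full || len > MSG_MAX) {
        return false;
    }
    memcpy(mb->data, buf, len);
    mb->len  = len;
    mb->full = true;
    return true;
}

/* Take a pending message. Returns its length, or 0 if the slot is empty
 * or the message does not fit in the caller's buffer. */
static size_t mbox_recv(mailbox_t *mb, void *buf, size_t cap)
{
    assert(mb != NULL && buf != NULL);
    if (!mb->full || mb->len > cap) {
        return 0;
    }
    size_t n = mb->len;
    memcpy(buf, mb->data, n);
    mb->full = false;
    return n;
}

/* "Second core": echo whatever arrived on rx back out on tx.
 * Returns true only if a message was received and re-sent. */
static bool echo_step(mailbox_t *rx, mailbox_t *tx)
{
    char tmp[MSG_MAX];
    size_t n = mbox_recv(rx, tmp, sizeof tmp);
    return n > 0 && mbox_send(tx, tmp, n);
}
```

In use, the first core would call mbox_send() on its outgoing mailbox, the second core would run echo_step(), and the first core would then call mbox_recv() on the return mailbox. A real dual-core implementation would also have to deal with notification between the cores and with cache and memory-ordering effects on the shared buffer, all of which this single-threaded model ignores.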

The second set of examples demonstrates how to configure both cores with and without an RTOS (Figure 6). One example shows how to run FreeRTOS on each core, while the other shows how to use an RTOS on one core and run the second core bare metal. There are several other examples throughout the firmware package that demonstrate other capabilities, but these are good choices to get started.

Figure 6: The STM32Cube_FW_H7 provides several examples that demonstrate how to configure an operating system with multicore processors. (Image source: Beningo Embedded Group)

Loading an example project will result in a developer seeing a project layout similar to that shown in Figure 7. As illustrated, the project is broken up into application code for each core. The build configuration can also be set up such that a developer is working with only one core at a time. This can be seen in Figure 7, through the grayed-out files.

Figure 7: An example OpenAMP Ping-Pong project demonstrates to developers how to create a communication channel between the two CPU cores. (Image source: Beningo Embedded Group)

A full description of the example code is beyond the scope of this article, but the reader can examine the readme.txt file that is associated with any of the examples to get a detailed description of how it works, and then examine the source code to see how the inter-processor communication (IPC) is actually performed.

Tips and tricks for working with multicore microcontrollers

Getting started with multicore microcontrollers is not difficult, but it does require that developers start to think about their application’s design a bit differently. Here are a few “tips and tricks” for getting started with multicore microcontrollers:

  • Carefully evaluate the application to determine which application domain separation makes the most sense. It is possible to mix domains on a single processor, but performance can be affected if not done carefully.
  • Take the time to explore the capabilities that are built into the OpenAMP framework and how those capabilities can be leveraged by the application.
  • Download the application examples for the STM32H7 processors and run the multicore application examples for the selected development board. The H747 includes two: one for FreeRTOS and one for OpenAMP.
  • When debugging an application, don’t forget that there are now two cores running! Make sure to select the correct thread within the debug environment to examine its call history.
  • Leverage internal hardware resources, such as a hardware semaphore, to synchronize application execution on the cores.

Developers who start with a well-supported development board and then follow these “tips and tricks” will find that they save quite a bit of time and grief when working with multicore microcontrollers for the first time.


Conclusion

For developers of IoT systems at the network edge, multicore microcontrollers provide the ability to better match and balance functionality, performance, and power consumption per the application’s requirements. Such microcontrollers allow a developer to partition their application into domains such as feature rich/real-time or real-time/secure processing. This separation allows a developer to disable a core to conserve energy when its processing domain is not needed, and to turn it back on when the application needs the extra performance.

As shown, there are several different development boards that can be used to start exploring multicore microcontroller application design and to take full control over an application’s performance and energy profile.


About this author

Jacob Beningo

Jacob Beningo is an embedded software consultant. He has published more than 200 articles on embedded software development techniques, is a sought-after speaker and technical trainer, and holds three degrees, including a Masters of Engineering from the University of Michigan.
