Use Flashless Microcontrollers to Lower System Costs and Increase Performance

By Bill Giovino

Contributed By Digi-Key's North American Editors

With Internet of Things (IoT) networks being called upon to perform more complicated tasks, memory requirements of IoT endpoints have been increasing, especially for endpoints now performing higher levels of computing at the edge. However, on-chip microcontroller Flash memory is limited to about 1 megabit (Mbit), and many high-end IoT endpoints need many times more memory than that.

The conventional solution has been to expand the microcontroller’s program memory with an external Flash chip. However, when memory requirements are as high as 8 Mbits or more, the majority of the program memory ends up being off-chip.

As a result, designers may in many instances be better off dispensing with on-chip Flash entirely and instead using a Flashless (also known as ROMless) microcontroller coupled with an external, high-speed octal SPI eXecute-in-place (XiP) Flash chip. This greatly reduces the cost of the microcontroller and future-proofs the design by allowing for greater scalability. Any concerns about memory access times are mitigated by the emergence of very-high-speed octal memory interfaces.

This article discusses ROMless microcontrollers and external memories, and their evolution into a viable option for IoT endpoints and embedded systems that require large amounts of program memory. It then shows how to apply the concept using ROMless microcontrollers from NXP Semiconductors and an octal SPI XiP Flash chip from Adesto Technologies.

Memory expansion at the IoT edge

Most low- to medium-performance IoT endpoints use a microcontroller to manage the endpoint, storing the firmware in on-chip Flash. Firmware grows as application code, wireless IP communication stacks, and enhanced security code are added. Code expansion can occur during development and as a result of field updates.

Some of these IoT endpoints are now being called upon to perform more compute functions. Instead of transmitting raw or partially processed sensor data over the network to a central computer for processing, these IoT endpoints perform more complex tasks that can include sensor fusion algorithms, data interpolation, pattern or image recognition, and increasingly complex artificial intelligence (AI) computing.

Because the central computer receives only the result of the IoT node’s local processing instead of every byte of the raw sensor data, traffic on the wireless network can be reduced. Because the RF transmitter can be the most significant power draw in an IoT endpoint, edge computing often results in improved battery life for battery-powered endpoints.

Systems that need to be updated in the field face additional memory challenges. Conventional systems require at least twice the estimated program memory space in order to be able to handle such updates. The space must hold both the existing program image and any over-the-air (OTA) update image. Some systems may require three times the estimated program memory space, with the additional memory allocated to a read-only factory firmware image. In case of certain types of system failures, including the detection of hacking or a corrupted firmware image, the system can load the initial factory firmware image to recover.
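The sizing rule described above can be sketched as a small calculation. This is only an illustration: the function name and its parameters are hypothetical, and it assumes each stored image (running, OTA, factory) is reserved the same space as the firmware estimate.

```python
def flash_budget_mbits(firmware_mbits, ota=True, factory_image=False):
    """Return the program memory to reserve, in Mbits.

    Assumption (not from a datasheet): every stored image is given the
    same space as the estimated firmware size.
    """
    images = 1                  # the image the CPU is running
    if ota:
        images += 1             # room to receive an OTA update
    if factory_image:
        images += 1             # read-only factory recovery image
    return firmware_mbits * images


# An 8 Mbit firmware estimate with OTA support and a factory image
# needs three times the space.
print(flash_budget_mbits(8, ota=True, factory_image=True))
```

With these assumptions, even a modest firmware estimate can pass the capacity of on-chip Flash once update and recovery images are included.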

For some applications this memory expansion can quickly exceed the 1 Mbit embedded Flash limit, requiring external memory expansion. Traditionally, the solution has been to add an external parallel Flash memory chip. However, this has the disadvantage of using about 36 external pins on the microcontroller—pins that could otherwise be used for application I/O. This also wastes pc board space and increases the likelihood of electromagnetic interference (EMI) from the board.

SPI program memory expansion

Besides using a parallel bus, program memory can also be expanded using the Serial Peripheral Interface (SPI). While conventional SPI has one data line in each direction, so that a Flash read returns only a single bit per clock, over the years it has grown to support dual and quad data lines, with a corresponding increase in data throughput. This throughput has increased to a point where it has become practical to interface to a large capacity external SPI Flash chip.

For program memory applications, a conventional dual or quad SPI uses a shadow Flash configuration where the external Flash data memory is copied to an embedded static random-access memory (SRAM) that is mapped into program memory space. While this has the advantage of being able to easily expand program memory while improving execution speed by running out of fast SRAM, it also has significant disadvantages. Since the amount of internal SRAM is limited, memory is accessed in a paged mode as Flash memory is swapped into internal SRAM as needed. This bottleneck can be reduced by putting more SRAM on-chip; but since SRAM is one of the most expensive blocks on any semiconductor, this has the disadvantage of greatly increasing the cost of the microcontroller.

A more recent evolution of SPI has been XiP. SPI XiP allows the microcontroller’s CPU to execute firmware code directly out of the external SPI Flash. Program execution speed can be significantly improved by adding a cache to the SPI XiP interface.

The popularity of SPI XiP has resulted in a recent expansion of the interface with eight data lines. This octal SPI XiP interface has increased throughput to a point where it is much faster than running out of on-chip Flash memory—faster than 100 Mbits/s.

SPI memory revolution

This has led to a curious evolution that is a throwback to 30 years ago. Consider a system where there is 1 Mbit of Flash on-chip and 32 Mbits of external program memory Flash accessed by an octal SPI XiP interface. The on-chip program memory is so small by comparison that it raises the question: can the on-chip microcontroller Flash be eliminated while keeping the system cost-effective?

It has long been assumed that a mid-range system with a Flash microcontroller is always more cost-effective compared to one with a Flashless microcontroller with an external Flash chip. Only recently has that changed.

If the on-chip Flash memory is removed, it of course reduces the cost of the microcontroller. But a deeper examination shows that if the Flash is no longer needed, then the process technology features that are only used for Flash can be eliminated as well. This reduces the cost of the manufacturing process, greatly reducing the cost of the microcontroller. This has resulted in a reemergence of what 30 or so years ago was called the “ROMless” microcontroller. Today we call it Flashless.

(Re)introduction to Flashless microcontrollers

A high-performance microcontroller that can take advantage of the speed of external octal SPI XiP Flash is the Flashless MIMXRT1052DVL6B (RT1052) from NXP Semiconductors. The RT1052 is a member of NXP’s i.MX RT1050 processor family and is based on a 600 megahertz (MHz) Arm® Cortex®-M7 with 32 Kbytes instruction cache and 32 Kbytes data cache. The 600 MHz clock speed is achieved by removing the Flash and using a high-speed CMOS process technology that is not limited by internal Flash memory. The RT1052 has a large amount of SRAM, 512 kilobytes (Kbytes), that can be partitioned for program or data memory use.

The microcontroller has a variety of high-end peripherals including an LCD interface, a digital camera sensor interface (CSI) and Pixel Processing Pipeline (PXP) for high-end camera support, an SPDIF interface for digital audio, two USB OTG interfaces, two eMMC/SD Flash card interfaces, two 20-channel analog-to-digital converters (ADCs), and an encryption module. A complete list of peripherals can be seen in the block diagram (Figure 1).

Figure 1: The NXP RT1052 has a wide variety of high-end peripherals, including an SPI XiP serial interface and support for data encryption. (Image source: NXP)

Another option is the NXP MIMXRT1051DVL6B (RT1051). It has all the same features of the RT1052, minus the LCD interface, CSI, and PXP.

The RT1052 has a FlexSPI interface, which can execute code using dual, quad, or octal SPI XiP Flash interfaces. For added firmware security, the microcontroller supports encrypted program memory over SPI XiP. A good example of a compatible octal SPI XiP Flash device is the ATXP032-CCUE-T from Adesto Technologies.

Modern XiP Flash operation

Adesto’s ATXP032-CCUE-T is a 32 Mbit octal Flash chip that supports data transfers of up to 266 megabytes (Mbytes) per second in octal dual transfer rate (DTR) mode—much faster than on-chip microcontroller Flash memory. It requires a single 1.8 volt supply and has a typical octal mode standby current of 35 microamps (µA). It supports a maximum SPI clock of 133 MHz.
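The 266 Mbytes/s figure follows from the 133 MHz clock, the eight data lines, and DTR mode moving data on both clock edges. The sketch below reproduces that arithmetic; the function name is illustrative, and the result is a peak raw rate that ignores command, address, and dummy cycles.

```python
def peak_throughput_mbytes_per_s(clock_mhz, data_lines, dtr):
    """Peak raw transfer rate of an SPI-style bus in Mbytes/s.

    Ignores command, address, and dummy cycles, so sustained
    throughput in a real system will be lower.
    """
    transfers_per_clock = 2 if dtr else 1   # DTR: data on both clock edges
    mbits_per_s = clock_mhz * data_lines * transfers_per_clock
    return mbits_per_s / 8                  # 8 bits per byte


print(peak_throughput_mbytes_per_s(133, 8, dtr=True))    # octal DTR
print(peak_throughput_mbytes_per_s(133, 4, dtr=False))   # quad, single rate
print(peak_throughput_mbytes_per_s(133, 1, dtr=False))   # standard SPI
```

The same calculation shows why each widening of the bus, from one to four to eight data lines, gives a proportional increase in peak throughput at a given clock.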

Figure 2: The Adesto ATXP032-CCUE-T Flash memory uses an octal SPI, I/O0 to I/O7, to interface to a microcontroller. An SRAM write data buffer improves the performance of write-to-Flash operations. (Image source: Adesto Technologies)

During an active Flash read, the ATXP032-CCUE-T typically draws 142 µA/MHz plus 1 milliamp (mA) of overhead. At its top SPI clock speed of 133 MHz in octal mode, that works out to only 19.9 mA.
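The read current at other clock speeds can be estimated from the same two typical figures. The following is a minimal sketch using only the numbers quoted above; the function name and default arguments are illustrative, and actual current depends on conditions specified in the datasheet.

```python
def read_current_ma(clock_mhz, per_mhz_ua=142.0, overhead_ma=1.0):
    """Typical active read current in mA: per-MHz term plus fixed overhead."""
    return per_mhz_ua * clock_mhz / 1000.0 + overhead_ma


# At the 133 MHz maximum clock: 142 uA/MHz * 133 MHz + 1 mA
print(round(read_current_ma(133), 1))
# A slower clock trades throughput for lower read current
print(round(read_current_ma(66), 1))
```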

The ATXP032-CCUE-T also supports standard SPI modes 0 and 3, as well as quad SPI mode. A 256 byte security register has a 128 byte factory programmed unique identifier, as well as 128 bytes of one-time programmable (OTP) memory that can be used to store device information such as an Ethernet media access control (MAC) address or a security key.

The ATXP032-CCUE-T’s memory arrangement is noteworthy. It is partitioned into four banks of 8 Mbits each. The internal logic is configured so that a host microcontroller can execute code out of one bank while programming or erasing another one. The operation is transparent to the host microcontroller and requires no special configuration settings.
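To make the bank arrangement concrete, the sketch below maps a byte offset to a bank and checks whether an update target sits in a different bank from the code being executed. It assumes the four banks are equal, contiguous address ranges in ascending order and that 1 Mbit is 2^20 bits; the actual address map should be taken from the Adesto datasheet. All names are hypothetical.

```python
BANK_COUNT = 4
BANK_SIZE_BYTES = (8 * 1024 * 1024) // 8      # 8 Mbits per bank = 1 Mbyte
DEVICE_SIZE_BYTES = BANK_COUNT * BANK_SIZE_BYTES


def bank_of(offset):
    """Return the bank index (0 to 3) for a byte offset into the Flash.

    Assumption: banks are contiguous and numbered from offset 0 upward.
    """
    if not 0 <= offset < DEVICE_SIZE_BYTES:
        raise ValueError("offset outside the 32 Mbit device")
    return offset // BANK_SIZE_BYTES


def can_update_while_executing(exec_offset, update_offset):
    """True if the update target is in a different bank than the running code."""
    return bank_of(exec_offset) != bank_of(update_offset)


# Running from bank 0 while writing an OTA image into bank 2
print(can_update_while_executing(0x000000, 0x200000))
```

A firmware layout that keeps the running image and the OTA staging area in separate banks lets an update be written without pausing execution.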

The ATXP032-CCUE-T also has three status and control registers for configuring the operating parameters of the device such as low-power mode, enabling or disabling DTR mode, and setting standard, quad, or octal SPI modes (standard SPI is the default). Reading from the status registers can indicate the status of a program or erase operation, the low-power status, and if any of the memory is write protected.

Putting Flashless microcontrollers and external XiP together

Putting the RT1052 and the ATXP032-CCUE-T together is straightforward (Figure 3). On power-up the RT1052 begins executing code out of the 96 Kbytes of on-chip boot ROM. The boot ROM reads the state of 14 boot mode configuration pins that select which of the many RT1052 external memory interfaces to use for program memory. Options include an external eMMC card, a micro SD card, a conventional external parallel interface, or the SPI XiP (in this case, Adesto’s ATXP032-CCUE-T).

Figure 3: The NXP RT1052 Flashless microcontroller has an octal SPI XiP interface that can easily interface to the Adesto ATXP032-CCUE-T. The octal SPI XiP interface operation is transparent to the Arm core. (Image source: Digi-Key Electronics)

The RT1052 boot options also include downloading code from the USB OTG or a UART to be executed from SRAM. Boot mode options can also be set during manufacturing by blowing internal fuses inside the RT1052 instead of using the boot mode configuration pins. Once the octal SPI XiP interface is enabled for program memory execution by the RT1052’s boot ROM, program execution is immediate. The Arm core then executes firmware from the Adesto ATXP032-CCUE-T in the same manner as from an external parallel Flash device or internal Flash.

Because of the high-speed data transfers involved, the octal serial Flash should be placed on the pc board as close as possible to the microcontroller’s octal SPI XiP port. To reduce interference, none of the pc traces should be longer than 120 millimeters (mm). The clock signal should be at a distance of at least three times the width of the pc board traces away from other signals to avoid interference. The I/O [0:7] bidirectional data signals should all be within 10 mm of each other to avoid skew.
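These layout guidelines can be captured as a simple pre-layout check. The sketch below is illustrative only: the function and its parameters are hypothetical, and it reads "within 10 mm of each other" as a limit on the difference between the longest and shortest data trace, which is an interpretation rather than something the guidelines state explicitly.

```python
MAX_TRACE_MM = 120.0            # no trace longer than this
MAX_DATA_MISMATCH_MM = 10.0     # assumed: longest minus shortest data trace
MIN_CLOCK_SPACING_WIDTHS = 3.0  # clock clearance, in trace widths


def check_octal_spi_layout(data_lengths_mm, clock_length_mm,
                           clock_spacing_mm, trace_width_mm):
    """Return a list of guideline violations (empty if none are found)."""
    if len(data_lengths_mm) != 8:
        raise ValueError("expected lengths for I/O0 to I/O7")
    problems = []
    if max(list(data_lengths_mm) + [clock_length_mm]) > MAX_TRACE_MM:
        problems.append("a trace exceeds 120 mm")
    if max(data_lengths_mm) - min(data_lengths_mm) > MAX_DATA_MISMATCH_MM:
        problems.append("data traces differ by more than 10 mm")
    if clock_spacing_mm < MIN_CLOCK_SPACING_WIDTHS * trace_width_mm:
        problems.append("clock is closer than 3 trace widths to another signal")
    return problems


lengths = [40, 41, 42, 40, 39, 43, 41, 40]
print(check_octal_spi_layout(lengths, clock_length_mm=41,
                             clock_spacing_mm=0.5, trace_width_mm=0.15))
```

A check like this is no substitute for signal integrity analysis, but it can catch obvious violations before the board is routed.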


Conclusion

IoT endpoints have increasing memory requirements, driven by the trend toward edge computing and by the need for OTA updates and memory scalability. At some point, designers of these endpoint devices may find the option of using a Flashless microcontroller worth considering.

As shown, advances in Flashless microcontrollers, high-speed interfaces, and octal SPI XiP Flash chips provide developers the option to build high-performance, cost-effective IoT endpoints or embedded systems in lieu of using traditional Flash-based microcontroller approaches.
