Enabling the NVMe SSD Interface on a Xilinx ZCU102 Evaluation Kit

By Design Gateway Co., Ltd.

Contributed By Digi-Key's North American Editors


The Zynq® UltraScale+™ MPSoC family, based on the Xilinx® UltraScale™ MPSoC architecture, integrates a feature-rich 64-bit quad-core or dual-core ARM-based processing system (PS) and Xilinx programmable logic (PL) UltraScale architecture in a single device. Also included are on-chip memory, multiport external memory interfaces, and a rich set of peripheral connectivity interfaces. Of particular interest here are the 16.3 Gbps GTH transceivers, which can support a PCI Express® Gen3 storage device interface such as an NVMe SSD. This article demonstrates how to implement an NVMe SSD (solid state drive) interface on Xilinx’s ZCU102 Evaluation Kit using Design Gateway’s NVMeG3-IP Core, which achieves 2,319 MB/s write and 3,347 MB/s read speeds.

Introduction to the Zynq® UltraScale+ MPSoC ZCU102 evaluation kit

The ZCU102 is a general-purpose evaluation board for rapid-prototyping based on the XCZU9EG-2FFVB1156E MPSoC device. Included on the board are a high-speed DDR4 SODIMM and component memory interfaces, FMC expansion ports, multi-gigabit per second serial transceivers, a variety of peripheral interfaces, and FPGA logic for user customized designs, all of which provides a flexible prototyping platform.

Figure 1: ZCU102 evaluation kit. (Image source: Xilinx Inc.)

The ZCU102 provides programmable logic capabilities for creating state-of-the-art applications such as 5G Wireless, next generation advanced driver-assistance systems (ADAS) and Industrial Internet of Things (IIoT) solutions.

However, applications that require high-performance, high-reliability external data storage, such as NVMe SSDs, need a solution that uses the GTH transceivers to provide a PCI Express® Gen3 compliant interface.

Introduction to NVMe SSD storage

NVM Express (NVMe) defines the interface through which a host controller accesses an SSD over PCI Express. NVM Express optimizes command issue and completion by using only two registers (command issue and command completion). In addition, NVMe supports parallel operation by allowing up to 64K commands within a single queue. The 64K command entries improve transfer performance for both sequential and random access.
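
The two-register mechanism described above can be pictured as a pair of rings, each advanced by a doorbell write. The following is a minimal Python sketch of that idea, not the NVMe register layout: the class name `QueuePair`, its method names, and the dictionary used for a command are illustrative assumptions. The host places a command in the submission ring and advances the tail; the device fetches it and posts a completion that the host later collects.

```python
from collections import deque


class QueuePair:
    """Toy model of one submission/completion queue pair (hypothetical names)."""

    def __init__(self, depth):
        self.depth = depth          # number of slots in the submission ring
        self.sq = [None] * depth    # submission ring
        self.cq = deque()           # completions waiting for the host
        self.sq_tail = 0            # advanced by the host (command issue)
        self.sq_head = 0            # advanced by the device as it fetches
        self.next_cid = 0           # 16-bit command identifier

    def free_slots(self):
        # One slot stays empty so that "full" and "empty" can be told apart.
        used = (self.sq_tail - self.sq_head) % self.depth
        return self.depth - 1 - used

    def submit(self, opcode, lba, blocks):
        """Host side: queue a command and advance the tail (issue doorbell)."""
        if self.free_slots() == 0:
            raise RuntimeError("submission queue full")
        cid = self.next_cid
        self.next_cid = (self.next_cid + 1) % 65536
        self.sq[self.sq_tail] = {"cid": cid, "opcode": opcode,
                                 "lba": lba, "blocks": blocks}
        self.sq_tail = (self.sq_tail + 1) % self.depth
        return cid

    def device_process(self):
        """Device side: fetch every pending command and post a completion."""
        while self.sq_head != self.sq_tail:
            cmd = self.sq[self.sq_head]
            self.sq_head = (self.sq_head + 1) % self.depth
            self.cq.append({"cid": cmd["cid"], "status": 0})

    def reap(self):
        """Host side: collect completions (completion doorbell)."""
        done = list(self.cq)
        self.cq.clear()
        return done


qp = QueuePair(depth=8)
first = qp.submit("write", lba=0, blocks=8)
second = qp.submit("read", lba=0, blocks=8)
qp.device_process()
print([c["cid"] for c in qp.reap()])
```

Because several commands can sit in the ring before the device is told about them, a deeper queue lets the host keep the SSD busy without waiting for each completion in turn.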

NVMe drives have paved the way for data storage and computing at very high speeds. By using PCI Express® Gen3 technology, modern NVMe SSDs can achieve peak speeds as high as 40 Gbit/s.

An example of an NVMe storage device is shown here.

Implementation of NVMe host controller on the ZCU102

Figure 2: NVMe Implementation. (Image source: Design Gateway)

Conventionally, the NVMe host is implemented with a host processor operating together with a PCIe controller to transfer data to and from the NVMe SSD. The NVMe protocol is implemented in a device driver that communicates with the PCIe controller hardware, a CPU peripheral connected through a very-high-speed bus. External DDR memory is required for data buffering and for the command queues used to transfer data between the PCIe controller and the SSD.

Since a PCIe Gen3 integrated block isn’t available on the XCZU9EG-2FFVB1156E device used on the ZCU102, the conventional implementation approach is not possible.

Design Gateway proposes a solution that uses the NVMeG3-IP Core, as shown in Figure 2, to enable an NVMe SSD interface on a Zynq® UltraScale+™ MPSoC device for which a PCIe integrated block isn’t available. The NVMe interface for the ZCU102 allows a multi-channel RAID system to be built with very high performance and low FPGA resource consumption. The NVMeG3-IP core license includes an example reference design that helps designers reduce development time and cost.
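
To illustrate what a multi-channel arrangement might look like from the addressing side, the sketch below maps a logical block address onto one of several SSD channels. It assumes simple RAID 0 striping, which the article does not specify; the function name `raid0_map` and the stripe size used in the example are hypothetical.

```python
def raid0_map(lba, num_drives, stripe_blocks):
    """Map a logical LBA to (drive index, LBA on that drive) for RAID 0.

    Assumption: consecutive stripes of `stripe_blocks` blocks rotate
    round-robin across `num_drives` drives.
    """
    stripe = lba // stripe_blocks          # which stripe the block falls in
    offset = lba % stripe_blocks           # position inside that stripe
    drive = stripe % num_drives            # stripes rotate across drives
    local_lba = (stripe // num_drives) * stripe_blocks + offset
    return drive, local_lba


# Four channels, 256 blocks per stripe (both values chosen for illustration).
for lba in (0, 256, 1024, 1025):
    print(lba, raid0_map(lba, num_drives=4, stripe_blocks=256))
```

With a mapping of this kind, sequential transfers are spread across all channels, so each SSD interface only has to sustain a fraction of the aggregate rate.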

Overview of Design Gateway’s NVMeG3-IP

The NVMe IP Core with PCIe Gen3 Soft IP (NVMeG3-IP) is designed to access an NVMe SSD without a PCIe integrated block, CPU, or external memory. NVMeG3-IP includes PCIe Gen3 soft IP and 256 Kbyte of memory. This solution is recommended for applications that require ultra-high-speed NVMe SSD storage on a low-cost FPGA that does not contain a PCIe integrated block.

Figure 3: NVMeG3-IP block diagram. (Image source: Design Gateway)

NVMeG3-IP’s features

NVMeG3-IP has many features, some of which are highlighted below:

  • Implements the application layer, transaction layer, data link layer, and some parts of the physical layer to access the NVMe SSD without CPU usage
  • Operates with the Xilinx PCIe PHY IP configured as a 4-lane PCIe Gen3 (128-bit bus interface)
  • Includes 256 Kbyte RAM data buffer
  • Simple user interface via dgIF typeS
  • Supports six commands: Identify, Shutdown, Write, Read, SMART, and Flush (additional commands can be supported as an option)
  • Supported NVMe device:
    • Base Class Code:01h (mass storage), Sub Class Code:08h (Non-volatile), Programming Interface:02h (NVMHCI)
    • MPSMIN (Memory Page Size Minimum): 0 (4Kbyte)
    • MDTS (Maximum Data Transfer Size): At least 5 (128 Kbyte) or 0 (no limitation)
    • LBA unit: 512 byte or 4096 byte
  • User clock frequency must be greater than or equal to the PCIe clock (250 MHz for Gen3)
  • Available reference design:
    • ZCU102 with AB17-M2FMC adapter board
    • KCU105 with AB18-PCIeX16/AB16-PCIeXOVR adapter board
    • VCU118 with AB18-PCIeX16 adapter board
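
The supported-device conditions in the list above can be written down as a simple check. The sketch below is illustrative only: the `DeviceInfo` record and the function names are hypothetical, and the values would normally come from the SSD's PCIe configuration space and its Identify data. It also shows how an MDTS value of 5 corresponds to 128 Kbyte when the minimum page size is 4 Kbyte.

```python
from dataclasses import dataclass

PAGE_SHIFT_BASE = 12  # MPSMIN = 0 means a 2**12 = 4 Kbyte minimum page


@dataclass
class DeviceInfo:
    """Hypothetical container for the fields named in the feature list."""
    base_class: int   # PCI base class code
    sub_class: int    # PCI sub class code
    prog_if: int      # PCI programming interface
    mpsmin: int       # Memory Page Size Minimum
    mdts: int         # Maximum Data Transfer Size (0 = no limit)
    lba_bytes: int    # LBA unit in bytes


def mdts_bytes(mpsmin, mdts):
    """Largest single transfer in bytes, or None when MDTS = 0 (no limit)."""
    if mdts == 0:
        return None
    page_bytes = 1 << (PAGE_SHIFT_BASE + mpsmin)
    return (1 << mdts) * page_bytes


def is_supported(dev):
    """Apply the conditions from the 'Supported NVMe device' bullets."""
    return (dev.base_class == 0x01          # mass storage
            and dev.sub_class == 0x08       # non-volatile
            and dev.prog_if == 0x02         # NVMHCI
            and dev.mpsmin == 0             # 4 Kbyte pages
            and (dev.mdts == 0 or dev.mdts >= 5)
            and dev.lba_bytes in (512, 4096))


ssd = DeviceInfo(base_class=0x01, sub_class=0x08, prog_if=0x02,
                 mpsmin=0, mdts=5, lba_bytes=512)
print(is_supported(ssd), mdts_bytes(ssd.mpsmin, ssd.mdts))
```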

Design Gateway developed the NVMeG3-IP to run as an NVMe host controller for accessing an NVMe SSD. The user interface and standard features are designed for ease of use without requiring knowledge of the NVMe protocol. An additional feature of NVMeG3-IP is the built-in PCIe soft IP, which implements the data link layer and some parts of the physical layer of the PCIe protocol in pure logic. As a result, NVMeG3-IP can run in an FPGA that does not have a PCIe integrated block by using the built-in PCIe soft IP together with the Xilinx PCIe PHY IP core. The Xilinx PCIe PHY IP is a free IP core that includes a transceiver and a logic equalizer.

NVMeG3-IP supports six NVMe commands: Identify, Shutdown, Write, Read, SMART, and Flush. A 256 Kbyte block RAM is integrated in the NVMeG3-IP to act as a data buffer. The system does not need a CPU or external memory. More details of the NVMeG3-IP are described in its datasheet, which can be downloaded from our website.

FPGA resource usage on the XCZU9EG-2FFVB1156E device is shown in Table 1 below.

Family            Example Device       Fmax (MHz)  CLB Regs  CLB LUTs  CLB   IOB  BRAM Tile  PLL  GTX  Design Tools
Zynq-Ultrascale+  XCZU9EG-2FFVB1156E   300         18982     17109     3690  -    70         -    -    Vivado2017.4

Table 1: Example Implementation Statistics for Ultrascale/Ultrascale+ device

Implementation and performance result on ZCU102

Figure 4 shows an overview of the reference design based on the ZCU102 that demonstrates NVMeG3-IP operation. The NVMeG3IPTest module in the demo system includes the following modules: TestGen, LAxi2Reg, CtmRAM, IdenRAM, and FIFO.

For more details on the NVMeG3-IP reference design, please refer to the NVMeG3-IP reference design document provided on Design Gateway’s website.

Figure 4: NVMeG3-IP reference design overview. (Image source: Design Gateway)

The demo system is designed to write data to the NVMe SSD and verify it on the ZCU102. The user controls the test operation through a serial console. To interface the NVMe SSD with the ZCU102, an AB17-M2FMC adapter board is required, as shown in Figure 5.
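
The write/verify idea behind the demo can be sketched in software. The example below is not the TestGen logic from the reference design: the test pattern (the 64-bit LBA repeated across the block), the function names, and the dictionary standing in for the SSD are all assumptions made for illustration.

```python
import struct

LBA_BYTES = 512  # one of the two LBA unit sizes the IP core supports


def make_block(lba):
    """Hypothetical test pattern: the 64-bit LBA repeated to fill one block."""
    return struct.pack("<Q", lba) * (LBA_BYTES // 8)


def write_test(storage, start_lba, count):
    """Write the pattern for each LBA in the range into the stand-in storage."""
    for lba in range(start_lba, start_lba + count):
        storage[lba] = make_block(lba)


def verify_test(storage, start_lba, count):
    """Return the LBAs whose stored data differs from the expected pattern."""
    return [lba for lba in range(start_lba, start_lba + count)
            if storage.get(lba) != make_block(lba)]


ssd = {}                       # dictionary standing in for the SSD
write_test(ssd, start_lba=0, count=16)
print(verify_test(ssd, start_lba=0, count=16))
```

Deriving the pattern from the address means the verify pass needs no stored copy of the written data, which suits a design that has no external memory.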

Figure 5: NVMeG3-IP demo environment set up on ZCU102. (Image source: Design Gateway)

An example test result from running the demo system on the ZCU102 with a 512 GB Samsung 970 Pro is shown in Figure 6.

Figure 6: NVMe SSD read/write performance on ZCU102 by using Samsung 970 PRO S. (Image source: Design Gateway)
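
To put the measured figures in context, the short calculation below compares the 2,319 MB/s write and 3,347 MB/s read speeds quoted in this article with two rough upper bounds: the 4-lane PCIe Gen3 link (8 GT/s per lane with 128b/130b encoding) and the 128-bit user bus at 250 MHz. It is a back-of-the-envelope sketch that assumes 1 MB = 10^6 bytes and ignores packet and protocol overhead; the function names are illustrative.

```python
LANES = 4                 # 4-lane link, as in the feature list
GT_PER_S = 8e9            # PCIe Gen3 signalling rate per lane
ENCODING = 128 / 130      # 128b/130b line encoding


def link_limit_mb_s():
    """Raw payload capacity of the Gen3 x4 link in MB/s (no TLP overhead)."""
    bits_per_s = LANES * GT_PER_S * ENCODING
    return bits_per_s / 8 / 1e6


def bus_limit_mb_s(width_bits=128, clock_hz=250e6):
    """Capacity of the user data bus in MB/s at the minimum user clock."""
    return width_bits * clock_hz / 8 / 1e6


def efficiency(measured_mb_s):
    """Measured throughput as a fraction of the raw link limit."""
    return measured_mb_s / link_limit_mb_s()


print(f"link limit : {link_limit_mb_s():.0f} MB/s")
print(f"bus limit  : {bus_limit_mb_s():.0f} MB/s")
print(f"write      : {efficiency(2319):.1%} of link limit")
print(f"read       : {efficiency(3347):.1%} of link limit")
```

Under these assumptions the user bus at 250 MHz is slightly faster than the link itself, which is consistent with the requirement that the user clock be at least as fast as the PCIe clock.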


The NVMeG3-IP Core provides a solution for enabling an NVMe SSD interface on the ZCU102 evaluation kit, as well as on other devices in Xilinx®’s Zynq® UltraScale+™ MPSoC family where a PCIe integrated block isn’t available. NVMeG3-IP is designed with the goal of achieving the highest possible performance with the lowest possible FPGA resource usage for NVMe SSD access without requiring a CPU. It is well suited to high-performance NVMe storage without CPU intervention, and it can implement multiple NVMe SSD interfaces by using GTH transceivers, with no limit imposed by the number of PCIe integrated blocks available on the FPGA device.

