Analysis of IPU Roadmap Based on FPGA
Big data has driven the rise of architectures built around diverse forms of compute. The DPU has arrived at just the right time and has become a battleground for both industry giants and startups. Various companies have launched DPU solutions, claiming they can cut the "data center tax" and help data centers respond more efficiently to diversified compute demands. So, does the DPU live up to its name, or is it just hype? This article walks into the Intel lab to find out.
DPUs currently take three main technical forms: SoC (an Arm-plus-ASIC co-design), FPGA, and ASIC. FPGA-based DPUs offer the best flexibility, but their throughput and power consumption are limited to some extent; as bandwidth requirements grow, demand for ASIC-based DPUs will increase; and for more complex and broader use cases, SoC-based DPUs offer a good price/performance ratio, are easy to program, and are highly flexible.
In June 2021, Intel first proposed the concept of the IPU (Infrastructure Processing Unit), and at the "Intel Vision 2022" conference it announced its latest roadmap, laying out the overall IPU plan from 2022 to 2026. The IPU is a programmable network device designed to let cloud and communication service providers reduce CPU overhead and unlock the full performance of their servers. Because its functions and application scenarios overlap with those of the DPU, it can be seen as Intel's take on the DPU.
The figure above shows the functional division of the IPU. In the block diagram on the left, the Processor Complex and/or FPGA provide software and hardware programmability and run ISP/CSP-specific services; Infrastructure Acceleration offloads storage virtualization, security (encryption and decryption), network virtualization, and other infrastructure requests; and the Network block provides high-bandwidth packet processing, packet parsing, and related capabilities.
Intel IPU Roadmap
Currently the Intel IPU has two main product lines: an FPGA-based solution and an ASIC-based solution. Intel's first publicly disclosed IPU was codenamed Big Spring Canyon. It is FPGA-based, pairs the FPGA with a Xeon D CPU and Ethernet connectivity, and provides a hardware-programmable data path. Its successor, Oak Springs Canyon, is based on Intel's Agilex FPGAs and Xeon D SoCs and offloads network virtualization functions for workloads such as Open vSwitch (OVS), as well as storage functions such as NVMe over Fabrics (NVMeoF).
Intel's second IPU, codenamed Mount Evans, is Intel's first ASIC IPU. Developed in partnership with Google Cloud, it currently targets high-end and hyperscale data center servers. Mount Evans provides programmable packet-processing engines that support use cases such as firewalls and virtual routing, and it can deploy advanced encryption and compression acceleration via high-performance Intel QuickAssist Technology. Shipments to Google and other service providers are expected to begin in 2022, with broad deployment expected in 2023.
Intel's planned IPU roadmap is as follows:
2022: Launch of 200 Gbps IPU, codenamed Mount Evans and Oak Springs Canyon.
2023/2024: Launch of 400 Gbps IPU, codenamed Mount Morgan and Hot Springs Canyon.
2025/2026: Launch of 800 Gbps IPU.
Big Spring Canyon IPU Live Demo
This demo was built in Intel Labs using an IPU on a 2U Supermicro Ultra server rack.
The card used for the demo features both an Intel Stratix 10 FPGA and an Intel Xeon D-1612.
Below we remove the heatsink assembly to see the Stratix 10 FPGA in the middle and the Xeon D-1612 processor on the right; both have their own memory and local storage, and the Xeon D runs its own operating system.
The Stratix 10 FPGA includes hardened PCIe and Ethernet IP, and the packet processor, virtio, NVMeoF, and other functions run on the FPGA. The Xeon D on this card connects to the host through the FPGA.
Shown next is the NVMeoF demonstration, in which the BSC IPU handles the system's RDMA NVMeoF traffic under the management of the Xeon D SoC. The entire stack is offloaded here, so the host server thinks it is interacting with a normal NVMe device, while the IPU is actually reaching remote storage directly over the network and emulating an NVMe block device to the system.
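As a rough illustration of what "emulating an NVMe block device" means from the host's point of view, the short Python sketch below enumerates NVMe controllers via sysfs and prints their model and transport strings. The sysfs paths are standard Linux; how an IPU-emulated device actually reports itself is an assumption, not something shown in the demo.

```python
# Minimal sketch: list NVMe controllers the way a host OS sees them.
# Standard Linux sysfs paths; the IPU-specific strings are assumptions.
from pathlib import Path

def list_nvme_controllers():
    """Return (name, model, transport) for every NVMe controller in sysfs."""
    controllers = []
    for ctrl in sorted(Path("/sys/class/nvme").glob("nvme*")):
        model = (ctrl / "model").read_text().strip() if (ctrl / "model").exists() else "?"
        transport = (ctrl / "transport").read_text().strip() if (ctrl / "transport").exists() else "?"
        controllers.append((ctrl.name, model, transport))
    return controllers

if __name__ == "__main__":
    for name, model, transport in list_nvme_controllers():
        # Behind the IPU the host still sees an ordinary local controller,
        # even though the blocks ultimately come from the remote NVMeoF target.
        print(f"{name}: model={model!r} transport={transport}")
```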
With the heatsink removed, the back of the card is shown below.
Logging into the Xeon D-1612, we can see a system with 16GB of memory and 4 cores / 8 threads; its storage and the FPGA are also visible.
Below is the lscpu output of the Xeon D.
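The figures quoted above come from the standard `lscpu` utility and `/proc/meminfo`. The small sketch below, which is not part of Intel's demo, shows one way to pull the same core/thread counts and memory size programmatically.

```python
# Sketch only: gather the figures quoted above (4 cores / 8 threads / 16GB)
# from standard Linux interfaces. Not part of the Intel demo scripts.
import subprocess

def cpu_topology():
    """Parse `lscpu` for the logical CPU count and threads per core."""
    info = {}
    for line in subprocess.run(["lscpu"], capture_output=True, text=True).stdout.splitlines():
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return int(info["CPU(s)"]), int(info["Thread(s) per core"])

def memory_gib():
    """Read MemTotal from /proc/meminfo and convert kB to GiB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) / (1024 * 1024)

if __name__ == "__main__":
    logical, tpc = cpu_topology()
    print(f"{logical} logical CPUs, {logical // tpc} cores, ~{memory_gib():.0f} GiB RAM")
```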
Next, the demonstration begins. The first step starts VirtIO and initializes the card, which is automated with a script. The IPU was installed in a rack of 2U Supermicro Ultra servers acting as the IPU host and NVMeoF target systems.
For storage, a total of eight 1.6TB Intel P4610 SSDs are installed in the target system.
Because the FPGA changes the data path when new features are added, it needs to be reprogrammed, and this can be done directly from the Intel Xeon D processor. After completing the steps above, we can look for the SSDs on the host server; the drives on the host system look exactly the same as the drives on the target server. The host server thinks it has standard NVMe devices and does not know that these devices are delivered over a 100GbE fabric using NVMeoF and the IPU.
The Intel BSC IPU connects to the target via RDMA NVMeoF, and the drives are mounted on the host. In the picture, the two terminals at the top belong to the target server, with the six drives connected through the IPU and iostat shown on the right. At the bottom left is the Xeon D-1612 on the IPU, and at the bottom right is the target server with its 8 Intel P4610 1.6TB NVMe SSDs. The IPU's Stratix 10 FPGA connects to the target server and presents the NVMeoF drives to the host as standard NVMe block devices. Now that these drives are visible on the system, let's get started.
Performance
The left side is the fio test, the right side is the iostat monitor, the bottom left is the Xeon D on the IPU card, and the bottom right is the iostat on the target server.
First, we run a 4K random read script, which lands in the 1.2M to 1.4M 4K random read IOPS range; the iostat data can be seen on the right. Next we run a sequential read test, which lands in the 5.5-6GB/s range.
Write IOPS is in the range of 1.3-1.4M.
Sequential writes are again in the 5.5-6GB/s range.
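The exact fio job files used in the lab were not published, so the sketch below only reconstructs plausible equivalents of the two read workloads (4K random and large-block sequential). The target device, queue depth, job count, and block sizes are assumptions, not Intel's actual parameters.

```python
# Hedged reconstruction of the fio workloads described above; the real job
# parameters and target device were not disclosed in the demo.
import subprocess

TARGET = "/dev/nvme1n1"  # assumed: one of the IPU-presented NVMe block devices

def run_fio(name, rw, bs):
    """Run a single fio workload against the emulated NVMe device."""
    cmd = [
        "fio", f"--name={name}", f"--filename={TARGET}",
        f"--rw={rw}", f"--bs={bs}",
        "--ioengine=libaio", "--direct=1",
        "--iodepth=32", "--numjobs=8",        # assumed values
        "--runtime=60", "--time_based", "--group_reporting",
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_fio("randread-4k", "randread", "4k")   # ~1.2-1.4M IOPS in the demo
    run_fio("seqread-128k", "read", "128k")    # ~5.5-6 GB/s in the demo
```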
On iostat, CPU utilization is very low, around 3% for sequential operations and around 10% for random operations. Note that this includes the overhead of fio traffic generation, Ethernet, and NVMeoF. Overall, these are solid numbers.
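To reproduce that CPU-utilization observation, the avg-cpu line of `iostat` (from the sysstat package) can be sampled while the workload runs. The helper below is a simple sketch around the standard iostat output format, not a tool from the demo.

```python
# Sketch: sample iostat's avg-cpu line to confirm the host CPU stays mostly
# idle while the IPU handles the NVMeoF data path. Requires sysstat.
import subprocess

def cpu_busy_percent(interval=1, samples=2):
    """Return %busy (100 - %idle) from the last avg-cpu report of iostat."""
    out = subprocess.run(
        ["iostat", "-c", str(interval), str(samples)],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    # avg-cpu columns: %user %nice %system %iowait %steal %idle
    values = [ln for ln in out if ln.strip() and not ln.startswith(("avg-cpu", "Linux"))]
    idle = float(values[-1].split()[-1])
    return 100.0 - idle

if __name__ == "__main__":
    print(f"Host CPU busy: {cpu_busy_percent():.1f}%")
```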
Beyond what is covered in this demo, there is more FPGA logic available for adding further services. For cloud providers, an IPU can be added in place of local NVMe storage and high-speed NICs. With this solution, storage can be delivered transparently to bare metal or virtual machines, which lets infrastructure providers dynamically allocate storage to each client from a centralized pool without the platform exposing the inner workings of the infrastructure to third parties.
FPGAs can also be used to protect data in transit and shrink the amount of data transferred by running encryption and compression in the data path, further reducing network load.
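In software terms, that transformation amounts to compress-then-encrypt on each payload before it leaves the host. The snippet below sketches the idea with zlib and AES-GCM (via the third-party `cryptography` package) purely for illustration; on the IPU the equivalent work is done by FPGA logic in the data path rather than by host software.

```python
# Illustrative software analogue of compress-then-encrypt in the data path.
# On the IPU this runs in FPGA logic; zlib and AES-GCM are used here only to
# show the transformation, not to mirror the IPU's actual algorithms.
import os
import zlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def protect(payload: bytes, key: bytes) -> tuple[bytes, bytes]:
    """Compress the payload, then encrypt it; returns (nonce, ciphertext)."""
    compressed = zlib.compress(payload, level=6)
    nonce = os.urandom(12)                      # 96-bit nonce for AES-GCM
    ciphertext = AESGCM(key).encrypt(nonce, compressed, associated_data=None)
    return nonce, ciphertext

def unprotect(nonce: bytes, ciphertext: bytes, key: bytes) -> bytes:
    """Decrypt, then decompress, recovering the original payload."""
    return zlib.decompress(AESGCM(key).decrypt(nonce, ciphertext, associated_data=None))

if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)
    data = b"example block payload " * 1024
    nonce, blob = protect(data, key)
    assert unprotect(nonce, blob, key) == data
    print(f"original {len(data)} bytes -> {len(blob)} bytes on the wire")
```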
Final Words
Intel IPUs feature Xeon D CPUs, allowing infrastructure providers to manage the IPU as an infrastructure endpoint and then provide selected services through the FPGA. The FPGA offers a great deal of flexibility in presenting different types of devices to client systems and users. The combination of the FPGA and the Xeon D control plane gives infrastructure providers an easier way to manage complex infrastructure.
Haoxinshengic is a professional FPGA and IC chip supplier in China with more than 15 years in this field. If you need chips or other electronic components, please contact us; we offer highly cost-effective chips from stock and look forward to cooperating with you.
If you want to know more about FPGAs or want to purchase related chip products, please contact our senior technical experts, and we will answer your questions as soon as possible.