Mellanox HDR 200G InfiniBand demonstrates two times higher performance for AI platforms with Nvidia

Gilad Shainer of Mellanox Technologies

Mellanox Technologies, Ltd., a supplier of high-performance, end-to-end smart interconnect solutions for data centre servers and storage systems, has announced that its HDR 200G InfiniBand with “Scalable Hierarchical Aggregation and Reduction Protocol” (SHARP) technology has set new performance records, doubling the performance of deep learning operations.

The combination of Mellanox In-Network Computing SHARP with Nvidia V100 Tensor Core GPU technology and the Nvidia Collective Communications Library (NCCL) delivers leading efficiency and scalability to deep learning and artificial intelligence applications.

Training neural networks on a combination of state-of-the-art Nvidia GPUs, Mellanox’s InfiniBand, GPUDirect RDMA and NCCL has already become a de facto standard when scaling out deep learning frameworks such as Caffe, Caffe2, Chainer, MXNet, TensorFlow, and PyTorch. With Mellanox SHARP technology and HDR InfiniBand, the data aggregation operations of deep learning training can be offloaded to and accelerated by the InfiniBand network, doubling their performance.
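The data aggregation operation referred to here is typically an allreduce: every worker contributes a gradient vector and every worker receives the element-wise sum. The following is a minimal Python sketch of that semantics only, not of SHARP or NCCL themselves. The function name `allreduce_sum` and the sample gradients are hypothetical and chosen purely for illustration.

```python
from typing import List


def allreduce_sum(worker_grads: List[List[float]]) -> List[List[float]]:
    """Toy allreduce: every worker ends up with the element-wise sum.

    Illustrative only. Real systems run this across GPUs and the network;
    this sketch just shows what result each worker is expected to hold.
    """
    if not worker_grads:
        return []
    length = len(worker_grads[0])
    if any(len(g) != length for g in worker_grads):
        raise ValueError("all workers must contribute vectors of equal length")

    # Aggregation step: the reduction that, per the article, SHARP
    # offloads from the hosts to the InfiniBand network.
    total = [sum(column) for column in zip(*worker_grads)]

    # Distribution step: each worker receives its own copy of the result.
    return [list(total) for _ in worker_grads]


# Four hypothetical workers, each holding a two-element gradient.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
result = allreduce_sum(grads)
print(result[0])
```

In a host-based scheme the summation is carried out by the endpoints over several communication steps; the claim in the article is that moving this reduction into the network is what yields the speed-up.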

The results come from a joint effort with Nvidia and testing performed in Mellanox’s performance labs. A Mellanox HDR InfiniBand Quantum switch connected four system hosts, each with eight Nvidia V100 Tensor Core GPUs with NVLink interconnect technology and a single ConnectX-6 HDR adapter. This setup achieved an effective reduction bandwidth of 19.6GB/s by integrating SHARP’s native streaming aggregation capability with Nvidia’s latest NCCL 2.4 library, which now takes full advantage of the bi-directional bandwidth available from the Mellanox interconnect.

This is effectively twice the bandwidth of Nvidia’s current tree-based implementation on the same hardware configuration.

In the more common setup for this configuration, each system host uses four host channel adapters (HCAs) for balanced performance across a variety of workloads. There, the initial SHARP and NCCL results yielded an expected 70.3GB/s.
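As a rough check on how the two reported figures relate, the short Python calculation below compares the four-HCA result with the single-adapter result. It uses only the 19.6GB/s and 70.3GB/s numbers quoted in the article. The "implied tree-based baseline" is an assumption derived by taking "two times" literally, not a measured value from the source.

```python
# Figures quoted in the article (effective reduction bandwidth, GB/s).
SINGLE_HCA_GB_PER_S = 19.6   # one ConnectX-6 HDR adapter per host
FOUR_HCA_GB_PER_S = 70.3     # four HCAs per host
HCA_COUNT = 4

# How much faster the four-HCA setup is than the single-adapter setup.
speedup = FOUR_HCA_GB_PER_S / SINGLE_HCA_GB_PER_S

# Fraction of ideal linear scaling (4x) that this speed-up represents.
scaling_efficiency = speedup / HCA_COUNT

# Assumption: "two times higher" is read as exactly 2x, so the tree-based
# baseline on the single-adapter hardware would be about half of 19.6 GB/s.
implied_tree_baseline = SINGLE_HCA_GB_PER_S / 2

print(f"speed-up with four HCAs: {speedup:.2f}x")
print(f"scaling efficiency: {scaling_efficiency:.1%}")
print(f"implied tree-based baseline: {implied_tree_baseline:.1f} GB/s")
```

Under these assumptions, quadrupling the adapters gives about 3.59 times the bandwidth, or roughly 90% of linear scaling.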

For more densely populated GPU-based systems, like Nvidia DGX-2, which houses 16 Nvidia V100 Tensor Core GPUs with NVLink in each system node, the in-network capabilities and available bidirectional bandwidth of the Mellanox fabric can be fully leveraged.

“Our long-standing collaboration with Nvidia has again delivered a robust solution that takes full advantage of the best-of-breed capabilities from Mellanox InfiniBand, including GPUDirect RDMA and now extending in-network computing to NCCL, which delivers two times better performance for AI,” said Gilad Shainer, vice president of Marketing at Mellanox Technologies. “HDR InfiniBand in-network computing acceleration engines, including the SHARP technology, provide the highest performance and scalability for HPC and AI workloads.”

“Mellanox solutions amplify Nvidia’s unmatched CUDA-X acceleration libraries using NCCL, our open source collective communication library,” said Ian Buck, vice president and general manager of Accelerated Computing at Nvidia. “Together, we offer solutions that ensure the most demanding AI applications in the data centre benefit from cutting-edge performance and scaling efficiency.”
