Mellanox HDR 200G InfiniBand demonstrates two times higher performance for AI platforms with Nvidia

Gilad Shainer of Mellanox Technologies

Mellanox Technologies, Ltd., a supplier of high-performance, end-to-end smart interconnect solutions for data centre servers and storage systems, announced that its HDR 200G InfiniBand with the “Scalable Hierarchical Aggregation and Reduction Protocol” (SHARP) technology has set new performance records, doubling the performance of deep learning operations.

The combination of Mellanox In-Network Computing SHARP with Nvidia V100 Tensor Core GPU technology and the Nvidia Collective Communications Library (NCCL) delivers leading efficiency and scalability to deep learning and artificial intelligence applications.

Using state-of-the-art Nvidia GPUs, Mellanox’s InfiniBand, GPUDirect RDMA and NCCL together to train neural networks has already become a de facto standard when scaling out deep learning frameworks such as Caffe, Caffe2, Chainer, MXNet, TensorFlow, and PyTorch. With the Mellanox SHARP technology and HDR InfiniBand, the data aggregation operations of deep learning training can be offloaded to and accelerated by the InfiniBand network, doubling their performance.
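The data aggregation operation in question is typically an allreduce: every worker contributes a gradient vector, and every worker receives the elementwise sum. The following is a minimal sketch of those semantics only, in plain Python. The function name `allreduce_sum` and the sample values are hypothetical, and the sketch does not use NCCL or SHARP or model how either performs the reduction on real hardware.

```python
from typing import List


def allreduce_sum(worker_grads: List[List[float]]) -> List[List[float]]:
    """Return what each worker holds after a sum allreduce.

    Hypothetical illustration: a single aggregation point sums the
    contributions and sends the total back to every worker.
    """
    if not worker_grads:
        return []
    length = len(worker_grads[0])
    if any(len(g) != length for g in worker_grads):
        raise ValueError("all workers must contribute vectors of equal length")

    # Aggregation step: elementwise sum across all workers.
    total = [sum(values) for values in zip(*worker_grads)]

    # Distribution step: every worker receives its own copy of the result.
    return [list(total) for _ in worker_grads]


# Four workers, standing in for the four hosts in the test setup.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
result = allreduce_sum(grads)
print(result[0])
```

In this sketch the sum is computed once at one place and then fanned out, which is loosely the idea behind moving the reduction into the network rather than passing partial sums between hosts.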

In joint testing with Nvidia at Mellanox’s performance labs, a Mellanox HDR InfiniBand Quantum switch connected four system hosts, each with eight Nvidia V100 Tensor Core GPUs with NVLink interconnect technology and a single ConnectX-6 HDR adapter. This setup achieved an effective reduction bandwidth of 19.6GB/s by integrating SHARP’s native streaming aggregation capability with Nvidia’s latest NCCL 2.4 library, which now takes full advantage of the bi-directional bandwidth available from the Mellanox interconnect.

This is effectively twice the bandwidth of Nvidia’s current tree-based implementation on the same hardware configuration.

In the more common setup for this configuration, four host channel adapters (HCAs) are used in each system host for balanced performance across a variety of workloads. With that setup, the initial SHARP and NCCL results yielded an expected 70.3GB/s.
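To put the reported figures in context, they can be compared with the nominal HDR line rate of 200Gb/s per adapter, which is 25GB/s. The sketch below does that arithmetic. Using the unidirectional nominal rate as the reference is an assumption made here for illustration; the announcement does not say how "effective reduction bandwidth" is defined or measured, and the helper names are hypothetical.

```python
HDR_GBIT_PER_S = 200.0  # nominal HDR InfiniBand rate per port ("HDR 200G")


def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbit_per_s / 8.0


def fraction_of_nominal(measured_gbyte_per_s: float, num_adapters: int) -> float:
    """Measured bandwidth as a fraction of the combined nominal rate.

    Assumption: the reference is the one-direction nominal rate of each
    adapter, summed over the adapters in a host.
    """
    nominal = gbit_to_gbyte(HDR_GBIT_PER_S) * num_adapters
    return measured_gbyte_per_s / nominal


single_adapter = fraction_of_nominal(19.6, num_adapters=1)  # reported result
four_adapters = fraction_of_nominal(70.3, num_adapters=4)   # reported result
scaling = 70.3 / 19.6  # gain from going from one adapter to four

print(f"one adapter:   {single_adapter:.1%} of 25 GB/s")
print(f"four adapters: {four_adapters:.1%} of 100 GB/s")
print(f"scaling from one to four adapters: {scaling:.2f}x")
```

Under that assumption, the single-adapter result sits at roughly 78% of the nominal rate and the four-adapter result at roughly 70%, with a scaling factor of about 3.6 from one adapter to four.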

For more densely populated GPU-based systems, like Nvidia DGX-2, which houses 16 Nvidia V100 Tensor Core GPUs with NVLink in each system node, the in-network capabilities and available bidirectional bandwidth of the Mellanox fabric can be fully leveraged.

“Our long-standing collaboration with Nvidia has again delivered a robust solution that takes full advantage of the best-of-breed capabilities from Mellanox InfiniBand, including GPUDirect RDMA and now extending in-network computing to NCCL, which delivers two times better performance for AI,” said Gilad Shainer, vice president of Marketing at Mellanox Technologies. “HDR InfiniBand in-network computing acceleration engines, including the SHARP technology, provide the highest performance and scalability for HPC and AI workloads.”

“Mellanox solutions amplify Nvidia’s unmatched CUDA-X acceleration libraries using NCCL, our open source collective communication library,” said Ian Buck, vice president and general manager of Accelerated Computing at Nvidia. “Together, we offer solutions that ensure the most demanding AI applications in the data centre benefit from cutting-edge performance and scaling efficiency.”
