Performance Index

ID: 615781
Date: 09/24/2024
Classification: Public

Intel® Gaudi® AI Accelerator

Series: Intel® Gaudi® 3 accelerator
Presented: Launch event, 24th Sept 2024
Use Case: Inference, LLaMA 3 8B
Claim: ~1.09X inference throughput, LLaMA 3 8B
Processor: Intel Gaudi 3 AI accelerator vs. H100
Systems Measured: 1x Intel Gaudi 3 AI accelerator (128GB HBM), 1x Nvidia H100 GPU (80GB HBM)
Measurement: Data source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md; input-output sequences: 128-2048, tps on 1 accelerator/GPU
Measurement Period: Intel measured results obtained on September 9th 2024
Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details

Series: Intel® Gaudi® 3 accelerator
Presented: Launch event, 24th Sept 2024
Use Case: Inference, LLaMA 3 8B
Claim: 1.8X perf/$ inference throughput, LLaMA 3 8B
Processor: Intel Gaudi 3 AI accelerator vs. H100
Systems Measured: 1x Intel Gaudi 3 AI accelerator (128GB HBM), 1x Nvidia H100 GPU (80GB HBM)
Measurement: Data source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md; input-output sequences: 128-2048, tps on 1 accelerator/GPU
Measurement Period: Intel measured results obtained on September 9th 2024
Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details

Series: Intel® Gaudi® 3 accelerator
Presented: Launch event, 24th Sept 2024
Use Case: Inference, LLaMA 2 70B
Claim: 1.19X inference throughput, LLaMA 2 70B
Processor: Intel Gaudi 3 AI accelerator vs. H100
Systems Measured: 2x Intel Gaudi 3 AI accelerator (128GB HBM), 2x Nvidia H100 GPU (80GB HBM)
Measurement: Data source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md; input-output sequences: 128-2048, tps on 2 accelerators/GPUs
Measurement Period: Intel measured results obtained on September 9th 2024
Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details

Series: Intel® Gaudi® 3 accelerator
Presented: Launch event, 24th Sept 2024
Use Case: Inference, LLaMA 2 70B
Claim: ~2X inference perf/$ throughput, LLaMA 2 70B
Processor: Intel Gaudi 3 AI accelerator vs. H100
Systems Measured: 2x Intel Gaudi 3 AI accelerator (128GB HBM), 2x Nvidia H100 GPU (80GB HBM)
Measurement: Data source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md; input-output sequences: 128-2048, tps on 2 accelerators/GPUs
Measurement Period: Intel measured results obtained on September 9th 2024
Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details
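
The perf/$ claims above can be related to the throughput claims with simple arithmetic. The sketch below assumes perf/$ means throughput divided by system price, which this page does not state; the page also gives no prices, so the code works only with ratios. The function names are hypothetical, and since several of the claimed figures are approximate ("~"), any derived value is a rough indication at best.

```python
def perf_per_dollar_ratio(throughput_ratio: float, price_ratio: float) -> float:
    """Relative perf/$ of system A vs. system B.

    throughput_ratio: A's throughput divided by B's throughput.
    price_ratio: A's price divided by B's price.
    Assumes perf/$ = throughput / price (an assumption, not from the source).
    """
    if throughput_ratio <= 0 or price_ratio <= 0:
        raise ValueError("ratios must be positive")
    return throughput_ratio / price_ratio


def implied_price_ratio(throughput_ratio: float, perf_per_dollar: float) -> float:
    """Price ratio (A/B) implied by a throughput ratio and a perf/$ ratio."""
    if throughput_ratio <= 0 or perf_per_dollar <= 0:
        raise ValueError("ratios must be positive")
    return throughput_ratio / perf_per_dollar


# Claimed figures from this page: (throughput ratio, perf/$ ratio),
# both expressed as Gaudi 3 relative to H100.
claims = {
    "LLaMA 3 8B": (1.09, 1.8),
    "LLaMA 2 70B": (1.19, 2.0),
}

for model, (tput, ppd) in claims.items():
    ratio = implied_price_ratio(tput, ppd)
    print(f"{model}: implied Gaudi 3 / H100 price ratio is roughly {ratio:.2f}")
```

Under that assumption, both pairs of claims imply a similar price ratio, which is a quick consistency check on the published figures rather than a statement about actual pricing.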