Performance Index

ID 615781
Date 11/28/2022

Intel® Data Center GPU Flex Series

Claim GPU(s) Configuration Measurement Period
Claim: Single Intel® Data Center GPU Flex 170 can achieve 28 streams of Knives Out at 1920x1080 60fps, 2 GPUs can scale up to 56 streams based on server configurations.
GPU(s): Intel® Data Center GPU Flex 170 x 2
Configuration: 2x Intel® Xeon® Platinum 8360Y, 1024 GB DDR4 3200, 2x Intel® Data Center GPU Flex 170 (ECC disabled), Ubuntu 20.04, kernel 5.10
Knives Out: Version: 1.271.479300 (with online resource update)
Test scenario: offline training
Res/Game Quality/Render FPS/Encode Codec/Encode FPS: 1080p/high/60/H.264/30
Test duration is 150 sec. The test takes render FPS samples every 60 frames and encode FPS samples every 30 frames. Results are reported with FPS avg and FPS QoS.
Single GPU 28 streams result summary: Render FPS Avg: 59.94; percentage Render FPS >= 54: 100%; percentage Render FPS >= 48: 100%; Encode FPS Avg: 28.92
2 GPU 56 streams result summary: Render FPS Avg: 59.81; percentage Render FPS >= 54: 99.81%; percentage Render FPS >= 48: 100%; Encode FPS Avg: 28.95
GPU temperature: 65C
Measurement Period: 8/20/2022
Claim: Single Intel® Data Center GPU Flex 170 can achieve 60 streams of Knives Out at 1280x720 30fps, 2 GPUs can scale up to 120 streams based on server configurations.
GPU(s): Intel® Data Center GPU Flex 170 x 2
Configuration: 2x Intel® Xeon® Platinum 8360Y, 1024 GB DDR4 3200, 2x Intel® Data Center GPU Flex 170 (ECC disabled), Ubuntu 20.04, kernel 5.10
Knives Out: Version: 1.271.479300 (with online resource update)
Test scenario: offline training
Res/Game Quality/Render FPS/Encode Codec/Encode FPS: 720p/default/30/H.264/30
Test duration is 150 sec. The test takes render FPS samples every 60 frames and encode FPS samples every 30 frames. Results are reported with FPS avg and FPS QoS.
Single GPU 60 streams result summary: Render FPS Avg: 30; percentage Render FPS >= 27: 100%; percentage Render FPS >= 24: 100%; Encode FPS Avg: 29.68
2 GPU 120 streams result summary: Render FPS Avg: 29.6; percentage Render FPS >= 27: 97.92%; percentage Render FPS >= 24: 99.54%; Encode FPS Avg: 28.97
GPU temperature: 57C
Measurement Period: 8/20/2022
Claim: 4U server configuration housing 6 Intel® Data Center GPU Flex 140 GPUs can achieve up to 216 streams of Riptide GP Renegade at 720p30
GPU(s): Intel® Data Center GPU Flex 140 x 6
Configuration: 2x Intel® Xeon® Platinum 8360Y, 256 GB DDR4 3200MT/s, 6x Intel® Data Center GPU Flex 140 (ECC disabled), Ubuntu 20.04, kernel 5.10
Riptide GP Renegade: Version: demo 1.0
Test scenario: login auto play
Res/Render FPS/Encode Codec/Encode FPS: 720p/30/H.264/30
Test duration is 150 sec. The test takes render FPS samples every 60 frames and encode FPS samples every 30 frames. Results are reported with FPS avg and FPS QoS.
Result summary: Render FPS Avg: 29.65; percentage Render FPS >= 27: 96.35%; percentage Render FPS >= 24: 98.49%; Encode FPS Avg: 29.75
Note: GPU temperature: 50-70C
Measurement Period: 8/20/2022
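The multi-GPU figures in the rows above can be checked with simple arithmetic. The sketch below assumes linear scaling across identical GPUs, which is what the quoted totals imply; the helper names are illustrative and not part of any Intel tool.

```python
# Stream counts are copied from the claims above; function names are hypothetical.

def scaled_streams(streams_per_gpu: int, gpu_count: int) -> int:
    """Total streams if every GPU sustains the single-GPU stream count."""
    return streams_per_gpu * gpu_count


def streams_per_card(total_streams: int, card_count: int) -> float:
    """Average streams per card for a multi-card server result."""
    return total_streams / card_count


print(scaled_streams(28, 2))     # Knives Out 1080p60 on 2x Flex 170 -> 56
print(scaled_streams(60, 2))     # Knives Out 720p30 on 2x Flex 170 -> 120
print(streams_per_card(216, 6))  # Riptide GP Renegade on 6x Flex 140 -> 36.0
```

The 6-card result works out to 36 streams per Flex 140 card, which is lower than the 40 streams this table reports for Riptide GP Renegade on a single Flex 140 card in a different server configuration.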
Claim: Intel® Data Center GPU Flex 170 single card can support up to:
Honor of Kings 720p60 H264: 48 streams
Honor of Kings 720p30 H264: 68 streams
Asphalt9: Legends 720p30 H264: 29 streams
Asphalt9: Legends 1080p30 H264: 23 streams
KnivesOut 720p30 H264: 60 streams
KnivesOut 1080p60 H264: 28 streams
Riptide GP Renegade 720p30 H264: 60 streams
GPU(s): Intel® Data Center GPU Flex 170
Configuration:
Configuration 1: 2x Intel® Xeon® Platinum 8360Y, 256 GB DDR4 3200, 1x Intel® Data Center GPU Flex 170 (ECC disabled), Ubuntu 20.04, kernel 5.10
Honor of Kings: Version: 3.74.1.44
Test scenario: Battle watch
Res/Render FPS/Encode Codec/Encode FPS (1): 720p/60/H.264/30
Knives Out: Version: 1.271.479300 (with online resource update)
Test scenario: offline training
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (1): 720p/default/30/H.264/30
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (2): 1080p/high/60/H.264/30
Riptide GP Renegade: Version: demo 1.0
Test scenario: login auto play
Res/Render FPS/Encode Codec/Encode FPS: 720p/30/H.264/30
Configuration 2: 2x Intel® Xeon® Platinum 8358, 512 GB DDR4 3200, 1x Intel® Data Center GPU Flex 170, Ubuntu 20.04, kernel 5.10
Honor of Kings: Version: 3.74.1.44
Test scenario: Battle watch
Res/Render FPS/Encode Codec/Encode FPS (2): 720p/30/H.264/30
Asphalt9: Legends: Version: 3.5.2a
Test scenario: Race
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (1): 720p/default/30/H.264/30
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (2): 1080p/HD/30/H.264/30
Test duration is 150 sec. The test takes render FPS samples every 60 frames and encode FPS samples every 30 frames. Results are reported with FPS avg and FPS QoS.
Result Summary:
Honor of Kings 720p60 H264 48 streams: Render FPS Avg: 59.37; percentage Render FPS >= 54: 97.62%; percentage Render FPS >= 48: 99.27%; Encode FPS Avg: 29.99
Honor of Kings 720p30 H264 68 streams: Render FPS Avg: 29.57; percentage Render FPS >= 27: 98.04%; percentage Render FPS >= 24: 99.96%; Encode FPS Avg: 29.89
Asphalt9: Legends 720p30 H264 29 streams: Render FPS Avg: 29.38; percentage Render FPS >= 27: 95.71%; percentage Render FPS >= 24: 99.41%; Encode FPS Avg: 29.94
Asphalt9: Legends 1080p30 H264 23 streams: Render FPS Avg: 29.97; percentage Render FPS >= 27: 100%; percentage Render FPS >= 24: 100%; Encode FPS Avg: 29.96
KnivesOut 720p30 H264 60 streams: Render FPS Avg: 30; percentage Render FPS >= 27: 100%; percentage Render FPS >= 24: 100%; Encode FPS Avg: 29.68
KnivesOut 1080p60 H264 28 streams: Render FPS Avg: 59.94; percentage Render FPS >= 54: 100%; percentage Render FPS >= 48: 100%; Encode FPS Avg: 28.92
Riptide GP Renegade 720p30 H264 60 streams: Render FPS Avg: 29.98; percentage Render FPS >= 27: 99.67%; percentage Render FPS >= 24: 99.87%; Encode FPS Avg: 28.86
GPU temperature: 43-65C
Measurement Period: 8/20/2022
Claim: Intel® Data Center GPU Flex 140 single card can support up to:
Honor of Kings 720p60 H264: 24 streams
Honor of Kings 720p30 H264: 40 streams
Asphalt9: Legends 720p30 H264: 20 streams
Asphalt9: Legends 1080p30 H264: 12 streams
KnivesOut 720p30 H264: 46 streams
KnivesOut 1080p60 H264: 14 streams
Riptide GP Renegade 720p30 H264: 40 streams
GPU(s): Intel® Data Center GPU Flex 140
Configuration: 2x Intel® Xeon® Gold 6336Y, 256 GB DDR4 3200, 1x Intel® Data Center GPU Flex 140 (ECC disabled), Ubuntu 20.04, kernel 5.10
Honor of Kings: Version: 3.74.1.44
Test scenario: Battle watch
Res/Render FPS/Encode Codec/Encode FPS (1): 720p/60/H.264/30
Res/Render FPS/Encode Codec/Encode FPS (2): 720p/30/H.264/30
Knives Out: Version: 1.271.479300 (with online resource update)
Test scenario: offline training
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (1): 720p/default/30/H.264/30
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (2): 1080p/high/60/H.264/30
Asphalt9: Legends: Version: 3.5.2a
Test scenario: Race
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (1): 720p/default/30/H.264/30
Res/Game Quality/Render FPS/Encode Codec/Encode FPS (2): 1080p/HD/30/H.264/30
Riptide GP Renegade: Version: demo 1.0
Test scenario: login auto play
Res/Render FPS/Encode Codec/Encode FPS: 720p/30/H.264/30
Test duration is 150 sec. The test takes render FPS samples every 60 frames and encode FPS samples every 30 frames. Results are reported with FPS avg and FPS QoS.
Result Summary:
Honor of Kings 720p60 H264 24 streams: Render FPS Avg: 59.21; percentage Render FPS >= 54: 96.41%; percentage Render FPS >= 48: 99.35%; Encode FPS Avg: 29.95
Honor of Kings 720p30 H264 40 streams: Render FPS Avg: 29.6; percentage Render FPS >= 27: 96.91%; percentage Render FPS >= 24: 99.55%; Encode FPS Avg: 29.94
Asphalt9: Legends 720p30 H264 20 streams: Render FPS Avg: 29.33; percentage Render FPS >= 27: 96.51%; percentage Render FPS >= 24: 99.86%; Encode FPS Avg: 29.88
Asphalt9: Legends 1080p30 H264 12 streams: Render FPS Avg: 29.69; percentage Render FPS >= 27: 97.49%; percentage Render FPS >= 24: 100%; Encode FPS Avg: 29.86
KnivesOut 720p30 H264 46 streams: Render FPS Avg: 29.5; percentage Render FPS >= 27: 96.23%; percentage Render FPS >= 24: 99.9%; Encode FPS Avg: 29.9
KnivesOut 1080p60 H264 14 streams: Render FPS Avg: 59.29; percentage Render FPS >= 54: 99.61%; percentage Render FPS >= 48: 100%; Encode FPS Avg: 29.94
Riptide GP Renegade 720p30 H264 40 streams: Render FPS Avg: 29.64; percentage Render FPS >= 27: 97.37%; percentage Render FPS >= 24: 99.41%; Encode FPS Avg: 28.6
GPU temperature: 48-53C
Measurement Period: 8/20/2022
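The cloud gaming rows above report an FPS average plus two "percentage Render FPS >= N" figures. The thresholds used (54 and 48 for a 60 FPS target, 27 and 24 for a 30 FPS target) correspond to 90% and 80% of the target frame rate. The following is a minimal sketch of how such a summary could be computed from a list of FPS samples. It assumes that reading of the thresholds, uses made-up sample values, and is not the test harness Intel used.

```python
from statistics import mean


def fps_summary(samples, target_fps):
    """Return (average FPS, % of samples >= 90% of target, % of samples >= 80% of target).

    The 90%/80% thresholds are inferred from the table (54/48 at 60 FPS, 27/24 at 30 FPS).
    """
    high = target_fps * 9 / 10
    low = target_fps * 8 / 10
    pct_high = 100 * sum(s >= high for s in samples) / len(samples)
    pct_low = 100 * sum(s >= low for s in samples) / len(samples)
    return mean(samples), pct_high, pct_low


# Hypothetical render FPS samples for one stream with a 60 FPS target.
samples = [60.0, 60.0, 54.0, 50.0]
avg, pct_ge_54, pct_ge_48 = fps_summary(samples, 60)
print(avg, pct_ge_54, pct_ge_48)
```

Reporting both thresholds separates brief dips (samples that fall below 90% of target but stay above 80%) from deeper frame-rate drops.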
Claim: AV1 Bandwidth or bitrate savings of 30% over AVC in low delay encoding
GPU(s): Intel® Data Center GPU Flex 140
Configuration: 2x Intel® Xeon® 6336Y, 128GB DDR4 3200, 1x Intel Data Center GPU Flex 140, Ubuntu 20.04 Kernel: 5.10, intel-media-22.5.0
Quality claims are made on the basis of objective bitrate savings in ultra-low delay encoding scenarios, for Intel-selected high complexity content (4K video source is courtesy the "Wind" Collective on Artgrid.io).
Subjective quality of AV1 low delay HW using "quality mode" encoding meets or exceeds HW low delay AVC encodes at 42% lower bitrate (15 Mbps AV1 vs 10.5 Mbps AVC; both collected without B-frames and with restricted buffering); subjective assessments are provided in the video for third-party inspection and validated by Intel.
Across a larger, standardized set of video sequences (VBENCH sequences, details at http://arcade.cs.columbia.edu/vbench), AV1 low delay averages 33% BDRATE PSNR-YUV gains compared to x264-medium (tune low latency, 1-pass, 4-threads) and 18% BDRATE PSNR-YUV compared to AVC HW acceleration.
More details at https://dgpu-docs.intel.com/devices/iris-xe-max-graphics/guides/media.html.
Measurement Period: 8/20/2022
Claim: AV1 Bandwidth or bitrate savings of up to 50% over AVC as shown in the Wild Animals Sequence (Video Demo Claim)
GPU(s): Intel® Data Center GPU Flex 140
Configuration: 2x Intel® Xeon® 6336Y, 128GB DDR4 3200, 1x Intel Data Center GPU Flex 140, Ubuntu 20.04 Kernel: 5.10, intel-media-22.5.0
Quality claims are made on the basis of objective bitrate savings in ultra-low delay encoding scenarios, for Intel-selected high complexity content (4K video source is courtesy the "Wind" Collective on Artgrid.io).
Subjective quality of AV1 low delay HW using "quality mode" encoding meets or exceeds HW low delay AVC encodes at 42% lower bitrate (15 Mbps AV1 vs 10.5 Mbps AVC; both collected without B-frames and with restricted buffering); subjective assessments are provided in the video for third-party inspection and validated by Intel.
Across a larger, standardized set of video sequences (VBENCH sequences, details at http://arcade.cs.columbia.edu/vbench), AV1 low delay averages 33% BDRATE PSNR-YUV gains compared to x264-medium (tune low latency, 1-pass, 4-threads) and 18% BDRATE PSNR-YUV compared to AVC HW acceleration.
More details at https://dgpu-docs.intel.com/devices/iris-xe-max-graphics/guides/media.html.
Measurement Period: 8/20/2022
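The two bitrates quoted in the AV1 rows above, 15 Mbps and 10.5 Mbps, can be compared in two ways, and the choice of baseline changes the percentage. The sketch below only works through that arithmetic; it does not settle which baseline the source intended, and the function names are illustrative.

```python
def percent_lower(higher_mbps: float, lower_mbps: float) -> float:
    """Savings relative to the higher bitrate: how much lower the smaller value is."""
    return 100 * (higher_mbps - lower_mbps) / higher_mbps


def percent_higher(higher_mbps: float, lower_mbps: float) -> float:
    """Overhead relative to the lower bitrate: how much higher the larger value is."""
    return 100 * (higher_mbps - lower_mbps) / lower_mbps


print(percent_lower(15.0, 10.5))             # 30.0
print(round(percent_higher(15.0, 10.5), 1))  # 42.9
```

Measured against the higher bitrate, the gap is 30%, which matches the headline "30% savings" claim. Measured against the lower bitrate, it is about 42.9%, which is close to the 42% figure in the configuration text.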
Claim: 8 4K60 streams per card; Up to 36 streams per card (1080p60 HEVC/AV1)
GPU(s): Intel® Data Center GPU Flex Series 140
Configuration: 2x Intel® Xeon® 6336Y, 128GB DDR4 3200, 1x Intel Data Center GPU Flex 140 (ECC Enabled), Ubuntu 20.04 Kernel: 5.10, intel-media-22.5.0, pre-production FFMPEG
For quality and performance measurement best practices on Intel Data Center GPUs, please see FFMPEG command lines and scripts at https://github.com/intel/media-delivery. We measure and report using pre-release versions of these tools, which have been expanded to include AV1 encode support. More details at https://dgpu-docs.intel.com/devices/iris-xe-max-graphics/guides/media.html.
Results are reported for 2-second delay (random access) transcode including B-frames where applicable. Low delay performance is similar to or better than claims. Results are rounded down to the nearest whole integer of streams per socket.
GPU frequency will vary on production silicon. GPU frequency is sensitive to thermal environment and workload. Data is reported on single card configurations with recommended airflow and cooling rates.
8 4K60 HEVC concurrent streams in performance mode
36 1080p60 HEVC concurrent streams in performance mode
GPU temperature < 50C
Measurement Period: 8/20/2022
Claim: 5X media transcode throughput at half the power (Intel Flex 140 compared to competition NVIDIA A10 - HEVC 1080p60)
GPU(s): Intel® Data Center GPU Flex Series 140
Configuration:
Intel Configuration: 2x Intel® Xeon® 6336Y, 1024GB DDR4 3200, 1x Intel Data Center GPU Flex 140 (ECC Enabled), Ubuntu 20.04 Kernel: 5.10, intel-media-22.5.0, pre-production FFMPEG
NVIDIA Configuration: 2x Intel® Xeon® 6336Y, 128GB DDR4 3200, 1x NVIDIA A10 (ECC Enabled), Ubuntu 20.04 Kernel: 5.10, FFMPEG N-107771-gf6a36c7cf9
For quality and performance measurement best practices on Intel Data Center GPUs, please see FFMPEG command lines and scripts at https://github.com/intel/media-delivery. We measure and report using pre-release versions of these tools, which have been expanded to include AV1 encode support. More details at https://dgpu-docs.intel.com/devices/iris-xe-max-graphics/guides/media.html.
This data is obtained using 2-second delay (random access) encoding profiles including B-frames where applicable. Concurrent sessions' average fps used for performance.
GPU temperature < 50C
NVIDIA A10 supports 7 streams of HEVC 1080p60 transcode in performance mode (as measured by Intel, using ffmpeg NVEnc, -preset fast, vbr, -bf 3 -b_ref_mode 2 -vsync 0).
Measurement Period: 8/20/2022
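The "5X" figure can be sanity-checked against stream counts given in this table: 36 concurrent HEVC 1080p60 streams per Flex 140 card (from the per-card transcode claim) and 7 streams measured on the NVIDIA A10. The sketch below is plain arithmetic on those two numbers; pairing them this way is an assumption about how the headline ratio was derived.

```python
# Both counts are copied from this table; pairing them is an assumption.
FLEX_140_HEVC_1080P60_STREAMS = 36
NVIDIA_A10_HEVC_1080P60_STREAMS = 7

throughput_ratio = FLEX_140_HEVC_1080P60_STREAMS / NVIDIA_A10_HEVC_1080P60_STREAMS
print(round(throughput_ratio, 2))  # 5.14
```

The ratio is slightly above 5, consistent with a headline figure rounded down to "5X".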
Claim: 2X decode throughput at half the power (Intel® Data Center GPU Flex 140 compared to competition NVIDIA A10 across HEVC, AV1, AVC, VP9)
Details:
Up to 208 streams 8-bit HEVC 1080p30 decode per card
Up to 168 streams 8-bit AVC 1080p30 decode per card
Up to 218 streams 8-bit AV1 1080p30 decode per card
Up to 228 streams 8-bit VP9 1080p30 decode per card
GPU(s): Intel® Data Center GPU Flex Series 140
Configuration:
Intel Configuration: 2x Intel® Xeon® 6336Y, 1024GB DDR4 3200, 1x Intel Data Center GPU Flex 140 (ECC Enabled), Ubuntu 20.04 Kernel: 5.10, intel-media-22.5.0, pre-production FFMPEG
NVIDIA A10 results from https://developer.nvidia.com/nvidia-video-codec-sdk as of 8/20/2022. Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
For quality and performance measurement best practices on Intel Data Center GPUs, please see FFMPEG command lines and scripts at https://github.com/intel/media-delivery. We measure and report using pre-release versions of these tools, which have been expanded to include AV1 encode support. More details at https://dgpu-docs.intel.com/devices/iris-xe-max-graphics/guides/media.html.
Concurrent sessions' average fps used for performance reporting.
Intel Data Center GPU Flex Series 140:
208 streams 8-bit HEVC 1080p30 decode
168 streams 8-bit AVC 1080p30 decode
218 streams 8-bit AV1 1080p30 decode
228 streams 8-bit VP9 1080p30 decode
GPU temperature < 50C
NVIDIA reported A10 decode density on https://developer.nvidia.com/nvidia-video-codec-sdk:
81 streams 8-bit HEVC 1080p30 decode
37 streams 8-bit AVC 1080p30 decode
49 streams 8-bit AV1 1080p30 decode
66 streams 8-bit VP9 1080p30 decode
Note: GPU temperature < 50C
Measurement Period: 8/20/2022
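The per-codec decode ratios behind the "2X" claim can be computed directly from the stream counts listed above. This is a minimal sketch using only those numbers; the dictionary and variable names are illustrative.

```python
# 8-bit 1080p30 decode streams per card, copied from the row above.
flex_140_streams = {"HEVC": 208, "AVC": 168, "AV1": 218, "VP9": 228}
nvidia_a10_streams = {"HEVC": 81, "AVC": 37, "AV1": 49, "VP9": 66}

# Ratio of Flex 140 streams to A10 streams for each codec.
decode_ratios = {
    codec: flex_140_streams[codec] / nvidia_a10_streams[codec]
    for codec in flex_140_streams
}

for codec, ratio in decode_ratios.items():
    print(f"{codec}: {ratio:.2f}x")

# The codec with the smallest advantage bounds the headline claim.
weakest_codec = min(decode_ratios, key=decode_ratios.get)
print(weakest_codec, round(decode_ratios[weakest_codec], 2))
```

The smallest ratio is for HEVC (about 2.57x), so every listed codec exceeds the 2X headline figure on these numbers.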
Claim: Up to 360 streams HEVC-HEVC 1080p60 Transcode (in 10 cards in a 4U server config)
Details: 10 Intel® Data Center GPU Flex 140 cards hosted on Super Micro Server Mobo 4U server can achieve 360 streams HEVC-HEVC 1080p60 Transcode
GPU(s): Intel® Data Center GPU Flex Series 140 X 10
Configuration: 2x Intel® Xeon® 8360Y, 1024GB DDR4 3200, 10x Intel Data Center GPU Flex 140 (ECC Enabled), Ubuntu 20.04 Kernel: 5.10, intel-media-22.5.0, pre-production FFMPEG
For quality and performance measurement best practices on Intel Data Center GPUs, please see FFMPEG command lines and scripts at https://github.com/intel/media-delivery. We measure and report using pre-release versions of these tools, which have been expanded to include AV1 encode support. More details at https://dgpu-docs.intel.com/devices/iris-xe-max-graphics/guides/media.html.
Concurrent sessions' average fps used for performance reporting.
10 Intel® Data Center GPU Flex 140 cards hosted on Super Micro 4U server can achieve 360 streams HEVC-HEVC 1080p60 Transcode in performance mode
GPU temperature < 50C
Measurement Period: 8/20/2022
Claim: Intel® Deep Link Hyper Encode Transcode Performance: Hyperencode can achieve one 8K60 HDR AV1 transcode throughput per card
GPU(s): Intel® Data Center GPU Flex Series 140
Configuration: 2x Intel® Xeon® 6336Y, 1024GB DDR4 3200, 1x Intel Data Center GPU Flex 140 (ECC Enabled), Ubuntu 20.04 Kernel: 5.10, intel-media-22.5.0, using pre-production Sample_multi-transcode.exe
Measured using a pre-release version of the OneVPL sample, "Sample_Multi_Transcode.exe", on customer-supplied content. Available in September github.intel.com/goto/media-delivery.
Intel Data Center GPU Flex 140, using Hyperencode, can support one 8K60 AV1 stream in performance mode, with one GOP on each of two devices. GOP length of 30 is required to achieve peak latency of less than 1 second (0.5 second average). Data is reported as an average FPS over a long-duration run.
AV1->AV1, 10-bit: 62 fps
HEVC->HEVC, 10-bit: 60 fps
HEVC->AV1, 10-bit: 63 fps
GPU temperature < 50C
Measurement Period: 8/20/2022
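The latency figures in the Hyper Encode row line up with the GOP length: at 60 fps, a 30-frame GOP covers half a second of video. The sketch below shows that arithmetic. Treating two in-flight GOPs (one per device) as the bound on peak latency is an assumption made here for illustration, not a description of how Hyper Encode schedules work.

```python
def gop_duration_seconds(gop_length_frames: int, fps: float) -> float:
    """Seconds of video covered by one GOP at the given frame rate."""
    return gop_length_frames / fps


one_gop = gop_duration_seconds(30, 60)  # GOP length 30 at 8K60
two_gops = 2 * one_gop                  # assumed: one GOP on each of two devices

print(one_gop)   # 0.5
print(two_gops)  # 1.0
```

Under that assumption, the 0.5 second average matches one GOP of video and the 1 second peak bound matches two.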