Benchmark

Understanding STREAM: Benchmarking Memory Bandwidth for DRAM and CXL
In today’s Artificial Intelligence (AI), Machine Learning (ML), and high-performance computing (HPC) landscape, memory bandwidth is a critical factor in determining overall system performance. As workloads grow increasingly data-intensive, traditional DRAM-only setups are often insufficient, prompting the rise of new memory expansion technologies like Compute Express Link (CXL). To evaluate memory bandwidth across DRAM and CXL devices, we use a modified industry-standard tool called STREAM.
In this blog, we’ll explore what STREAM is, how it works, why it’s commonly used for benchmarking memory bandwidth, and how a modified version of STREAM can be used to measure performance in heterogeneous memory environments, including DRAM and CXL.
Read More
Benchmarking GPUs: Measuring Throughput between CPU and GPU
This article was inspired by a LinkedIn post by Dennis Kennetz . The CPU to GPU bandwidth check is available on GitHub which uses a specific flow to assess the data transfer rates. Like many in the industry, my focus is on AI and ML workloads and how we can improve efficiencies and performance using DRAM, CXL, CPU, GPUs, and software improvements.
In the rapidly evolving landscape of artificial intelligence (AI) and machine learning (ML), the ability to process vast amounts of data efficiently is paramount. As AI models grow in complexity and size, the demand for high-performance computing resources intensifies. At the heart of this demand lies the crucial task of optimizing data transfers between various components of a computing system, particularly from DRAM, CPU, and emerging technologies like CXL (Compute Express Link) to and from the GPU.
Read More