GPU

Linux 7.2 Seeds "Blackwell-Next": A Deep Dive into the nvgrace-gpu VFIO CXL DVSEC Change

Linux 7.2’s VFIO pull request included a commit with a codename I hadn’t seen before: Blackwell-Next. A Phoronix post, “Linux 7.2 Begins Making Preparations For NVIDIA ‘Blackwell-Next’”, brought it to my attention. On the face of it, the commit looks like a minor prep patch. It is, but it’s also a clean window into where NVIDIA is taking its CPU-coherent GPU stack, how CXL is quietly becoming the standard signaling interface for next-generation accelerators, and what that means if you’re building infrastructure or tooling on top of these platforms.

Benchmarking GPUs: Measuring Throughput between CPU and GPU

This article was inspired by a LinkedIn post by Dennis Kennetz. The CPU-to-GPU bandwidth check, available on GitHub, uses a specific flow to assess data transfer rates. Like many in the industry, I focus on AI and ML workloads and on how DRAM, CXL, CPUs, GPUs, and software improvements can raise efficiency and performance.

In the rapidly evolving landscape of artificial intelligence (AI) and machine learning (ML), the ability to process vast amounts of data efficiently is paramount. As AI models grow in complexity and size, the demand for high-performance computing resources intensifies. At the heart of this demand lies the crucial task of optimizing data transfers between the components of a computing system, particularly between the GPU and the CPU, DRAM, and emerging technologies like CXL (Compute Express Link).
