CXL

How to Build acpidump from Source and use it to Debug Complex CXL and PCI Issues

This article is a detailed guide on how to build the latest version of the acpidump tool from its source code. While many Linux distributions, like Ubuntu, offer a packaged version of this utility, it’s often outdated. For developers and enthusiasts working with modern hardware features, particularly those related to Compute Express Link (CXL), having the most current version is essential.

Before you begin, it’s important to remove any old, conflicting versions of the tools. If you have previously installed the acpica-tools package from your distribution’s repository, you should remove it to prevent conflicts.

Is Your Application Really Using Persistent Memory? Here’s How to Tell.

Persistent memory (PMEM), especially when accessed via technologies like CXL, promises the best of both worlds: DRAM-like speed with the durability of an SSD. When you set up a filesystem like XFS or EXT4 in FSDAX (File System Direct Access) mode on a PMEM device, you’re paving a superhighway for your applications, allowing them to map files directly into their address space and bypass the kernel’s page cache entirely.

But here’s the crucial question: after all the setup and configuration, how do you prove that your application’s data is physically residing on the PMEM device and not just in regular RAM? I’ve run into this question myself, so I wrote a small Python script to get a definitive answer using SQLite3 as an example application. However, before we proceed with the script, let’s examine how you can verify this manually.
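One manual check is to look at the process's memory map and see which mapped files live under the PMEM mount point. The following is a minimal sketch of that idea, not the script mentioned above: the function name and the `/mnt/pmem` mount point are assumptions for illustration. It parses text in the `/proc/<pid>/maps` format and returns the mappings whose backing file sits under a given directory.

```python
from pathlib import PurePosixPath


def mappings_under(maps_text, mount_point):
    """Return (address_range, path) pairs for file mappings under mount_point.

    maps_text is the content of /proc/<pid>/maps; mount_point is the
    directory where the FSDAX filesystem is mounted (an assumption here).
    """
    mount = PurePosixPath(mount_point)
    hits = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = PurePosixPath(fields[5].strip())
        if path == mount or mount in path.parents:
            hits.append((fields[0], str(path)))
    return hits
```

For a live process you could call `mappings_under(open(f"/proc/{pid}/maps").read(), "/mnt/pmem")`. A hit shows that the file is mapped from that filesystem; it does not by itself prove the filesystem is mounted with DAX enabled, so check the mount options as well.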

CXL Memory NUMA Node Mapping with Sub-NUMA Clustering (SNC) on Linux

CXL (Compute Express Link) memory devices are revolutionizing server architectures, but they also introduce new NUMA complexity, especially when advanced memory configurations, such as Sub-NUMA Clustering (SNC), are enabled. One of the most confusing issues is the mismatch between NUMA node numbers reported by CXL sysfs attributes and those used by Linux memory management tools.

This blog post walks through a real-world scenario, complete with command outputs and diagrams, to help you understand and resolve the CXL NUMA node mapping issue with SNC enabled.

CXL Device & Fabric Buyer's Guide: A List of GA Components (2025)

Last Updated: June 27, 2025

This guide provides a curated list of generally available (GA) Compute Express Link (CXL) devices, fabric components, and memory appliances. It is a technical resource for engineers, architects, and hardware specialists looking to identify and compare CXL memory expansion modules, switches, and full system-level appliances from leading vendors. The tables below detail market-ready components, focusing on the specifications required to design and build CXL-enabled infrastructure.

CXL Server Buyer's Guide: A Complete List of GA Platforms (Updated 2025)

Last Updated: June 27, 2025

This quick reference guide provides a definitive, up-to-date list of generally available (GA) Compute Express Link (CXL) servers from major OEMs like Dell, HPE, Lenovo, and Supermicro. It is designed for data center architects, engineers, and IT decision-makers who need to identify and compare server platforms that support CXL 1.1 and CXL 2.0 for memory expansion and pooling. The tables below offer a direct comparison of server models, supported CPUs, CXL versions, and compatible CXL device form factors. The goal is to cut through the noise of announcements and roadmaps to provide a clear view of what you can deploy today.

Unlock Your CXL Memory: How to Switch from NUMA (System-RAM) to Direct Access (DAX) Mode

As a Linux System Administrator working with Compute Express Link (CXL) memory devices, you should be aware that, as of Linux Kernel 6.3, Type 3 CXL.mem devices are automatically brought online as memory-only NUMA nodes. While this is beneficial in most situations, it might not be ideal if your application is designed to manage the CXL memory directly as a DAX (Direct Access) device using mmap().

This blog post will explain this behavior and provide a step-by-step guide on how to convert a CXL memory device from a memory-only NUMA node back to DAX mode, allowing applications to mmap the underlying /dev/daxX.Y device. We’ll also cover troubleshooting steps if the memory is actively in use by the kernel or other processes.
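As a rough sketch of what such a conversion can look like, the snippet below builds one plausible command sequence using the `daxctl offline-memory` and `daxctl reconfigure-device --mode=devdax` subcommands. The device name `dax0.0` and the helper function are assumptions for illustration, and the exact steps in the full post may differ. The code only prints the commands; it does not run them.

```python
import shlex


def devdax_conversion_commands(device="dax0.0"):
    """Return the argv lists for converting a system-ram DAX device to devdax.

    The device name is a placeholder; list your own devices with `daxctl list`.
    """
    return [
        # Take the memory blocks offline so the kernel stops using them.
        ["daxctl", "offline-memory", device],
        # Switch the device from system-ram mode back to devdax mode.
        ["daxctl", "reconfigure-device", "--mode=devdax", device],
    ]


if __name__ == "__main__":
    for argv in devdax_conversion_commands():
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run on any machine. Offlining can fail while the memory is still in use, which is the troubleshooting case mentioned above.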

How I Created a Custom ChatGPT Trained on the CXL Specification Documents

If you’re working with Compute Express Link (CXL) and wish you had an AI assistant trained on all the different versions of the specification—1.0, 1.1, 2.0, 3.0, 3.1… you’re in luck.

Whether you’re a CXL device vendor, a firmware engineer, a Linux Kernel developer, a memory subsystem architect, a hardware validation engineer, or even an application developer working on CXL tools and utilities, chances are you’ve had to reference the CXL spec at some point. And if you have, you already know: these documents are dense, extremely technical, and constantly evolving.

Linux Kernel 6.14 is Released: This is What's New for Compute Express Link (CXL)

The Linux Kernel 6.14 release brings several improvements and additions related to Compute Express Link (CXL) technology.

Here is the detailed list of all commits merged into the 6.14 Kernel for CXL and DAX. This list was generated by the Linux Kernel CXL Feature Tracker.

Building NDCTL Utilities from Source: A Comprehensive Guide

Building NDCTL with Meson on Ubuntu 24.04

The NDCTL package includes the cxl, daxctl, and ndctl utilities. It uses the Meson build system for streamlined compilation. This guide reflects the modern build process for managing NVDIMMs, CXL, and PMEM on Ubuntu 24.04.

Compiling these utilities from source is not recommended unless you have installed a more recent Kernel than the one provided by the distro. If you have installed a mainline Kernel, you will likely need a newer version of the utilities that is compatible with it. See the NDCTL Releases page, which lists the Kernel support information.

Linux Kernel 6.13 is Released: This is What's New for Compute Express Link (CXL)

The Linux Kernel 6.13 release brings several improvements and additions related to Compute Express Link (CXL) technology.

Here is the detailed list of all commits merged into the 6.13 Kernel for CXL and DAX. This list was generated by the Linux Kernel CXL Feature Tracker.

CXL related changes from Kernel v6.12 to v6.13:

Understanding STREAM: Benchmarking Memory Bandwidth for DRAM and CXL

In today’s Artificial Intelligence (AI), Machine Learning (ML), and high-performance computing (HPC) landscape, memory bandwidth is a critical factor in determining overall system performance. As workloads grow increasingly data-intensive, traditional DRAM-only setups are often insufficient, prompting the rise of new memory expansion technologies like Compute Express Link (CXL). To evaluate memory bandwidth across DRAM and CXL devices, we use a modified version of the industry-standard STREAM benchmark.

In this blog, we’ll explore what STREAM is, how it works, why it’s commonly used for benchmarking memory bandwidth, and how a modified version of STREAM can be used to measure performance in heterogeneous memory environments, including DRAM and CXL.
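To give a feel for what STREAM measures, here is a minimal sketch of its Triad kernel, `a[i] = b[i] + scalar * c[i]`, with the bandwidth calculation STREAM uses for it: two reads and one write per element, so 24 bytes per iteration for 8-byte doubles. This is an illustration only, not the modified STREAM described here. The function names are made up for this sketch, and a pure-Python loop is dominated by interpreter overhead, so the number it reports says little about the hardware.

```python
import time
from array import array


def triad_bytes(n, itemsize=8):
    """Bytes STREAM counts for Triad: read b, read c, write a."""
    return 3 * n * itemsize


def triad_bandwidth(n=1_000_000, scalar=3.0):
    """Run one Triad pass and return (MB/s, result array)."""
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    a = array("d", [0.0]) * n

    start = time.perf_counter()
    for i in range(n):
        a[i] = b[i] + scalar * c[i]
    elapsed = time.perf_counter() - start

    # STREAM reports rates in units of 10^6 bytes per second.
    return triad_bytes(n, a.itemsize) / elapsed / 1e6, a
```

Calling `triad_bandwidth()` returns the measured rate and the result array. The real benchmark is written in C, repeats each kernel several times, and sizes its arrays well beyond the CPU caches so that main memory is what gets measured.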

How Much RAM Could a Vector Database Use If a Vector Database Could Use RAM

Featured image generated by ChatGPT 4o model: “a low poly woodchuck by a serene lake, surrounded by mountains and a forest with tree leaves made from DDR memory modules. The woodchuck is munching on a memory DIMM. The only memory DIMM in the image should be the one being eaten.”

Although the title is a pun on the famous “woodchuck rhyme,” the question is serious for LLM applications using vector databases. As large language models (LLMs) continue to evolve, leveraging vector databases to store and search embeddings is critical. Understanding the memory usage of these systems is essential for maintaining performance and response times and for ensuring system scalability.
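A useful starting point is the size of the raw embeddings alone: number of vectors times dimensions times bytes per value. The sketch below does that arithmetic. The function names and the example figures (one million 1536-dimensional float32 vectors) are assumptions chosen for illustration, and the result is a lower bound because it ignores index structures, metadata, and any other overhead the database adds.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Memory for the embeddings alone (float32 by default)."""
    return num_vectors * dimensions * bytes_per_value


def to_gib(num_bytes):
    """Convert bytes to gibibytes."""
    return num_bytes / 2**30


if __name__ == "__main__":
    size = raw_vector_bytes(1_000_000, 1536)
    print(f"{size} bytes = {to_gib(size):.2f} GiB")
```

For the example figures this comes to 6,144,000,000 bytes, or about 5.72 GiB, before any index overhead.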

Benchmarking GPUs: Measuring Throughput between CPU and GPU

This article was inspired by a LinkedIn post by Dennis Kennetz. The CPU to GPU bandwidth check, which is available on GitHub, uses a specific flow to assess data transfer rates. Like many in the industry, my focus is on AI and ML workloads and how we can improve efficiency and performance using DRAM, CXL, CPUs, GPUs, and software improvements.

In the rapidly evolving landscape of artificial intelligence (AI) and machine learning (ML), the ability to process vast amounts of data efficiently is paramount. As AI models grow in complexity and size, the demand for high-performance computing resources intensifies. At the heart of this demand lies the crucial task of optimizing data transfers between the components of a computing system, particularly between the GPU and DRAM, the CPU, and emerging technologies like CXL (Compute Express Link).

Linux Kernel 6.10 is Released: This is What's New for Compute Express Link (CXL)

The Linux Kernel 6.10 release brings several improvements and additions related to Compute Express Link (CXL) technology.

Here is the detailed list of all commits merged into the 6.10 Kernel for CXL and DAX. This list was generated by the Linux Kernel CXL Feature Tracker.

Linux Kernel 6.9 is Released: This is What's New for Compute Express Link (CXL)

The Linux Kernel 6.9 release brings several improvements and additions related to Compute Express Link (CXL) technology.

New Features

Here is a list of new features for CXL:

Here is the detailed list of all commits merged into the 6.9 Kernel for CXL and DAX. This list was generated by the Linux Kernel CXL Feature Tracker.

Linux Kernel CXL Feature Tracker

I’m always watching the Linux Kernel for new and exciting features that are merged for Compute Express Link (CXL). There are some great notes from the monthly developer meetup here, but the devil is always in the details, and not every commit is discussed in the meeting. So I wrote a simple Python script called cxl_feature_tracker.py that looks through all commits to the Linus Torvalds Linux Kernel GitHub repository and extracts any that mention “CXL” or “DAX”, or that make changes to the drivers/cxl or drivers/dax directories. The output is a very long list, but it has some gems among the fixes.
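The selection rule described above can be sketched in a few lines. This is not the code from cxl_feature_tracker.py. The function name, the whole-word match, and the case-insensitive comparison are assumptions made for this illustration.

```python
import re

# Match "CXL" or "DAX" as whole words, in any letter case (an assumption).
KEYWORDS = re.compile(r"\b(CXL|DAX)\b", re.IGNORECASE)

# Directories whose changes always count as CXL/DAX related.
WATCHED_DIRS = ("drivers/cxl/", "drivers/dax/")


def is_cxl_related(message, changed_paths):
    """Return True if a commit mentions CXL/DAX or touches a watched directory."""
    if KEYWORDS.search(message):
        return True
    # str.startswith accepts a tuple of prefixes.
    return any(path.startswith(WATCHED_DIRS) for path in changed_paths)
```

For example, `is_cxl_related("cxl/region: Fix decoder setup", [])` returns `True`, and so does a commit with an unrelated message that changes `drivers/dax/bus.c`. Whole-word matching means a message that only says "fsdax" would not match on its own. A looser pattern would catch it at the cost of more noise.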

Using Linux Kernel Tiering with Compute Express Link (CXL) Memory

In this blog post, we will walk through the process of enabling the Linux Kernel Transparent Page Placement (TPP) feature with CXL memory mapped as NUMA nodes using the system-ram namespace. This feature allows the kernel to automatically place pages in different types of memory based on their usage patterns.

Prerequisites

This guide assumes that you are using a Fedora 36 system with Kernel 5.19.13, and that your system has a Samsung CXL device installed. You can confirm the presence of the CXL device with the following command:

Understanding Compute Express Link (CXL) and Its Alignment with the PCIe Specifications

How CXL Uses PCIe Electricals and Transport Layers

CXL utilizes the PCIe infrastructure, starting with PCIe 5.0. This ensures compatibility with existing systems while introducing new features for device connectivity and memory coherency. CXL’s ability to maintain memory coherency across shared memory pools is a significant advancement, allowing for efficient resource sharing and operand movement between accelerators and target devices.

CXL builds upon the familiar foundation of PCIe, utilizing the same physical interfaces, transport layer, and electrical signaling. This shared foundation makes CXL integration with existing PCIe systems seamless. Here’s a breakdown of how it works:

A Practical Guide to Identify Compute Express Link (CXL) Devices in Your Server

In this article, we will provide four methods for identifying CXL devices in your server and show how to determine which CPU socket and NUMA node each CXL device is connected to. We will use CXL memory expansion (CXL.mem) devices for this article. The server was running Ubuntu 22.04.2 (Jammy Jellyfish) with Kernel 6.3 and ‘cxl-cli’ version 75 built from source code. Many of the procedures will work on Kernel versions 5.16 or newer.
