Using Linux Kernel Tiering with Compute Express Link (CXL) Memory

In this blog post, we will walk through the process of enabling the Linux Kernel Transparent Page Placement (TPP) feature with CXL memory mapped as NUMA nodes using the system-ram namespace. This feature allows the kernel to automatically place pages in different memory tiers based on how they are used: frequently accessed (hot) pages stay in, or are promoted to, DRAM, while rarely accessed (cold) pages are demoted to CXL memory.

Prerequisites

This guide assumes that you are using a Fedora 36 system with Kernel 5.19.13, and that your system has a Samsung CXL device installed. You can confirm the presence of the CXL device with the following command:

lspci | grep CXL

Step 1: Verify Automatic Memory Onlining

First, we need to verify if the OS automatically onlines memory. This can be done with the following command:

grep CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE /boot/config-$(uname -r)

If the output is CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y, then the OS is configured to automatically online memory.
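The check above can be wrapped in a small shell function so that a setup script can branch on the result. This is only a minimal sketch: the function name check_auto_online is hypothetical, and it assumes the kernel config file is readable at the path you pass in.

```shell
# Hypothetical helper: report whether a kernel config enables automatic
# onlining of hot-added memory. $1 is the path to the kernel config file.
check_auto_online() {
    config_file="$1"
    if grep -q '^CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y$' "$config_file"; then
        echo "auto-online: enabled"
    else
        echo "auto-online: disabled"
        return 1
    fi
}
```

It could be called as check_auto_online /boot/config-$(uname -r); the non-zero return status in the disabled case lets a script stop or warn before continuing.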

Step 2: Change the Default Memory Zone

Next, we change the default zone used when memory is onlined to ZONE_MOVABLE. The sysfs file is writable only by root, so the write goes through sudo tee (a plain shell redirection would run without sudo's privileges):

echo online_movable | sudo tee /sys/devices/system/memory/auto_online_blocks
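Because the policy only affects memory that is onlined after it is set, it is worth reading the value back before converting any devices. The following is a minimal sketch of a write-and-verify helper; the name set_online_policy is hypothetical, and the optional second argument exists only so the function can be tried against a scratch file.

```shell
# Hypothetical helper: write an onlining policy and read it back.
# $1 = policy (offline, online, online_kernel, or online_movable)
# $2 = target file; defaults to the auto_online_blocks path used above.
# Run as root, since the sysfs file is writable only by root.
set_online_policy() {
    policy="$1"
    target="${2:-/sys/devices/system/memory/auto_online_blocks}"
    printf '%s\n' "$policy" > "$target" || return 1
    # Succeed only if the kernel reports the value we just wrote.
    [ "$(cat "$target")" = "$policy" ]
}
```

From a root shell, set_online_policy online_movable returns success only if the kernel accepted the value.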

Step 3: Convert the Namespace

We then use daxctl to convert the namespace from devdax to system-ram for all CXL devices. This requires root privileges and can be done with the following command:

sudo daxctl reconfigure-device --mode=system-ram --force all
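To confirm the conversion, the device list can be inspected afterwards. The sketch below assumes that daxctl list prints JSON with a "mode" field for each device; the helper name count_system_ram is hypothetical, and it simply counts the entries reported in system-ram mode.

```shell
# Hypothetical helper: count devices reported in system-ram mode.
# Reads the JSON printed by `daxctl list` on standard input.
count_system_ram() {
    # grep -o emits one line per match, so several devices on one
    # JSON line are still counted individually.
    grep -o '"mode": *"system-ram"' | wc -l | tr -d ' '
}
```

A possible invocation is daxctl list | count_system_ram; the result should equal the number of CXL devices you expected to convert.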

Step 4: Verify NUMA Output

At this point, the NUMA output should show two nodes: the single CPU socket with its local DRAM (node 0) and the Samsung CXL device as a memory-only node with no CPUs (node 1). You can check this with the following command:

numactl -H
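A CXL memory node has no CPUs attached, so its "cpus:" line in the numactl output is empty. As a rough sketch, the hypothetical helper below lists such CPU-less nodes; it assumes the usual "node N cpus: ..." line format printed by numactl -H.

```shell
# Hypothetical helper: print the IDs of NUMA nodes that have no CPUs.
# Reads `numactl -H` output on standard input.
cpuless_nodes() {
    # A line such as "node 1 cpus:" has exactly three fields when the
    # CPU list is empty; the second field is the node ID.
    awk '/^node [0-9]+ cpus:/ && NF == 3 { print $2 }'
}
```

On the system described here, numactl -H | cpuless_nodes would be expected to print 1.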

Step 5: Display Memory Blocks by NUMA Node and Zone

You can display the memory blocks by NUMA node and Zone with the following command:

lsmem -o +NODE,ZONES

Step 6: Enable Kernel Transparent Page Placement (TPP)

Finally, we can enable Kernel Transparent Page Placement (TPP). First, check the default setting for page demotions:

cat /sys/kernel/mm/numa/demotion_enabled

If the output is false, enable it with the following command:

echo true | sudo tee /sys/kernel/mm/numa/demotion_enabled

Then, enable promotions by switching NUMA balancing to its memory-tiering mode (value 2):

echo 2 | sudo tee /proc/sys/kernel/numa_balancing

Lastly, enable zone reclaim. With this setting, each NUMA node reclaims pages locally when it runs low on memory, so demotion runs to maintain a minimum number of free pages in each node:

echo 1 | sudo tee /proc/sys/vm/zone_reclaim_mode
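Once the three settings are in place, they can be read back in one pass, and the kernel's demotion and promotion counters can be watched to see whether pages are actually moving between tiers. The helpers below are a sketch only: the names tpp_status and tpp_counters are hypothetical, the optional path arguments exist so the functions can be tried against scratch files, and the counter names matched in /proc/vmstat (those starting with pgdemote or pgpromote) depend on the kernel version.

```shell
# Hypothetical helper: print the three tiering knobs set in this step.
# $1 = filesystem root prefix (empty by default, i.e. the real / tree).
tpp_status() {
    root="${1:-}"
    for knob in sys/kernel/mm/numa/demotion_enabled \
                proc/sys/kernel/numa_balancing \
                proc/sys/vm/zone_reclaim_mode; do
        if [ -r "$root/$knob" ]; then
            printf '%s = %s\n' "/$knob" "$(cat "$root/$knob")"
        else
            printf '%s = (not available)\n' "/$knob"
        fi
    done
}

# Hypothetical helper: show demotion and promotion counters.
# $1 = vmstat file (defaults to /proc/vmstat).
tpp_counters() {
    grep -E '^(pgdemote|pgpromote)' "${1:-/proc/vmstat}"
}
```

After following the steps above, tpp_status should report true, 2, and 1 respectively. Running tpp_counters before and after a memory-intensive workload gives a rough indication of how many pages were demoted and promoted.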

And that’s it! You have now enabled the Linux Kernel Transparent Page Placement (TPP) feature with CXL memory mapped as NUMA nodes using the system-ram namespace.

Please note that this guide is based on a specific system configuration and may need to be adjusted based on your specific hardware and software setup. Always refer to the official documentation for the most accurate and up-to-date information.

Programming Persistent Memory: A Comprehensive Guide for Developers Book

After many months of hard work by everyone involved, I’m very pleased to announce that the book “Programming Persistent Memory: A Comprehensive Guide for Developers” is now available for download in digital PDF and ePUB formats from https://pmem.io/book, and in Kindle and paperback formats through Amazon.

Beginner and experienced programmers will use this comprehensive guide to persistent memory programming. You will understand how persistent memory brings together several new software/hardware requirements, and offers great promise for better performance and faster application startup times, a huge leap forward in byte-addressable capacity compared with current DRAM offerings.

This revolutionary new technology gives applications significant performance and capacity improvements over existing technologies. It requires a new way of thinking and developing, which makes this highly disruptive to the IT/computing industry. The full spectrum of industry sectors that will benefit from this technology includes, but is not limited to, in-memory and traditional databases, AI, analytics, HPC, virtualization, and big data.

Programming Persistent Memory describes the technology and why it is exciting the industry. It covers the operating system and hardware requirements as well as how to create development environments using emulated or real persistent memory hardware. The book explains fundamental concepts; provides an introduction to persistent memory programming APIs for C, C++, JavaScript, and other languages; discusses RDMA with persistent memory; reviews security features; and presents many examples. Source code and examples that you can run on your own systems are included.

What You’ll Learn

- Understand what persistent memory is, what it does, and the value it brings to the industry
- Become familiar with the operating system and hardware requirements to use persistent memory
- Know the fundamentals of persistent memory programming: why it is different from current programming methods, and what developers need to keep in mind when programming for persistence
- Look at persistent memory application development by example using the Persistent Memory Development Kit (PMDK)
- Design and optimize data structures for persistent memory
- Study how real-world applications are modified to leverage persistent memory
- Utilize the tools available for persistent memory programming, application performance profiling, and debugging

Who This Book Is For

This book is for C, C++, Java, and Python developers, and it will also be useful to software, cloud, and hardware architects across a broad spectrum of sectors, including cloud service providers, independent software vendors, high performance compute, artificial intelligence, data analytics, and big data.

CXL Device & Fabric Buyer's Guide: A List of GA Components (2025)

Last Updated: June 27, 2025

This guide provides a curated list of generally available (GA) Compute Express Link (CXL) devices, fabric components, and memory appliances. It is a technical resource for engineers, architects, and hardware specialists looking to identify and compare CXL memory expansion modules, switches, and full system-level appliances from leading vendors. The tables below detail market-ready components, focusing on the specifications required to design and build CXL-enabled infrastructure.

How To Set Linux CPU Scaling Governor to Max Performance

The majority of modern processors are capable of operating in a number of different clock frequency and voltage configurations, often referred to as Operating Performance Points or P-states (in ACPI terminology). As a rule, the higher the clock frequency and the higher the voltage, the more instructions can be retired by the CPU over a unit of time, but also the higher the clock frequency and the higher the voltage, the more energy is consumed over a unit of time (or the more power is drawn) by the CPU in the given P-state. Therefore there is a natural trade-off between the CPU capacity (the number of instructions that can be executed over a unit of time) and the power drawn by the CPU.
