Intel Optane Persistent Memory Modules report "Non-functional" state in ipmctl

Issue

Executing ipmctl show-dimm to get device information shows the persistent memory modules in a ‘Non-functional’ health state, eg:

# ipmctl show -dimm

 DimmID | Capacity | HealthState    | ActionRequired | LockState | FWVersion
=============================================================================
 0x0001 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x0011 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x0021 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x0101 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x0111 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x0121 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x1001 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x1011 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x1021 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x1101 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x1111 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A
 0x1121 | 0.0 GiB  | Non-functional | N/A            | N/A       | N/A

Other ipmctl commands may fail and return “No functional DIMMs in the system.”, eg:

# ipmctl show -sensor health

No functional DIMMs in the system.

Applies To

This issue applies to systems with the following components:

  • Linux

  • Intel Optane Persistent Memory

  • ipmctl utility

Cause(s)

The ipmctl-show-device(1) man page has the following description for ‘Non-functional’:

  • Non-functional: The DCPMM is present but is non-responsive via the DDRT communication path. It may be possible to communicate with this DCPMM via SMBus for a subset of commands.

Note: Linux does not support the SMBus interface. It only supports the DDRT protocol (default). Attempts to use the -smbus option will fail with the following error:

# ipmctl show -smbus -dimm

The following protocol -smbus is unsupported on this OS. Please use -ddrt instead.

There are several possible causes for this issue, including:

  • The operating system does not support persistent memory. Refer to the Operating System Support Matrix to confirm compatibility.

  • The BIOS does not support NFIT (NVDIMM Firmware Interface Table)

  • The nfit and libnvdimm drivers are not loaded in Linux. This can occur when using custom kernels where the drivers have been manually excluded from the kernel config.

Solution(s)

Here are some troubleshooting steps to verify the working functionality of the persistent memory hardware and operating system.

Step 1 - Verify the Persistent Memory is seen by the BIOS

Reboot the host and enter the BIOS (usually F1 or F2 during power-on). Navigate to the Memory section and you should see both the DDR and Persistent Memory (DDRT) modules listed, eg:

BIOS Main Menu showing DDR and DCPM (Persistent Memory) capacity

BIOS -> Advanced -> Memory Configuration showing DIMM Information

If you have access to the BMC or equivalent, such as iDRAC on Dell systems, you can get the information without rebooting the host:

Intel BMC showing DIMM Information (DDR & DCPMM Persistent Memory)

Step 2 - Verify the NFIT bus is available in the Kernel

If you have the ndctl utility installed, use it to determine whether the NFIT bus is available. If so, you should see output similar to the following:

# ndctl list --buses
[
  {
    "provider":"ACPI.NFIT",
    "dev":"ndbus0",
    "scrub_state":"idle"
  }
]

If ndctl returns no output, then this is the cause of the ‘Non-functional’ health status. See ‘Step 3’ to verify the nfit driver is loaded.

Step 3 - Verify the ’nfit’ driver is loaded

There are several drives that will be loaded by the Kernel depending on the operational mode of Persistent Memory.

The following shows the loaded drivers for a Linux operating system when persistent memory is in AppDirect mode:

# lsmod | egrep -i "nv|pmem"
dax_pmem               16384  0
dax_pmem_core          16384  1 dax_pmem
nd_pmem                24576  0
nd_btt                 28672  1 nd_pmem
nvme                   49152  0
nvme_core             110592  1 nvme
libnvdimm             192512  5 dax_pmem,dax_pmem_core,nd_btt,nd_pmem,nfit

Here’s what to expect for a system in Memory Mode:

# lsmod | egrep -i "nv|pmem"
nvme                   49152  0
nvme_core             110592  1 nvme
libnvdimm             192512  1 nfit

If the ’nfit’ driver is not shown, it is not loaded. Both ndctl and ipmctl use the nfit driver to communicate with the persistent memory through the BIOS. A failure to communicate with the BIOS or modules will result in a ‘Non-functional’ health state.

Step 4 - Determine if the drivers are installed

For all Linux distros that support persistent memory, the drivers should be available and loadable. If you use a custom kernel, it’s very likely the kernel configuration file disabled the required NVDIMM feature and the drivers were not built. Confirm that your Linux distribution has the necessary ‘acpi’, ’nfit’, and *pmem* drivers installed.

Most Linux distros keep their drivers in /lib/modules/$(uname -r)/kernel/drivers. On Fedora, for example, the drivers are installed by default:

# ls /lib/modules/$(uname -r)/kernel/drivers/acpi/nfit
nfit.ko.xz

# cd /lib/modules/$(uname -r)/kernel/drivers
$ find . -print |
 egrep -i "nvdimm|pmem|dax|acpi"
./nvdimm
./nvdimm/nd_pmem.ko.xz
./nvdimm/nd_e820.ko.xz
./nvdimm/libnvdimm.ko.xz
./nvdimm/nd_blk.ko.xz
./nvdimm/nd_btt.ko.xz
./dax
./dax/dax_hmem.ko.xz
./dax/device_dax.ko.xz
./dax/kmem.ko.xz
./dax/pmem
./dax/pmem/dax_pmem.ko.xz
./dax/pmem/dax_pmem_core.ko.xz
./acpi
./acpi/sbs.ko.xz
./acpi/sbshc.ko.xz
./acpi/apei
./acpi/apei/einj.ko.xz
./acpi/acpi_tad.ko.xz
./acpi/video.ko.xz
./acpi/acpi_configfs.ko.xz
./acpi/ec_sys.ko.xz
./acpi/custom_method.ko.xz
./acpi/acpi_ipmi.ko.xz
./acpi/nfit
./acpi/nfit/nfit.ko.xz
./acpi/acpi_pad.ko.xz

If the acpi, nfit, nvdimm, dax, and pmem directories are missing, the drivers are missing or were disabled when the kernel was built. This will result in communication issues to the BIOS and persistent memory modules, resulting in unexpected errors from ndctl and ipmctl.

The drivers are delivered as part of the Kernel package. On Fedora31, for example, we see the nfit driver is delivered by the kernel itself:

# dnf provides /lib/modules/$(uname -r)/kernel/drivers/acpi/nfit/nfit.ko.xz
Last metadata expiration check: 1:20:08 ago on Fri 29 May 2020 03:21:17 PM MDT.
kernel-core-5.6.13-200.fc31.x86_64 : The Linux kernel
Repo        : @System
Matched from:
Filename    : /lib/modules/5.6.13-200.fc31.x86_64/kernel/drivers/acpi/nfit/nfit.ko.xz

kernel-core-5.6.13-200.fc31.x86_64 : The Linux kernel
Repo        : updates
Matched from:
Filename    : /lib/modules/5.6.13-200.fc31.x86_64/kernel/drivers/acpi/nfit/nfit.ko.xz

Step 5 - Determine if your kernel was configured to support persistent memory

See my previous post on ‘How To Verify Linux Kernel Support for Persistent Memory ’. The NVDIMM Wiki on Kernel.org also has a section to help custom kernel developers.

Set up the correct kernel configuration options for PMEM and DAX in .config. To use huge pages for mmapped files, you’ll need CONFIG_FS_DAX_PMD selected, which is done automatically if you have the prerequisites marked below.

Options in make menuconfig:

  • Device Drivers - NVDIMM (Non-Volatile Memory Device) Support

    • PMEM: Persistent memory block device support

    • BLK: Block data window (aperture) device support

    • BTT: Block Translation Table (atomic sector updates)

  • Enable the block layer

    • Block device DAX support
  • File systems

    • Direct Access (DAX) support
  • Processor type and features

    • Support non-standard NVDIMMs and ADR protected memory

    • Transparent Hugepage Support

    • Allow for memory hot-add

      • Allow for memory hot remove
    • Device memory (pmem, HMM, etc…) hotplug support

CONFIG_ZONE_DEVICE=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_ACPI_NFIT=m
CONFIG_X86_PMEM_LEGACY=m
CONFIG_OF_PMEM=m
CONFIG_LIBNVDIMM=m
CONFIG_BLK_DEV_PMEM=m
CONFIG_BTT=y
CONFIG_NVDIMM_PFN=y
CONFIG_NVDIMM_DAX=y
CONFIG_FS_DAX=y
CONFIG_DAX=y
CONFIG_DEV_DAX=m
CONFIG_DEV_DAX_PMEM=m
CONFIG_DEV_DAX_KMEM=m

You can check what options were used to configure the Kernel using the config file that usually resides in the /boot file system. There are many more kernel options available than listed in the Wiki. For example, on Kernel 5.6.13, we have:

# egrep -i 'zone|memory|hugepage|nfit|pmem|nvdimm|btt|dax' /boot/config-$(uname -r)
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_ZONE_DMA32=y
CONFIG_ZONE_DMA=y
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
# CONFIG_ARCH_MEMORY_PROBE is not set
CONFIG_X86_PMEM_LEGACY_DEVICE=y
CONFIG_X86_PMEM_LEGACY=m
# CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_DYNAMIC_MEMORY_LAYOUT=y
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_ACPI_HOTPLUG_MEMORY=y
CONFIG_ACPI_NFIT=m
# CONFIG_NFIT_SECURITY_DEBUG is not set
CONFIG_ACPI_APEI_MEMORY_FAILURE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_BLK_DEV_ZONED=y
CONFIG_BLK_DEBUG_FS_ZONED=y
# Memory Management options
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_MEMORY_BALLOON=y
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_ZONE_DEVICE=y
# end of Memory Management options
CONFIG_NF_CONNTRACK_ZONES=y
# LPDDR & LPDDR2 PCM memory drivers
# end of LPDDR & LPDDR2 PCM memory drivers
# CONFIG_ZRAM_MEMORY_TRACKING is not set
CONFIG_DM_ZONED=m
# Memory mapped GPIO drivers
# end of Memory mapped GPIO drivers
# MemoryStick drivers
# MemoryStick Host Controller Drivers
# CONFIG_VIRTIO_PMEM is not set
# CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not set
# CONFIG_MEMORY is not set
CONFIG_LIBNVDIMM=m
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_BTT=m
CONFIG_BTT=y
CONFIG_NVDIMM_PFN=y
CONFIG_NVDIMM_DAX=y
CONFIG_NVDIMM_KEYS=y
CONFIG_DAX_DRIVER=y
CONFIG_DAX=y
CONFIG_DEV_DAX=m
CONFIG_DEV_DAX_PMEM=m
CONFIG_DEV_DAX_HMEM=m
CONFIG_DEV_DAX_KMEM=m
# CONFIG_DEV_DAX_PMEM_COMPAT is not set
# CONFIG_ZONEFS_FS is not set
CONFIG_FS_DAX=y
CONFIG_FS_DAX_PMD=y
# Memory initialization
# end of Memory initialization
# Default contiguous memory area size:
CONFIG_ARCH_HAS_PMEM_API=y
# Memory Debugging
CONFIG_DEBUG_MEMORY_INIT=y
# end of Memory Debugging

How To Set Linux CPU Scaling Governor to Max Performance

How To Set Linux CPU Scaling Governor to Max Performance

The majority of modern processors are capable of operating in a number of different clock frequency and voltage configurations, often referred to as Operating Performance Points or P-states (in ACPI terminology).

Read More
A Practical Guide to Identify Compute Express Link (CXL) Devices in Your Server

A Practical Guide to Identify Compute Express Link (CXL) Devices in Your Server

In this article, we will provide four methods for identifying CXL devices in your server and how to determine which CPU socket and NUMA node each CXL device is connected.

Read More
Understanding Compute Express Link (CXL) and Its Alignment with the PCIe Specifications

Understanding Compute Express Link (CXL) and Its Alignment with the PCIe Specifications

How CXL Uses PCIe Electricals and Transport Layers CXL utilizes the PCIe infrastructure, starting with the PCIe 5.

Read More