Resolving commands 'Killed' on GCP f1-micro Compute Engine instances

When I want to perform a quick task, I generally spin up a GCP Compute Engine instance as they’re cheap. However, they have limited resources, particularly memory. When refreshing the package repositories, it’s quite easy to encounter an Out-of-Memory (OOM) situation that results in the command - yum or dnf - being ‘Killed’. For example:

$ sudo dnf update 
CentOS Stream 8 - AppStream                                                                                                  8.3 MB/s |  18 MB     00:02    
CentOS Stream 8 - BaseOS                                                                                                      13 MB/s |  16 MB     00:01    
CentOS Stream 8 - Extras                                                                                                      69 kB/s |  16 kB     00:00    
Google Compute Engine                                                                                                         20 kB/s | 9.4 kB     00:00    
Google Cloud SDK                                                                                                              24 MB/s |  43 MB     00:01    
Killed

dmesg has a lot of information about the situation, but the key line that confirms dnf caused the OOM event is:

[ 1156.249100] Out of memory: Killed process 1538 (dnf) total-vm:638020kB, anon-rss:290432kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1244kB oom_score_adj:0
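If the message has already scrolled past, a quick way to find OOM kill events in the kernel log is to filter dmesg:

# Search the kernel ring buffer for OOM kill messages
sudo dmesg | grep -i 'out of memory'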

Many of the OS images provided by GCP and other cloud providers do not include a swap device. That is fine for the larger instances, but swap may be required on instances with less memory.

To resolve the situation, create a swap device for the instance. The following adds 1GB of swap, which is typically enough for the dnf and yum commands.

# Create a 1GB file to back the swap device
sudo fallocate -l 1G /swapfile
# Restrict access to root before enabling swap
sudo chmod 0600 /swapfile
# Format the file as swap space
sudo mkswap /swapfile
# Enable the swap device
sudo swapon /swapfile

Note: The above is not persistent across reboots, so you’ll want to add an entry to /etc/fstab to ensure the swap device is activated on each boot, e.g.:

/swapfile none                    swap    defaults        0 0
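To confirm the swap device is active, both now and after a reboot, you can check with the standard swapon and free utilities:

# List active swap devices
swapon --show
# Show memory and swap usage in human-readable units
free -h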

Building NDCTL Utilities from Source: A Comprehensive Guide

Building NDCTL with Meson on Ubuntu 24.04

The NDCTL package includes the cxl, daxctl, and ndctl utilities and uses the Meson build system for streamlined compilation. This guide reflects the modern build process for these tools, which manage NVDIMM, CXL, and PMEM devices, on Ubuntu 24.04.

If you are running the kernel provided by the distro, compiling these utilities from source is not recommended. If you have installed a mainline kernel, however, you will likely need a newer version of these utilities to match it. See the NDCTL Releases page, where the kernel support information is provided for each version.
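As a rough sketch of what the full post covers, the Meson workflow typically looks like the following. The dependency list is an assumption and varies by release, so check the ndctl README for the authoritative set:

# Build prerequisites (package names are an assumption; see the ndctl README)
sudo apt install meson ninja-build libkmod-dev libudev-dev uuid-dev libjson-c-dev libkeyutils-dev
# Fetch the source
git clone https://github.com/pmem/ndctl.git
cd ndctl
# Configure, compile, and install with Meson
meson setup build
meson compile -C build
sudo meson install -C build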

Read More
Benchmarking GPUs: Measuring Throughput between CPU and GPU

This article was inspired by a LinkedIn post by Dennis Kennetz. The CPU-to-GPU bandwidth check is available on GitHub and uses a specific flow to assess data transfer rates. Like many in the industry, my focus is on AI and ML workloads and how we can improve efficiency and performance using DRAM, CXL, CPUs, GPUs, and software improvements.

In the rapidly evolving landscape of artificial intelligence (AI) and machine learning (ML), the ability to process vast amounts of data efficiently is paramount. As AI models grow in complexity and size, the demand for high-performance computing resources intensifies. At the heart of this demand lies the crucial task of optimizing data transfers between various components of a computing system, particularly from DRAM, CPU, and emerging technologies like CXL (Compute Express Link) to and from the GPU.
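For a quick baseline of host-to-device throughput on your own system, NVIDIA’s cuda-samples repository ships a bandwidthTest utility. The path below reflects recent layouts of that repository and is an assumption; older releases organize the samples differently:

# Fetch NVIDIA's CUDA samples (building requires the CUDA toolkit)
git clone https://github.com/NVIDIA/cuda-samples.git
# Sample path assumed from recent repo layouts; it differs in older releases
cd cuda-samples/Samples/1_Utilities/bandwidthTest
make
# Reports host-to-device, device-to-host, and device-to-device bandwidth
./bandwidthTest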

Read More
Running Open WebUI and Ollama on Ubuntu 22.04 for a Local ChatGPT Experience

Introduction

Open WebUI and Ollama are powerful tools that allow you to create a local chat experience using GPT models. Whether you’re experimenting with natural language understanding or building your own conversational AI, these tools provide a user-friendly interface for interacting with language models. In this guide, we’ll walk you through the installation process step by step.

Ollama is a cutting-edge platform designed to run open-source large language models locally on your machine. It simplifies the complexities involved in deploying and managing these models, making it an attractive choice for researchers, developers, and anyone who wants to experiment with language models. Ollama provides a user-friendly interface for running large language models (LLMs) locally, specifically on macOS and Linux (with Windows support on the horizon).
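As a preview of the steps in the full post, a typical setup installs Ollama with its official script and runs Open WebUI as a container. The Docker flags below follow the Open WebUI README at the time of writing and should be checked against the current docs:

# Install Ollama via its official script
curl -fsSL https://ollama.com/install.sh | sh
# Pull and chat with a model locally (the model name is just an example)
ollama run llama3
# Run Open WebUI in Docker, pointing it at Ollama on the host
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data --name open-webui \
  ghcr.io/open-webui/open-webui:main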

Read More