Out-of-Memory (OOM) Event

The Linux kernel constantly monitors a system's memory usage. If usage reaches a critical threshold, the kernel invokes its "out of memory" (OOM) routine.

To keep the system as a whole stable, this routine typically terminates the most memory-intensive processes at that point in time. It does not stop processes in a coordinated manner, which can harm data integrity, for example in database services.

If these "out of memory" events happen regularly, we strongly recommend to upgrade the system memory.

Notifications

We will notify you of all "out of memory" events that occur on your system. You will receive a separate notification for each process (e.g., Java, MySQL, PHP) that caused an "out of memory" event in the past six hours.

Often Affected Processes

Since the Linux kernel prefers to terminate processes that use a lot of memory, some processes are affected more than others:

  • MySQL
  • Java
  • User space processes, for example Atlassian software
  • OpenSearch / Elasticsearch
  • PHP-FPM in the webserver context
  • PHP CLI processes, for example PHP executed by cron jobs

Please keep in mind that the resource usage of these processes depends on the number of requests to your website or the amount of data being processed. In the vast majority of cases, the root cause of these "out of memory" events lies within the application environment.

Linux Memory Usage and Memory Usage Display

When using tools like top, htop or free to check memory usage, it is easy to be misled by the metrics shown. These are:

  • total: Total memory available
  • used: Memory used by services "directly"
  • free: Unused memory, excluding buff/cache
  • shared: Mostly irrelevant for systems managed by Nine
  • buff/cache: Memory used by kernel buffers, file and page caches and shared memory segments
  • available: Estimate of the memory available for new workloads without swapping; includes reclaimable buff/cache

One often neglected metric is buff/cache. Some services do not allocate memory in a way that the kernel accounts for under used. Notable examples include PostgreSQL and NFS: these services work almost exclusively through the file and page caches. This can be misleading, as it may suggest that a system is oversized.

As an example, one of our customer servers with 128 GB memory running PostgreSQL shows these metrics:

root@redacted:~ # free -m
               total        used        free      shared  buff/cache   available
Mem:          128752        9823        4959       19814      113970       98170
Swap:           7811          75        7736

The fact that less than 10 GB of memory is used, yet there are over 110 GB in the buffers and cache, demonstrates how deceptive it can be to only look at one of these metrics.
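In this example, the three areas add up exactly to the total: 9823 MB used + 4959 MB free + 113970 MB buff/cache = 128752 MB. Almost all of the memory is in use, just not in the category one might expect.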

Another system with 512 GB memory running MySQL shows these metrics:

root@redacted:~ # free -m
               total        used        free      shared  buff/cache   available
Mem:          515773      392181        7565           3      120592      123591
Swap:           8191        1714        6477

In stark contrast to the PostgreSQL system, the majority of the memory usage is found in the used column, while the buff/cache area is still used intensively.

While the kernel can potentially free up some memory in the buff/cache area, it's important to understand that this comes with side effects, such as higher latencies caused by more hard disk reads and writes.
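To see how much memory the kernel itself expects to be able to reclaim, the available metric (MemAvailable in /proc/meminfo) is more telling than free. A quick check, assuming a reasonably recent kernel that exposes MemAvailable (all values are reported in kB):

grep -E '^(MemTotal|MemFree|MemAvailable|Buffers|Cached)' /proc/meminfo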

A reasonably sized environment should always have some headroom in the form of buff/cache usage. Very low buff/cache usage combined with used memory close to the total memory often indicates a shortage, with no room for growth or unforeseen events. In fact, these are the systems on which we see "out of memory" events most regularly.

High Swap usage also indicates an issue, especially in smaller environments.

This usually becomes an issue once Swap usage exceeds ~50%, the used memory is close to the total memory and there is little to no buff/cache in use. Heavy Swap usage causes high latencies, as Swap content needs to be written to and read from disk, and should generally be avoided.

A slight Swap usage in the range of a few megabytes is not concerning, though. The kernel might decide to move parts of the used memory to the Swap area, for example when it sees very little use for them.
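If you want to know which processes currently hold memory in the Swap area, the VmSwap field in /proc/<pid>/status can be summarized. A minimal sketch, to be run as root so that all processes are readable (values are in kB):

for f in /proc/[0-9]*/status; do
  awk '/^Name:/ {name=$2} /^VmSwap:/ {print $2, name}' "$f"
done | sort -rn | head -n 10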

We know it's challenging to interpret and understand metrics within each system's context, especially when a system serves more than one purpose. We are happy to guide you through this process and will recommend a sizing that's right for your application and system.

Database Service Memory Usage

Database services are a critical part of application performance. Regardless of the chosen database engine, database services aim to minimize hard disk reads for requested data by using internal caches.

Database developers recommend allocating up to 70% of a system's memory for caching. Ideally, the caches are large enough to hold the entire database contents. As a database grows in size, this might not be possible, or not feasible within a given budget.

This recommendation applies to systems that only run a database service.

In most cases, a system will run multiple services alongside the database service, such as a web server, a PHP environment and a key value store. Nine therefore automatically adjusts the database cache configuration to fit the chosen system size.

It is common to see database services use between 40% and 60% of a system's memory.

High memory utilization by a database service does not indicate a performance issue. In fact, it's the desired state, as performance for every application depending on the database service could otherwise significantly degrade.
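If you want to check which cache size a database service is currently configured with, the engine itself reports it. Two examples, assuming MySQL/MariaDB or PostgreSQL respectively and a user with sufficient privileges (the setting names are engine-specific):

mysql -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
sudo -u postgres psql -c "SHOW shared_buffers;"

Keep in mind that PostgreSQL additionally relies heavily on the operating system's page cache (see the buff/cache discussion above), so shared_buffers alone does not reflect its full cache footprint.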

Show Current Memory Usage

You can get an overview of the memory usage with the following shell command:

www-data@server:~ # ps -eo pid,cmd,%cpu,%mem --sort=-%mem | head -n 11

    PID CMD                         %CPU %MEM
    986 /usr/sbin/mysqld --daemoniz  0.1  7.3
 125872 ruby2.5 /usr/lib/hello-worl  0.0  2.0
 234301 ruby2.5 /usr/lib/find-file-  0.0  1.4
 208475 ruby2.5 /usr/lib/find-dir-a  0.0  1.2
    310 ruby2.5 /usr/lib/find-dir-b  0.0  1.0
   1325 ruby2.5 /usr/lib/find-dir-c  0.0  0.9
 125826 ruby2.5 /usr/lib/find-dir-d  0.0  0.9
 126039 ruby2.5 /usr/lib/find-dir-e  0.1  0.9
   2089 ruby2.5 /usr/lib/find-file-  0.1  0.8
 166352 ruby2.5 /usr/lib/exec-comma  0.4  0.7

This shows the 10 processes using the most memory. Please note that this is a snapshot of a single moment in time; depending on the application or service, memory usage can vary by a significant margin within a short time frame.
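If you want to observe how memory usage develops over time rather than at a single moment, a simple approach is to repeat the measurement at a fixed interval, for example:

watch -n 5 free -m

or, for a rolling summary that also includes Swap activity:

vmstat 5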

Order Additional Memory

If you need additional memory or have questions about this message, please feel free to contact us: