Out-of-Memory (OOM) Event
The Linux kernel constantly monitors a system's memory usage. If usage reaches a critical threshold, the kernel invokes its "out of memory" (OOM) routine, also known as the OOM killer.
To avoid instability of the entire system, this routine typically terminates the most memory-intensive processes at that point in time. It does not stop processes in a coordinated manner, which can harm data integrity, for example in database services.
If these "out of memory" events happen regularly, we strongly recommend upgrading the system memory.
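To see whether the kernel's OOM routine has fired recently, you can inspect the kernel log yourself; the kernel also publishes a per-process OOM score under /proc. A minimal sketch using standard Linux tooling (the grep pattern is an assumption about typical kernel log wording, which varies by kernel version):

```shell
# Search the kernel log for OOM killer activity; may need root,
# so permission errors are silenced here
dmesg -T 2>/dev/null | grep -i "out of memory" || true

# The kernel's OOM score for the current shell; processes with
# higher scores are preferred victims when memory runs out
cat /proc/self/oom_score
```

On systemd-based systems, `journalctl -k` gives the same kernel log with persistent history across reboots.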
Notifications
We will notify you of all "out of memory" events that occur on your system. You will receive a separate notification for each process (e.g., Java, MySQL, PHP) that caused an "out of memory" event in the past six hours.
Often Affected Processes
Since the Linux kernel prefers to terminate processes that use a lot of memory, some processes are affected more than others:
- MySQL
- Java
- User space processes, for example Atlassian software
- OpenSearch / Elasticsearch
- PHP-FPM in the webserver context
- PHP CLI processes, for example PHP executed by cron jobs
Please take into consideration that the resource usage of these processes depends on the number of accesses to your web page or the amount of data processed. In the vast majority of cases, the root cause of these "out of memory" events lies within the application environment.
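To gauge how much memory one of these process families currently uses, you can sum the resident set size (RSS) of all matching processes. A hedged sketch using standard ps and awk; "bash" is only a stand-in pattern, on your system you would use e.g. php-fpm, mysqld, or java:

```shell
# Sum the resident memory (RSS) of all processes whose command name
# matches a pattern, and print the total in MiB
pattern="bash"   # stand-in; replace with php-fpm, mysqld, java, ...
ps -eo rss=,comm= | awk -v p="$pattern" \
    '$2 ~ p { sum += $1 } END { printf "%d MiB\n", sum / 1024 }'
```

RSS overstates totals slightly when processes share memory pages, but it is usually good enough to spot the heavy consumers.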
Linux Memory Usage and Memory Usage Display
When using tools like (h)top or free to check memory usage, there is a fair chance of being misled by the metrics shown. These are:
- total: total memory available
- used: memory allocated "directly" by services
- free: unused memory, excluding buff/cache
- shared: mostly irrelevant for systems managed by Nine
- buff/cache: memory used by kernel buffers, file and page caches and shared memory segments
- available: an estimate of the memory available for starting new workloads without swapping
One often neglected metric is buff/cache. Some services don't allocate memory in a way the kernel reports in the used category; notable examples include PostgreSQL and NFS. These services allocate almost exclusively file and page caches. This can be misleading, as it may suggest that a system is oversized.
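The numbers free reports come from the kernel's /proc/meminfo, which you can also read directly. A minimal sketch (the field names are standard /proc/meminfo keys on Linux; values are reported in kiB):

```shell
# The raw kernel counters behind free(1)
grep -E '^(MemTotal|MemFree|MemAvailable|Buffers|Cached|SwapTotal|SwapFree):' /proc/meminfo
```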
As an example, one of our customer servers with 128 GB memory running PostgreSQL shows these metrics:
root@redacted:~ # free -m
total used free shared buff/cache available
Mem: 128752 9823 4959 19814 113970 98170
Swap: 7811 75 7736
The fact that less than 10 GB of memory is used, yet there are over 110 GB in the buffers and cache, demonstrates how deceptive it can be to only look at one of these metrics.
Another system with 512 GB memory running MySQL shows these metrics:
root@redacted:~ # free -m
total used free shared buff/cache available
Mem: 515773 392181 7565 3 120592 123591
Swap: 8191 1714 6477
In stark contrast to the PostgreSQL system, the majority of memory usage is found in the used column, while the system still makes intense use of the buff/cache area.
While the kernel can potentially free up some memory in the buff/cache area, it's important to understand that this comes with side effects, such as higher latencies caused by more hard disk reads and writes.
A reasonably sized environment should always have some spare room in the form of used buff/cache.
Very low buff/cache usage combined with used memory close to the total memory often indicates a shortage, with no room for growth or unforeseen events. In fact, these are the systems on which we see "out of memory" events most regularly.
High Swap usage also indicates an issue, especially for smaller environments. This usually becomes a problem when Swap usage exceeds ~50%, the used memory is close to the total memory, and there is little to no buff/cache in use. Heavy use of Swap causes high latencies, as Swap content needs to be written to and read from disk, which should generally be avoided.
A slight usage of Swap in the range of a few megabytes isn't concerning, though. The kernel may decide to move parts of the used memory to the Swap area, for example when its algorithm sees very little use for them.
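The swap usage percentage described above can be computed directly from the kernel counters. A small sketch, assuming the standard SwapTotal/SwapFree fields in /proc/meminfo:

```shell
# Print swap usage as a percentage of the configured swap area
awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2} \
     END { if (t > 0) printf "swap used: %.1f%%\n", (t - f) * 100 / t; \
           else print "no swap configured" }' /proc/meminfo
```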
We know it's challenging to interpret and understand metrics within each system's context, especially when a system serves more than one purpose. We are happy to guide you through this process and will recommend a sizing that's right for your application and system.
Database Service Memory Usage
Database services are a critical part of application performance. Regardless of the chosen database engine, database services aim to minimize hard disk reads for requested data by using internal caches.
Database developers recommend allocating up to 70% of a system's memory for caching. Ideally, the caches are large enough to hold the entire database contents. As a database grows in size, this might not be possible, or not feasible within a given budget.
This recommendation applies to systems that only run a database service.
In most cases, a system will run multiple services alongside the database service, such as a web server, a PHP environment and a key value store. Nine therefore automatically adjusts the database cache configuration to fit the chosen system size.
It is common to see database services use between 40% and 60% of a system's memory.
High memory utilization by a database service does not indicate a performance issue. In fact, it's the desired state, as performance for every application depending on the database service could otherwise significantly degrade.
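For MySQL, most of this cache memory sits in the InnoDB buffer pool. The fragment below is a purely illustrative my.cnf excerpt: the file path, value, and percentage are assumptions for a hypothetical host, and on systems managed by Nine this setting is adjusted automatically to the chosen system size.

```ini
# Illustrative only -- e.g. a drop-in file under /etc/mysql/conf.d/
[mysqld]
# Roughly 40-60% of RAM on a mixed-use host; here: 8G on a 16 GB system
innodb_buffer_pool_size = 8G
```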
Show Current Memory Usage
You can get an overview of the memory usage with the following shell command:
www-data@server:~ # ps -eo pid,cmd,%cpu,%mem --sort=-%mem | head -n 11
PID CMD %CPU %MEM
986 /usr/sbin/mysqld --daemoniz 0.1 7.3
125872 ruby2.5 /usr/lib/hello-worl 0.0 2.0
234301 ruby2.5 /usr/lib/find-file- 0.0 1.4
208475 ruby2.5 /usr/lib/find-dir-a 0.0 1.2
310 ruby2.5 /usr/lib/find-dir-b 0.0 1.0
1325 ruby2.5 /usr/lib/find-dir-c 0.0 0.9
125826 ruby2.5 /usr/lib/find-dir-d 0.0 0.9
126039 ruby2.5 /usr/lib/find-dir-e 0.1 0.9
2089 ruby2.5 /usr/lib/find-file- 0.1 0.8
166352 ruby2.5 /usr/lib/exec-comma 0.4 0.7
This shows the 10 processes using the most memory. Please note that this is a snapshot of a single moment; depending on the application or service, memory usage might vary by a significant margin within a short time frame.
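To see how much usage fluctuates, you can sample memory repeatedly instead of relying on a single snapshot. A small sketch (the interval and sample count are arbitrary; it reads /proc/meminfo directly, so it works even where free is unavailable):

```shell
# Sample "used" memory (MemTotal - MemAvailable, in MiB) three times,
# one second apart, to see short-term variation
for i in 1 2 3; do
    awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2} \
         END { printf "%d MiB used\n", (t - a) / 1024 }' /proc/meminfo
    sleep 1
done
```

For continuous observation, wrapping the ps command from above in `watch` serves the same purpose.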
Order Additional Memory
If you need additional memory or have questions about this message, please feel free to contact us: