Enhance Off-heap memory fragmentation visibility

To be Reviewed By: February 23th, 2022

Authors: Alberto Gomez

Status: Discussion

Superseded by: N/A

Related: N/A

Problem

Geode supports the storage of data in off-heap memory by using the Java Unsafe class.

At startup, each Geode server with off-heap memory configured creates several off-heap fragments (each with a maximum of 2GB of size) so that any request to allocate an object could get a piece of any of the available fragments to store the object.
A list of tiny and huge memory chunks is also maintained by Geode so that off-heap memory freed when deleting an off-heap object will be put in one of these two lists (depending on the amount of allocated memory) for further allocation requests of objects of the same size.

The possibility to store an object in off-heap memory not only depends on the amount of free off-heap memory available but also on how fragmented the off-heap memory is, i.e. what is the size available in each of the fragments, how many elements are there in the tiny and huge lists and what is the size of each.
Depending on how fragmented the off-heap memory is, it might be that objects of a certain size will not be able to be stored in it no matter what the total available off-heap memory is.

Even though Geode offers several stats related to the status of the off-heap memory area (usedMemory, freeMemory, fragmentation, largestFragment and fragments), the ones that provide information about the level of the fragmentation of the off-heap memory (fragmentation, largestFragment and fragments) are only updated when defragmentation is executed.

Given that the defragmentation of the off-heap memory area is only run -automatically- whenever a request to store an object cannot be satisfied, the fragmentation status (given by the above stats) is only known right after the server has been started (no fragmentation) or right after the memory has been defragmented.

Running defragmentation can be a very CPU intensive task (and Java GC demanding) and it blocks incoming off-heap memory allocation requests. Therefore, being timely aware of the fragmentation status of the off-heap memory can allow Geode users to plan for measures to avoid an automatic defragmentation to kick in at an inconvenient time. A possible measure could be to restart the servers to defragment the off-heap memory at a convenient time.

Anti-Goals

This RFC does not propose to add the possibility to run off-heap defragmentations on demand or to improve the defragmentation algorithm. Nevertheless, some ideas in that direction will be proposed in another RFC.

Solution

This RFC proposes to make timely visible the off-heap fragmentation status by:

Updating the largestFragment stat periodically.
Adding a new off-heap stat called "freedChunks" that will provide the number of elements in the tiny and huge lists. This stat will be periodically updated.

Ideally, the two stats above should be updated in real time. Nevertheless, given that the calculation could be very costly, it is proposed that it is done periodically, in a background thread, in order to limit the CPU consumption and to not affect the latency of allocations and deallocations.

These stats would be available via JMX and gfsh.

These stats could be described as follows in the Geode documentation:

largestFragment: Largest off-heap memory fragment that can be used to allocate an object. When an object is allocated from a fragment, the size of the fragment is decreased with the size of the object plus some heading and possibly some alignment bytes.
freedChunks: Number of off-heap memory chunks that have been freed since the last defragmentation and that are not currently being used to store an object in the off-heap memory space.
There are two types of chunks depending on the size of them: tiny if the size is lower than OFF_HEAP_ALIGNMENT* OFF_HEAP_FREE_LIST_COUNT (default = 512K) and huge if the size is greater.
These chunks can also be used to allocate objects in the off-heap but the size of the freedChunk must comply with some conditions with respect to the object to be allocated. In order to allocate an object of tiny size between X and X-OFF_HEAP_ALIGMENT-1, a tinty freedChunk of size X must be found. In order to allocate an object of huge size between X and X-256-1, a huge freedChunk of size X must be found.

Other information that could be added to the off-heap documentation could be:

When an off-heap object is freed, a freedChunk entry is created in either the tiny or the huge list depending on the freed memory.
When defragmentation is run, Geode takes all freedChunks and remains of fragments and creates a new set of fragments by concatenating adjacent freedChunks and fragments.

Changes and Additions to Public Interfaces

A new off-heap stat will be available: freedChunks

This stat and the current largestFragment one will be available via JMX and via gfsh.

Space shortcuts

Page tree

Problem

Anti-Goals

Solution

Changes and Additions to Public Interfaces

Performance Impact

Backwards Compatibility and Upgrade Path

Prior Art

FAQ

Errata

9 Comments

Darrel Schneider

Alberto Gomez

Darrel Schneider

Alberto Gomez

Darrel Schneider

Alberto Gomez

Alberto Gomez

Darrel Schneider

Alberto Gomez