To be Reviewed By: March 31, 2022

Authors: Darrel Schneider

Status: Draft | Discussion | Active | Dropped | Superseded

Superseded by: N/A

Related: N/A

Problem

The Geode ResourceManager needs to obtain information from the JVM about how big the heap is and how much of it is in use. Some JVM vendors may have proprietary APIs that if the ResourceManager used them would cause the ResourceManager to perform its job better. Or Geode users may want to run on a newer release of java that has better ways to obtain this information. But because Geode is built with the standard Java 8 release, it is not able to have calls to APIs that are not part of Java 8. 

Anti-Goals

This proposal is not adding support for new garbage collectors or improving how Geode uses the ResourceManager to do eviction or prevent out of memory exceptions. 

Solution

The ResourceManager only needs two items from the JVM: the maximum size of the heap and the amount of the heap in use. To allow Geode to call APIs that did not exist in Java 8, we will add a single interface, HeapUsageProvider that a user can implement with a class that must have a zero-argument constructor. The user will specify that it wants to use its class by setting the system property "geode.resource-manager.heap-usage-provider.class-name" to the name of their class. This system property would need to be set on each server that they want it to be used on. Their class will also need to be made loadable on the server by adding it to the server's class path or deploying it to the server. Geode will load the class when the Cache is created. If the class fails to load, or a new instance can not be created, or the instance does not implement HeapUsageProvider, then the Cache creation will fail with an exception describing the problem. The single instance created when the Cache is created will be used by Geode until the Cache is closed at which point the HeapUsageProvider close method will be called.

HeapUsageProvider implementations must return meaningful values when asked for the maximum and used heap amounts. It can optionally send notifications to Geode when the used memory changes.

By default Geode supplies its own internal HeapUsageProvider which will be used if the user does not set the "geode.resource-manager.heap-usage-provider.class-name" system property. Before implementing your own HeapUsageProvider it would be wise to look at the Geode source of its internal HeapUsageProvider to see if it can be used instead.

A draft PR of this solution can be found here: https://github.com/apache/geode/pull/7456

Changes and Additions to Public Interfaces

One new public Interface will be added:

package org.apache.geode.cache.control;
import java.util.function.LongConsumer;
/**
* This interface can be implemented with a class
* that is then specified using the {@link #HEAP_USAGE_PROVIDER_CLASS_NAME}
* system property and will then be used by geode to determine how much heap memory
* is in use and what the maximum heap size is.
* This interface provides the heap usage in two ways.
* It must implement {@link #getBytesUsed()} which is used by geode to directly
* ask this provider for the current used memory.
* It can also provide notifications when the usage has changed by calling
* {@link LongConsumer#accept(long)} on the listener passed to startNotifications.
* It should only do this after startNotifications has been
* called and should stop notifications after stopNotifications has been called.
* Notifications are optional. If the underlying JVM does not support notifications
* then startNotifications and stopNotifications can have empty implementations.
*/
public interface HeapUsageProvider {
/**
* This is the name of a system property but this String must be prefixed with either
* "geode." or "gemfire.".
* If set then its value must be the name of a class that
* implements the {@link HeapUsageProvider} interface and has a zero-arg public constructor.
* If the class can not be found or an instance can not be constructed than the Cache startup
* will fail with an exception.
* Otherwise, a single instance will be created and used by Geode's ResourceManager
* to determine if the eviction-heap-threshold or the critical-heap-threshold are exceeded.
* This single instance will be used until the Geode Cache is closed.
*/
String HEAP_USAGE_PROVIDER_CLASS_NAME = "resource-manager.heap-usage-provider.class-name";
/**
* Called by geode when this provider should start providing notifications
* to the given listener.
*/
void startNotifications(LongConsumer listener);
/**
* Called by geode when this provider should stop providing notifications.
*/
void stopNotifications();
/**
* Called by geode when it wants to know what the maximum amount of heap is.
* Geode does not expect this value to change, so it tends to call it once and
* cache the returned value. Provider implementations must implement this method
* to return a reasonable value.
*
* @return the maximum amount of heap memory in bytes
*/
long getMaxMemory();
/**
* Called by geode when it needs to know how many bytes are current used of heap memory.
* Geode will call this method periodically (by default every 500 milliseconds which can
* be changed using the "gemfire.heapPollerInterval" system property).
* Provider implementations must implement this method
* to return a reasonable value.
*
* @return the number of heap memory bytes currently in use
*/
long getBytesUsed();
/**
* Called when this provider will no longer be used by Geode.
*/
void close();
}


Performance Impact

No performance impact is expected.

Backwards Compatibility and Upgrade Path

No problem with rolling upgrades or backward compatibility.

It is worth noting that if the user specifies their own HeapUsageProvider then they will not be using the one that Geode uses by default. Part of the default HeapUsageProvider to check for the "gemfire.ResourceManager.HEAP_POOL" system property. So by specifying your own HeapUsageProvider you cause Geode to ignore the HEAP_POOL property.

Prior Art

The HEAP_POOL system property was Geode's previous attempt to allow customization of the ResourceManager's interaction with the JVM. The problem is it only allows the memory pool name to be customized. It does now allow customization of the APIs that will be called on that MemoryPoolMXBean. 

FAQ

One thing that was considered was to use the standard jdk ServiceLoader to create an instance of this class. But since this solution only wants a single instance of HeapUsageProvider the service solution was going to require both the service definition and some way to specify which of the possible multiple service instances implementing HeapUsageProvider to use. So it seemed better to only allow a single class to be specified.

Errata

What are minor adjustments that had to be made to the proposal since it was approved?

  • No labels

16 Comments

  1. I don't agree with the format of the property name used. I think we should really start considering adding contextual namespaces that would be helpful for a user to understand the scope of the property. "geode.heapUsageProviderClassName" feels too generic. It feels like we did not know where to put this property so it was just lumped into a single large bucket of properties. I would suggest that we look at supporting more contextually relevant information in our property names. see Common Spring properties for examples. "geode.resource-manager.heap-usage-provider.className"  is far more descriptive than just "heapUsageProviderClassName". Also, by providing context things like "geode.resource-manager.heap-usage-provider.name" could be used to describe which HeapUsageProvider is to be used. This is provider name will be package structure agnostic and would follow a pattern closer to that of Java's Cipher class "Cipher.getInstance("AES")".

    In addition to the obvious benefits, providing context allows for possibly adding configuration properties for a specific implementations (which might not be 100% relevant in this case).

    1. Okay. Lets go with "geode.resource-manager.heap-usage-provider.className". The only question I have for you about it is the last "className" part. Did you intentionally want the last component to be camel case or should it be "class-name"?

      1. Sorry... "className" could really be "class-name" or more simply "classname"..... I blame muscle memory overriding brain function

        1. lets go with "class-name" since a quick google search seemed to also have it as two words not one.

  2. Only by reading through the source code can I find any reference to the "default" HeapUsageProvider implementation if the property is not explicitly set. I think it needs to be called out that if the "heapUsageProviderClassName" property is not set, that there would be a default. In this case "MemoryPoolMXBeanHeapUsageProvider". Also, it is not immediately clear what the downside would be if I use the default for non-Java8 JVMs. Are there any side effects?

    1. I added a note about Geode's default behavior. But keep in mind that this RFC is about describing how to customer the ResourceManager not to document its default behavior that existed in the previous release. I still did not mention the name "MemoryPoolMXBeanHeapUsageProvider" since it is internal and free to change. Also keep in mind that this RFC is to allow users to have a HeapUsageProvider that calls APIs not available on java8. But this does not mean that if you are running on java17 that you have to provide your own HeapUsageProvider. It is beyond the scope of this RFC to say which java versions Geode supports running on.

    2. PS: the backwards compatibility section did calls this out: "It is worth noting that if the user specifies their own HeapUsageProvider then they will not be using the one that Geode uses by default."

      1. I think that this needs to be called out in the solution. The backwards compatibility section is more aimed at "have you thought about backwards compatibility" and how the proposed solution will impact it. If the proposed solution states that Geode will have a default implementation if one is not specified, then the backward compatibility section can reference that aspect.

        You are correct that possibly the current implementation does not need to mentioned by name/classname, so that statement of "if none is specified a default will be provided" will be sufficient.

  3. The "getBytesUsed" method throws an IllegalStateException, looking at the code, it seems that the interface defines the IllegalStateException due to the default implementation throwing an IllegalStateException. To the point, do we expect each implementation of the HeapUsageProvider to throw IllegalStateExceptions or would it not make more sense that there is a generic failure exception that is thrown, documented and correctly handled by the "HeapMemoryMonitor".

    Would it also not make more sense that the construction/instance creation of the HeapUsageProvider were to fail if there is a failure scenario rather than a downstream/ almost unrelated method call. Handling an error related to the creation of the HeapUsageProvider at runtime/usage of the instance is too late and feels wrong.

    1. This was not due to the default implementation of HeapUsageProvider, but it was due to the existing contract the ResourceManager implementation has. The RFC clearly stated that if the HeapUsageProvider constructor throws on exception that it will cause the Cache creation to fail. So if an implementor of HeapUsageProvider wants to fail fast they have them option clearly spelled out.

      But if they instead only want to fail if the HeapUsageProvider is actually used by Geode (it is possible to have a HeapUsageProvider but to not enable the eviction-heap-threshold or the critical-heap-threshold on the ResourceManager in which case Geode will never use the provider) then they can do this by throwing IllegalStateException from getBytesUsed.

      I think you are right that this should not be part of the contract we promise to uphold for this feature. I think anyone implementing this interface will only be doing it if they are also enabling the ResourceManager. So for this RFC the constructor contract is all we need. I'll remove the IllegalStateException from the interface.

      1. I think that something so fundamentally important (my opinion), needs to fail at start up time, if there is a failure. Eviction can be configured at runtime, having a failure at runtime would be an experience that I would hate to have. Possibly restarting the system might be out of my control at that time, which means that one could not use eviction if failure detection is at the time of use. UNLESS you had a way to fix the problem, without having to restart the system. Which is a feature I don't believe we are considering supporting at this time.

  4. Overall this RFC feels a little "light" on detail. I feel that we need a little more detail relating to overall design of this feature. Without reading code, or having prior knowledge/context, it is hard to evaluate if this approach is "good". Could you please add detail around that area. A simple "Component Diagram" could easily describe the relationships. Because this RFC does not describe anything outside of the HeapUsageProvider. Missing things like:

    • How this relates to the HeapMemoryMonitor
    • How there is a specific 1:1 relationship with a HeapUsageProvider
    • Who creates HeapUsageProvider? In the PR there was mention of a possible Factory, which is not presented here.
    1. I wanted this RFC to focus on what a user needs to know about implementing and using this new interface. I added some info to make clear that only a single instance of the HeapUsageProvider will be created when the cache is created and I added a close method that will be called when the instance will no longer be used by Geode.

      The RFC is clear that the HeapUsageProvider is created when the Cache is by calling the class's zero-arg constructor. I don't think details of how Geode internally calls this zero-arg constructor, or the constructor it uses for its internal default HeapUsageProvider or it should instead use an internal factory should be part of this RFC. It will be part of the PR review but so will many other things related to having clean and testable code that are all beyond the scope of this RFC (I think).

      So I also think the internal HeapMemoryMonitor does not need to be mentioned since it is just part of the existing internal implementation of the ResourceManager and could go away without needing to make any changes to this RFC.

      I think getting into all those internal implementation details just makes it harder to focus in this RFC on the HeapUsageProvider which is really what this RFC is all about.

      1. I think describing the eco system in which a component will function does not detract from the solution. A component diagram can easily describe the 1:1 relationship of HeapMemoryMonitor and HeapUsageProvider. That said, I agree it might not be a requirement to mention the HeapMemoryManager, but I think it provides more clarity around usage.

        But that said, you might be right that discussing factories, etc might be more implementation detail than an abstract description of the solution.

  5. In the case of the point made in the FAQ, I think using ServiceLoader could address all the concerns that you have raised. The only difference here is that a ServiceLoader will instantiate every service it finds. So to avoid having a "heavy" default constructor, it might be better served to have an "init" method on the service interface to avoid unnecessary initialization of the HeapUsageProvider. As for the "targeted" implementation, this can be achieved with a simple.


    if(heapUsageProviderInstance.getClass().getName().equals("targetClassName")){
      heapUsageProvider.init();
    }

    OR when using an identifier

    if(heapUsageProviderInstance.getIdentifier().equals("targetIdentifier")){
      heapUsageProvider.init();
    }
  6. I agree that we could use the ServiceLoader by making the contract with the user more complex. They need to do at least two additional things:

    1. Define the service in a provider-configuration file in the resource directory META-INF/services. This is simple but another thing for them to do.
    2. implement the init method. They still need a zero-arg constructor but now they have to think about what they should do in the constructor vs. the init method.

    If an identifier is added then that is a 3rd thing they need to implement.

    These are not show stopper issues but they do add things the user could get wrong. If this additional complexity also gave us some benefit then I'd say we should do it. But the typical user will not add their own HeapUsageProvider. Users that do add one will only add one. Even if they want to have multiple on their class path that is not a problem with the RFC proposal. So the only benefit I see is if in our implementation we needed to write new code that might get loading and constructing the class wrong. But Geode already has that code and has been using it for this same application.