Background

Geode allows querying data stored in the regions through the usage of a query language, namely OQL. This query language provides, between other things, the ability to invoke methods on the objects stored within the region as part of the query itself.

This particular ability was not limited to specific calls, but it allowed any method invocation on the objects, including setters (which, by definition, change the object itself) and, through reflection, internal Geode and JDK methods available on the classpath. Due to the range of methods that could be invoked, several security risks were identified that could impact the integrity of the data, the integrity of the region, and/or the integrity of the platform running Geode. As part of the effort to mitigate these security flaws, GEODE-3247 was created and fixed in Geode 1.3.0 such that the OQL engine now operates through an all or nothing approach: when a SecurityManager is enabled Geode throws a NotAuthorizedException whenever a method, not belonging to the following whitelist, is invoked:

  • String object: any method.
  • Boolean object: booleanValue.
  • Region.Entry object: getKey or getValue.
  • Any Object: equals, compareTo, or toString.
  • Number object: intValue, longValue, shortValue, etc.
  • Date or Timestamp object: after, before, getNanos, or getTime.
  • Map, Collection, or Region object: keySet, entrySet, values, containsKey or get.

This approach has proven to be way too restrictive (no getter methods can be invoked on deployed objects, as an example) and lacks the ability the customize the behavior, leaving the user stuck with the all or nothing configuration.

Objectives / Goals

- Enhance the OQL engine to support customizing the whitelist of allowed methods.
- Enable the cluster configuration service to store, delete and change the customized whitelist through gfsh commands.

Current Behavior

There's a single internal interface, MethodInvocationAuthorizer, responsible of verifying whether a method can be invoked on a certain object for a particular region. The authorizer is tied to the QueryService and accessed / invoked by the classes MethodDispatch and AttributeDescriptor every time the query engine needs to invoke a object's method as part of the OQL execution.

Geode instantiates the DefaultQueryService class every time the user requests a QueryService for executing OQL queries, within the initialization:

  • If the security manager is not enabled, or if the flag QueryService.allowUntrustedMethodInvocation is set as true, the MethodInvocationAuthorizer is configured as a no-op class.
  • If the security manager is enabled and the flag QueryService.allowUntrustedMethodInvocation is set as false (default), the RestrictedMethodInvocationAuthorizer is configured. Whenever invoked, this class first verifies that the method is included within the list of whitelisted methods and, if it is, it checks whether the user has the required privileges to execute queries on that particular region (DATA:READ:regionName).

Proposal

Query Engine

Split the MethodInvocationAuthorizer in two different interfaces:

  1. RegionInvocationAuthorizer: internal with a default implementation provided out of the box. It should just check whether the user can execute queries on the region (DATA:READ:regionName).
  2. MethodInvocationAuthorizer: public, the default implementation out of the box will be RestrictedMethodInvocationAuthorizer. This interface is intended to be implemented by users whenever they want a more fine grained control about which methods should be whitelisted, it can extend the default implementation or totally override it. The interface will have only one method and it should throw NotAuthorizedException (non-checked exception) whenever it detects that the user should not be allowed to execute that particular method during the OQL execution. The whole information about the method to invoke and the target class can be obtained from the Method class itself, so the signature won't have the region as part of the parameters.

      public interface MethodInvocationAuthorizer extends Declarable {

        /**
        * Checks whether the method is authorized to be executed or not.
        * The implementation should only consider the method and the object on which it will be executed,
        * region access has been verified earlier.
        *
        * @param method The method to authorize.
        * @throws NotAuthorizedException If the method can't be executed due to security restrictions.
        */
        void authorizeMethodInvocation(Method method) throws NotAuthorizedException;

            }

The DefaultQueryService will use an empty implementation of both interfaces if security is disabled or the flag QueryService.allowUntrustedMethodInvocation is set as true (as it works right now). If security is enabled the default RegionInvocationAuthorizer will be used, and the MethodInvocationAuthorizer will be set as the default (blacklists everything) or, if present, the custom implementation will be used. In runtime, the RegionInvocationAuthorizer will be always invoked first; if it succeeds and the user has privileges to query the region, the configured implementation of MethodInvocationAuthorizer will be executed.

Once the split is done, Geode could provide several implementations out of the box to match common use cases (like whitelisting methods specified by custom patterns), and even integrate some or all of those implementations with gfsh commands to manage the behavior in runtime.

Configuration

Add <query-service> as a new element to the schema definition.

The <query-service> element must be defined at the cache level and it will contain any configuration related to OQL, including the custom MethodInvocationAuthorizer. This will also allow further additions and configuration options in the future, or even replace the current system properties used to configure the QueryService with new child elements and/or attributes in the <query-service>. The element would look like the following (the properties are just examples):

<cache>
  <query-engine>
    <queryVerbose>true</queryVerbose>
    <queryHeterogeneousObjects>true</queryHeterogeneousObjects>
    <allowUntrustedMethodInvocation>false</allowUntrustedMethodInvocation>
    <method-authorizer>
      <class-name>com.customer.MyMethodAuthorizer</class-name>
        <parameter name="url">
          <string>jdbc:mysql://myHost/whitelistDatabase</string>
        </parameter>
    </method-authorizer>
  </query-engine>
<cache>

This new element and its properties should be stored as part of the cluster configuration service, and should also be modifiable through gfsh commands.

  • No labels

14 Comments

  1. I like having the notion of method authorizer to have user define what methods can invoked and what not. It's the configuration that's a bit complicated. User needs to deploy a class to the server (or start the server with that jar on the classpath) first and then configure it using gfsh or cache.xml. I don't think we want user to change this configuration after the server is running. What about making query.service.method.authorizer a gemfire system property?

    1. Hello Jinmei Liao,

      Having the system property is doable, but that also requires deploying the class to the server and modifying the startup scripts (or java source code) to set the environment accordingly anyway, so there's no gain in usability using that approach. Having a separate configuration element for the query engine seems the right approach if you ask me, even if we might not want to change the method authorizer in runtime at the moment, that could change in the future and, moreover, we could use the element to entirely configure other aspects of the query engine in the future (new features or flags as example), so introducing the new element now will ease the addition of new flags/features.

      Just my two cents.

  2. Have you considered using annotations instead of configuration?

    1. No I haven't... do you mean creating a new annotation, let's say MethodAuthorizer, and scan every deployed jar to check whether the class is supposed to be used as the custom method authorizer implementation?. Or adding a new annotation, let's say AuthorizedMethod, that a user must add to the allowed methods on their domain model and somehow keep that information on the objects so the OQL engine knows that the method can be invoked in runtime?.

  3. This really comes down to where we draw the trust boundary.  At the same time we need to balance usability and security.

    I'm not sure that allowing any get* method invocation is safe.  On the other hand forcing the user to specify configuration via XML, callbacks, or annotations is an additional usability cost.

    Can we automatically detect that a jar was deployed by a privileged user and therefore trusted?  I think trying to protect against malicious code deployed by a user who has appropriate privileges is unattainable.

  4. Hello Anthony Baker,

    I guess we could try detecting all jars deployed by the user and directly trust any get* method on those objects, but what if the user has some custom accessor methods, no just plain get*, that they want to execute during an OQL query?, a simple calculation based on the object's attributes as an example, or anything else they want to allow during the query execution. These use cases are completely left out if we choose to trust only getters.

    Agreed about the usability... however, the two currently offered approaches will still be there: allow everything or deny almost everything (RestrictedMethodInvocationAuthorizer), and they require practically nothing on the user side. This new configuration should only be added whenever the user wants a more fine grained control about the method authorization, and we can deliver an out of the box implementation (like with PDX) to cover some basic use cases (allow only methods from objects deployed through gfsh deploy, allow methods that match a regular expression, etc.).

    Cheers.

  5. We can separate the implementation approach from the end-user UX.  As a developer I want zero-config, no intrusion on my domain objects.  Queries should just work (smile).

    I agree that get* is an arbitrary and probably insufficient convention that is also potentially exploitable.  What I'm suggesting is just trusting user-deployed code.  Any method (maybe excepting getClass) is invokable as long as we know the code was deployed by a user with correct privs.

    1. Hello Anthony Baker,

      Sounds good to me, but I still believe some uses cases are left out if we don't provide a way for users to inject their own implementation of the MethodInvocationAuthorizer interface... as Jason Huynhpreviously said: customers can deploy a bunch of domain classes to the servers and store them within the regions, but they might also want to restrict the list of methods from those objects that can be executed within an OQL (queries are widely used through PULSE and gfsh by operators instead of developers).

      What about adding an out of the box implementation of the MethodInvocationAuthorizer to only trust get* on user-deployed code and those methods already whitelisted in RestrictedMethodInvocationAuthorizer, using this implementation as the default whenever a security-manager is enabled?. This will guarantee zero-config + no intrusion on customer's domain objects, and queries will just work out of the box using the defaults. If the user wants to customize further the behavior, they can just provide a new implementation of MethodInvocationAuthorizer and do whatever is needed.

      Thoughts?.


  6. I am a bit worried about allowing any method on a deployed class.  Mostly due to setters and any mutators.  A trusted user may deploy a domain object with getters/setters but not necessarily expect a user with read privileges to be able to invoke a setter.

    I do like the annotations idea, allowing a developer to annotate methods they allow anyone with read/query privileges to invoke.  Not sure if there are any downsides to that approach.  


  7. How would annotations work if the user never intends to load the class, like for pdx?  If the user did puts with pdx objects and does not have the class on the server, the oql engine would not have annotations to determine what fields are callable.  I guess that's one downside of annotations?

    1. If the domain classes don't exist on the server, we don't need to worry about preventing access right?  The trust boundary continues to exclude the domain classes because they don't exist.  However, queries that use PDX serialized data should still work because  PDXInstance is inside the trust boundary.  It's only when the domain objects are present that we need to define the trust boundary.

    2. It is a moot point anyway since you cannot invoke a method on a PDX serialized object without causing a deserialization first.

      For example, I might have a Person class with an "age" property defined as...

      class Person {
      
      
        LocalDateTime birthDate;
      
      
        public int getAge() {
          return Period.between(birthDate.toLocalDate(), LocalDate.now()).getYears();
        }
      }

      Clearly the "age" property is not going to have a "field" in the PdxInstance.  So, in order to determine a person's age in a query...

      SELECT p.getAge() FROM /People p WHERE p.firstName = $1 AND p.lastName = $2 AND ...

      Then PDX must deserialize PDX bytes to a `Person` type first.

  8. Jason Huynh Anthony Baker : any updates on this?, how should we proceed regarding the proposal?.

  9. I don't think we've got consensus yet on what the new trust boundary should be and whether that needs to allow user configuration.