Top-Level Goal

The top-level goal is a single API for managing cluster configuration.

The beneficiaries of this work are those who want to change the configuration of the cluster (create or destroy regions, indices, gateway receivers/senders, etc.) and have these changes replicated on all the applicable servers and persisted in the cluster configuration service. In addition to developers building Geode-based applications, the target user group includes developers working on different parts of the Geode code such as Spring Data for Apache Geode, queries for Lucene index, or storage for the JDBC connector.

Problem Statement

In the current implementation:

  • Most cluster configuration tasks are possible, but only by coordinating XML-based configuration files, properties files, and gfsh commands.
  • Many of the desired outcomes are achievable through multiple paths.
  • Establishing a consistent configuration and persisting it across the cluster is difficult, sometimes impossible.

Product Goals 

The developer should be able to:

  • Create regions/indices on the fly.

  • Persist the configuration and apply it to the cluster (when a new node joins, it has the config; when the server restarts, it has the config)

  • Obtain a consistent view of the current configuration

  • Apply the same change to the cluster in the same way

  • Be able to change the configuration in one place

  • Obtain this configuration without being on the cluster

Proposed Solution

The proposed solution includes:

  • Address the multiple-path issue by presenting a single public API for configuring the cluster, including such tasks as creating a region, destroying an index, or updating an async event queue.
  • Provide a means to persist the change in the cluster configuration.
  • Save a configuration to the Cluster Management Service without having to restart the servers
  • Obtain the cluster management service from a cache when calling from a client or a server
  • Pass a config object to the cluster management service
  • Use CRUD operations to manage config objects

This solution should meet the following requirements:

  • The user needs to be authenticated and authorized for each API call, based on the resource they are trying to access.

  • The user can call the API from either the client side or the server side.

  • The outcome (behavior) is the same on both client and server:

    •  It takes effect cluster-wide.

    •  It is idempotent.

What We Have Now

Our admin REST API "sort of" already serves this purpose, but it has these shortcomings:

  1. It's not a public API
  2. The API is restricted to the operations implemented as gfsh commands, as the argument to the API is a gfsh command string.
  3. Each command does similar things, yet commands may not be consistent with each other.

Below is a diagram of the current state of things:

[Gliffy diagram: commands]

Given the current state of the commands, it's not easy to extract a common interface for all of them, and developers do not want to use gfsh command strings as a "makeshift" API to call into the commands. We need a unified interface and a unified workflow for all the commands.

Proposal

We propose a new Cluster Management Service (CMS) which has two responsibilities:

  • Update the runtime configuration of the servers (if any are running)
  • Persist the configuration (cluster configuration has to be enabled to use CMS)

Note that in order to use this API, Cluster Configuration needs to be enabled.


[Gliffy diagram: highlevel]

The CMS API is exposed as a new endpoint as part of "Admin REST APIs", accepting configuration objects (JSON) that need to be applied to the cluster. CMS adheres to the standard REST semantics, so users can use POST, PATCH, DELETE and GET to create, update, delete or read, respectively. The API returns a JSON body that contains a message describing the result along with standard HTTP status codes.

Management REST API

Create region End Point (implemented)

API | Status Code | Response Body

Endpoint: http://locator:8080/geode-management/v2/regions

Method: POST

Headers: Authorization

Permission Required: DATA:MANAGE

Body:

Code Block
languagejava
titleRequest Body
{
  "regionConfig": {
      "name": "Foo",
      "type": "REPLICATE",
      "group": "optional-group-name" 
  }
}

The types supported by this REST API are defined in RegionType:

Code Block
languagejava
titleRequest Body
public enum RegionType {
  PARTITION,
  PARTITION_REDUNDANT,
  PARTITION_PERSISTENT,
  PARTITION_REDUNDANT_PERSISTENT,
  PARTITION_OVERFLOW,
  PARTITION_REDUNDANT_OVERFLOW,
  PARTITION_PERSISTENT_OVERFLOW,
  PARTITION_REDUNDANT_PERSISTENT_OVERFLOW,
  PARTITION_HEAP_LRU,
  PARTITION_REDUNDANT_HEAP_LRU,

  PARTITION_PROXY,
  PARTITION_PROXY_REDUNDANT,

  REPLICATE,
  REPLICATE_PERSISTENT,
  REPLICATE_OVERFLOW,
  REPLICATE_PERSISTENT_OVERFLOW,
  REPLICATE_HEAP_LRU,

  REPLICATE_PROXY
}
200
Code Block
languagejava
titleSuccess Response
{
  "memberStatuses" : {
    "server-1" : {
      "success" : true,
      "message" : "success"
    }
  },
  "statusCode" : "OK",
  "statusMessage" : "successfully persisted config for cluster",
  "successful" : true
}


409
Code Block
languagejava
titleName conflict
{
  "memberStatuses" : { },
  "statusCode" : "ENTITY_EXISTS",
  "statusMessage" : "cache element Foo already exists.",
  "successful" : false
}
400
Code Block
languagejava
titleError Response - missing required parameter
{
  "memberStatuses" : { },
  "statusCode" : "ILLEGAL_ARGUMENT",
  "statusMessage" : "Name of the region has to be specified.",
  "successful" : false
}
Code Block
languagejava
titleError Response - invalid parameter
{
  "memberStatuses" : { },
  "statusCode" : "ILLEGAL_ARGUMENT",
  "statusMessage" : "Region names may not begin with a double-underscore: __Foo__",
  "successful" : false
}
401
Code Block
languagejava
titleError Response
{
  "memberStatuses" : { },
  "statusCode" : "UNAUTHENTICATED",
  "statusMessage" : "Authentication error. Please check your credentials",
  "successful" : false
}
403
Code Block
languagejava
titleError Response
{
  "memberStatuses" : { },
  "statusCode" : "UNAUTHORIZED",
  "statusMessage" : "user not authorized for DATA:MANAGE",
  "successful" : false
}
500
Code Block
languagejava
titleError Response
{
  "memberStatuses" : { },
  "statusCode" : "ERROR",
  "statusMessage" : "cluster persistence service is not running",
  "successful" : false
}
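
The pairing of HTTP status and "statusCode" in the sample responses above can be summarized in a small lookup. This is only a sketch: the enum name `StatusCodeSketch` and the fallback to ERROR for unlisted statuses are assumptions made for illustration, and only the values that appear in the samples above are included.

```java
// Hypothetical mirror of the status codes shown in the sample responses above.
// It is not the Geode enum; it lists only the values that appear in this section.
enum StatusCodeSketch {
  OK(200),
  ENTITY_EXISTS(409),
  ILLEGAL_ARGUMENT(400),
  UNAUTHENTICATED(401),
  UNAUTHORIZED(403),
  ERROR(500);

  private final int httpStatus;

  StatusCodeSketch(int httpStatus) {
    this.httpStatus = httpStatus;
  }

  int httpStatus() {
    return httpStatus;
  }

  // Finds the code paired with an HTTP status.
  // Assumption: any status not listed above is treated as ERROR.
  static StatusCodeSketch fromHttpStatus(int status) {
    for (StatusCodeSketch code : values()) {
      if (code.httpStatus == status) {
        return code;
      }
    }
    return ERROR;
  }
}
```

A client could use such a lookup to branch on the outcome without parsing "statusMessage".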

Notes:

  • The create (POST) endpoint is not idempotent: you will receive a 409 when creating a region with the same name a second time.
  • If the group name is "cluster" or omitted, the region will be created on all the data members in the cluster.

401 and 403 responses are omitted for the rest of the end points.
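
For readers who want to call the create endpoint from plain Java, the following sketch builds (but does not send) the request with the JDK's `java.net.http` API (Java 11+). The class `CreateRegionRequestSketch` and its methods are hypothetical, HTTP Basic credentials are an assumption, and the body follows the "regionConfig" wrapper shown in the request body above. Values are not JSON-escaped, so it is suitable only for simple names.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Geode: builds a create-region request.
final class CreateRegionRequestSketch {

  // Builds the JSON body. The "group" field is left out when no group is given,
  // which, per the note above, creates the region on all data members.
  static String body(String name, String type, String group) {
    StringBuilder json = new StringBuilder();
    json.append("{\"regionConfig\":{\"name\":\"").append(name)
        .append("\",\"type\":\"").append(type).append("\"");
    if (group != null && !group.isEmpty()) {
      json.append(",\"group\":\"").append(group).append("\"");
    }
    return json.append("}}").toString();
  }

  // Assumption: credentials travel as HTTP Basic in the Authorization header.
  static HttpRequest build(String baseUrl, String user, String password, String jsonBody) {
    String token = Base64.getEncoder()
        .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
    return HttpRequest.newBuilder(URI.create(baseUrl + "/geode-management/v2/regions"))
        .header("Content-Type", "application/json")
        .header("Authorization", "Basic " + token)
        .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
        .build();
  }
}
```

The request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, and the returned status compared against the codes listed above.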

List members end point (Implemented) 

...

Endpoint: http://locator:8080/geode-management/v2/members

Method: GET

Headers: Authorization

Permission Required: CLUSTER:READ

...

Code Block
languagejava
titleSuccess Response
{
	"memberStatuses": {},
	"statusCode": "OK",
	"statusMessage": null,
	"result": [{
		"class": "org.apache.geode.management.configuration.MemberConfig",
		"id": "locator-0",
		"host": "10.118.19.10",
		"pid": "51876",
		"cacheServers": [{...}],
		"locator": true,
		"coordinator": true
	}, {
		"class": "org.apache.geode.management.configuration.MemberConfig",
		"id": "server-1",
		"host": "10.118.19.10",
		"pid": "51877",
		"cacheServers": [{...}],
		"locator": false,
		"coordinator": false
	}]
}

...

Endpoint: http://locator:8080/geode-management/v2/members?id=server-1

Method: GET

Headers: Authorization

Permission Required: CLUSTER:READ

...

Code Block
languagejava
titleSuccess Response
{
	"memberStatuses": {},
	"statusCode": "OK",
	"statusMessage": null,
	"result": [{
		"class": "org.apache.geode.management.configuration.MemberConfig",
		"id": "server-1",
		"host": "10.118.19.10",
		"pid": "51877",
		"cacheServers": [{...}],
		"locator": false,
		"coordinator": false
	}]
}

...

Endpoint: http://locator:8080/geode-management/v2/members?id=Non-Existent

Method: GET

Headers: Authorization

Permission Required: CLUSTER:READ

...

Code Block
languagejava
titleSuccess Response
{
	"memberStatuses": {},
	"statusCode": "OK",
	"statusMessage": null,
	"result": []
}

Get members end point (Implemented) 

...

Endpoint: http://locator:8080/geode-management/v2/members/server-1

Method: GET

Headers: Authorization

Permission Required: CLUSTER:READ

...

Code Block
languagejava
titleSuccess Response
{
	"memberStatuses": {},
	"statusCode": "OK",
	"statusMessage": null,
	"result": [{
		"class": "org.apache.geode.management.configuration.MemberConfig",
		"id": "server-1",
		"host": "10.118.19.10",
		"pid": "51877",
		"cacheServers": [{...}],
		"locator": false,
		"coordinator": false
	}]
}

...

Endpoint: http://locator:8080/geode-management/v2/members/Non-Existent

Method: GET

Headers: Authorization

Permission Required: CLUSTER:READ

...

Code Block
languagejava
titleSuccess Response
{
	"memberStatuses": {},
	"statusCode": "ENTITY_NOT_FOUND",
	"statusMessage": "Unable to find the member with id = Non-Existent",
	"result": []
}
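
The member responses above differ in how they treat an unknown id: the list endpoint with an `id` query parameter returns "OK" with an empty result, while the get endpoint returns "ENTITY_NOT_FOUND". The in-memory sketch below illustrates that difference only. `MemberLookupSketch` and its `Result` type are hypothetical and stand in for the real service.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the two member lookups shown above.
final class MemberLookupSketch {

  static final class Result {
    final String statusCode;
    final List<String> result;

    Result(String statusCode, List<String> result) {
      this.statusCode = statusCode;
      this.result = result;
    }
  }

  private final List<String> memberIds;

  MemberLookupSketch(List<String> memberIds) {
    this.memberIds = new ArrayList<>(memberIds);
  }

  // GET /members?id=X : the id is a filter, so "no match" is still OK with an empty list.
  // A null filter lists every member.
  Result list(String idFilter) {
    List<String> matches = new ArrayList<>();
    for (String id : memberIds) {
      if (idFilter == null || id.equals(idFilter)) {
        matches.add(id);
      }
    }
    return new Result("OK", matches);
  }

  // GET /members/X : the id names a single resource, so "no match" is ENTITY_NOT_FOUND.
  Result get(String id) {
    if (memberIds.contains(id)) {
      return new Result("OK", List.of(id));
    }
    return new Result("ENTITY_NOT_FOUND", List.of());
  }
}
```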

Root End Point (Not implemented)

...

Endpoint: http://locator:8080/geode-management/v2

Method: GET

Headers: Authorization

...

200

Code Block
languagejava
titleSuccess Response
{
    "number_of_locators": 3,
	"number_of_servers": 8,
	"region_url": "/geode/v2/regions",
	"gateway_receiver_url": "/geode/v2/gwr",
	"gateway_sender_url": "/geode/v2/gws"
}



...

Code Block
languagejava
titleError Response
{
    "message": "Missing authentication credential header(s)"
}

...

Code Block
languagejava
titleError Response
{
    "message": "User1 not authorized for CLUSTER:READ"
}

List End Point (not implemented)

...

Endpoint: http://locator:8080/geode-management/v2/regions

Method: GET

Headers: Authorization

...

200

How does it work


On the locator side, the configuration service framework will just handle the workflow. It's up to each individual ClusterConfigElement to implement how it needs to be persisted and applied. 

[Gliffy diagram: Overview]


This is what happens inside the LocatorClusterManagementService for a create operation: 

[Gliffy diagram: create flow chart]


This is what happens inside the LocatorClusterManagementService for a list operation: 

[Gliffy diagram: list flow chart]

Code Block
languagejava
titleSuccess Response
{
    "Total_results": 10,
    "Regions" : [
     {
       "Name": "Foo",
       "Url": "/geode/v2/regions/Foo"
     },
     ...
     ]
}

...

Code Block
languagejava
titleError Response
{
    "message": "Missing authentication credential header(s)"
}

...

Code Block
languagejava
titleError Response
{
    "message": "User1 not authorized for CLUSTER:READ"
}

Describe End Point (Not Implemented)

...

Endpoint: http://locator:8080/geode-management/v2/regions/Foo

Method: GET

Headers: Authorization

...

200

Code Block
languagejava
titleSuccess Response
{
    "Name": "Foo",
    "Data_Policy": "partition",
    "Hosting_Members": [
      "s1",
      "s2",
      "s3"
      ],
    "Size": 0,
    "Indices": [
     {
     "Id": 111,
     "Url": "/geode/v2/regions/Customer/index/111"
     }
    ]

}

...

Code Block
languagejava
titleError Response
{
    "message": "Missing authentication credential header(s)"
}

...

Code Block
languagejava
titleError Response
{
    "message": "User1 not authorized for CLUSTER:READ"
}

...

Code Block
languagejava
titleError Response
{
     "message": "Region with name '/Foo' does not exist"
}

Update End Point (not implemented)

...

Endpoint: http://locator:8080/geode-management/v2/regions/Foo

Method: PATCH

Headers: Authorization

Body:

Code Block
languagejava
titleRequest Body
{
  "regionConfig": {
      "gateway_sender_id": ["1","2"]
  }
}

...

200

Code Block
languagejava
titleSuccess Response
{
  "Metadata": {
    "Url": "/geode/v2/regions/Foo"
  }
}

...

Code Block
languagejava
titleError Response
{
    "message": "Invalid parameter specified"
}

...

Code Block
languagejava
titleError Response
{
    "message": "Missing authentication credential header(s)"
}

...

Code Block
languagejava
titleError Response
{
    "message": "User1 not authorized for DATA:MANAGE"
}

...

Code Block
languagejava
titleError Response
{
    "message": "Region with name '/Foo' does not exist"
}

...

Code Block
languagejava
titleError Response
{
    "message": "Failed to update region /Foo because of <reason>"
}

Delete End Point (Not Implemented)

...

Endpoint: http://locator:8080/geode-management/v2/regions/Foo

Method: DELETE

Headers: Authorization

...

204

...

<Successful deletion>

...

Code Block
languagejava
titleError Response
{
    "message": "Region with name '/Foo' does not exist"
}

...

Code Block
languagejava
titleError Response
{
    "message": "Missing authentication credential header(s)"
}

...

Code Block
languagejava
titleError Response
{
    "message": "User1 not authorized for DATA:MANAGE"
}

...

Code Block
languagejava
titleError Response
{
    "message": "Failed to delete region /Foo because of <reason>"
}

Note that the DELETE endpoint is idempotent – i.e. it should be a NOOP if the region does not exist.
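
The contrast between the non-idempotent create and the idempotent delete can be sketched with a small in-memory model. `RegionStoreSketch` is hypothetical and returns status names as plain strings for illustration. It models only the behavior described in the notes, not the actual service.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for the region endpoints.
final class RegionStoreSketch {
  private final Set<String> regions = new HashSet<>();

  // Not idempotent: creating the same name a second time reports ENTITY_EXISTS
  // (the 409 case of the create endpoint).
  String create(String name) {
    return regions.add(name) ? "OK" : "ENTITY_EXISTS";
  }

  // Idempotent: deleting a region that does not exist is a no-op and still succeeds.
  String delete(String name) {
    regions.remove(name);
    return "OK";
  }

  boolean exists(String name) {
    return regions.contains(name);
  }
}
```

With this model, repeating a delete leaves the state unchanged and reports success both times, whereas repeating a create changes the reported outcome.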

Let's look at some code to see how users can use this service. The example below shows how to create a region using CMS.

Curl (any standard REST client)

Code Block
languagejava
titleCurl
curl [-v] [-u user[:password]] -H "Content-Type: application/json" http://<locator.host>:7070/geode-management/v2/regions -XPOST -d '
{
  "name": "Foo",
  "type": "PARTITION",
  "group": "optional-group-name"
}'


Sample to copy/paste:
curl -H "Content-Type: application/json" http://localhost:7070/geode-management/v2/regions -XPOST -d '{"name": "Foo","type": "PARTITION"}'

curl -H "Content-Type: application/json" http://localhost:7070/geode-management/v2/regions -XPOST -d '{"name": "Foo","type": "PARTITION", "group": "optional-group-name"}'





Java Client

To ease interaction with the REST endpoint, we provide a Java client version of the Cluster Management Service. Here is an example of getting an instance of this service and using it in any Java client code. You will need to have geode-management.jar on your classpath.

For the definition of the region types, refer to the following class:

org.apache.geode.cache.configuration.RegionType
Code Block
languagejava
titleJava client
public static void main(String[] args) {
  String regionName = args[0];

  ClusterManagementService cms = ClusterManagementServiceProvider.getService("localhost", 7070);

  BasicRegionConfig config = new BasicRegionConfig();
  config.setName(regionName);
  config.setType(RegionType.PARTITION);
  config.setGroup("optional-group-name");

  ClusterManagementResult result = cms.create(config);

  if (!result.isSuccessful()) {
    throw new RuntimeException(
        "Failure creating region: " + result.getStatusMessage());
  }
}

The above example interacts with a Cluster Management Service REST endpoint that has neither SSL nor security enabled. To manage a cluster that has security and SSL enabled, you will need to provide an SSLContext and credentials when getting the service:

Code Block
languagejava
titleJava Client with security
public static void main(String[] args) {
  String regionName = args[0];
  SSLContext sslContext = SSLContext.getDefault();
  HostnameVerifier hostnameVerifier = new NoopHostnameVerifier();
  ClusterManagementService cms = ClusterManagementServiceProvider.getService("localhost", 7070, sslContext, hostnameVerifier, "username", "password");
  .....
}

Note: In the context of a Geode client, an instance of the ClusterManagementService can be retrieved by calling ClusterManagementServiceProvider.getService() without providing any parameters. This will attempt to use any existing security or SSL configuration to determine the CMS REST endpoint. For this to work automatically when a SecurityManager is enabled, the Geode properties security-username and security-password must be set.

You can use this Java client when authoring server-side code as well. Here is how one can use CMS on a server:

Code Block
languagejava
titleServer Side
public class MyFunction implements Function<String> {
  @Override
  public void execute(FunctionContext context) {
    // 1. Get the service instance. No URL, port, or SSL information is needed, since the
    //    server deduces all of it automatically. A username/password is still required
    //    if the cluster is secured.
    ClusterManagementService cms = ClusterManagementServiceProvider.getService();
    
    // 2. Create the config object. These are JAXB-generated POJOs.
    BasicRegionConfig regionConfig = new BasicRegionConfig();
    regionConfig.setName("ACCOUNTS");
    regionConfig.setType(RegionType.REPLICATE);
    
    //3. Invoke create, update, delete or get depending on what you want to do.
    ClusterManagementResult result = cms.create(regionConfig); 
  }
}

ClusterManagementService Interface

The primary ClusterManagementService interface is as follows:

Code Block
languagejava
titleClusterManagementService
public interface ClusterManagementService {
  ClusterManagementResult create(CacheElement config);
  ClusterManagementResult delete(CacheElement config);
  ClusterManagementResult update(CacheElement config);
  ClusterManagementResult list(CacheElement config);
}

The methods on this interface all interact with simple Java classes that map directly to elements in the Geode cache.xml file. For example, region actions will use the RegionConfig class. When creating Geode components, these classes are used as input to the API. When querying Geode, these classes (or subclasses) are returned as the response to those queries. See ClusterManagementResult below. 

When creating or updating a Geode component, the method arguments are the state to be created or applied respectively. When deleting or list(ing) a component, the argument will act as a filter for the respective method. For example, to retrieve the confi

ClusterManagementResult

ClusterManagementResult is the result object you get when you invoke a method on the Cluster Management Service. Here is an instance of this object in JSON format:

Code Block
languagetext
titlejson
{
  "memberStatuses" : {
    "server-1" : {
      "success" : true,
      "message" : "success"
    }
  },
  "statusCode" : "OK",
  "statusMessage" : "successfully persisted config for cluster",
  "successful" : true
}
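
The shape of this JSON can be mirrored in a few lines of Java. The sketch below is not the real ClusterManagementResult class: `ResultSketch` and `MemberStatus` are hypothetical names, and the fields are left public for brevity. It shows one way the "successful" flag can be derived from "statusCode" rather than stored separately.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical mirror of the JSON result above, for illustration only.
final class ResultSketch {

  // Per-member outcome, matching one entry under "memberStatuses".
  static final class MemberStatus {
    final boolean success;
    final String message;

    MemberStatus(boolean success, String message) {
      this.success = success;
      this.message = message;
    }
  }

  final Map<String, MemberStatus> memberStatuses = new LinkedHashMap<>();
  String statusCode = "OK";
  String statusMessage;

  // Derived rather than stored: true exactly when statusCode is "OK".
  boolean isSuccessful() {
    return "OK".equals(statusCode);
  }
}
```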

Here is an explanation of each of the fields in the result object:

"successful": a boolean value indicating the overall success/failure status of the service call. It will be true if and only if the "statusCode" value is "OK".

"statusCode": an enum field indicating the result status. Here is a list of possible values in this field:

...

"statusMessage": a detailed message about the result of the operation.

"memberStatuses": information about the operation status on each server.

Behind the scenes

...


Pros and Cons:

Pros:

  1. A common interface to call either on the locator/server/client side
  2. A common workflow to enforce behavior consistency
  3. Modularized implementation. The configuration object needs to implement the additional interfaces in order to be used in this API. This allows us to add functionality gradually and per function groups.

Cons:

  1. Existing gfsh commands need to be refactored to use this API as well, otherwise we would have duplicate implementations, or have different behaviors between this API and gfsh commands.
  2. When refactoring gfsh commands, some commands' behaviors will change if they want to strictly follow this workflow, unless we add additional APIs for specific configuration objects.

Migration Strategy:

Our current commands use numerous options to configure their behavior. We will have to follow these steps to refactor the commands.

  1. Combine all the command options into one configuration object inside the command itself.
  2. Have the command execution call the public API if the command conforms to the new workflow. In this step, the config object needs to implement ClusterConfigElement.
  3. If the command can't use the common workflow, make a special method in the API for that specific configuration object. (We need to evaluate carefully - we don't want to make too many exceptions to the common workflow.)

The above work can be divided into functional groups so that different groups can share the workload.

Once all the commands are converted using the ClusterManagementService API, each command class can be reduced to a facade that collects the options and their values, builds the config object and calls into the API. At this point, the command objects can exist only on the gfsh client.
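
The facade described above might look roughly like the following sketch. All names here (`CreateRegionCommandSketch`, `RegionConfigSketch`, `ManagementApiSketch`) are hypothetical stand-ins, not Geode classes. The point is only the shape: collect the options, build one config object, and delegate to the API.

```java
import java.util.Map;

// Hypothetical command facade: no business logic, only option collection and delegation.
final class CreateRegionCommandSketch {

  // Stand-in for a region configuration object.
  static final class RegionConfigSketch {
    String name;
    String type;
    String group;
  }

  // Stand-in for the management API the command delegates to.
  interface ManagementApiSketch {
    String create(RegionConfigSketch config);
  }

  private final ManagementApiSketch api;

  CreateRegionCommandSketch(ManagementApiSketch api) {
    this.api = api;
  }

  // Collects the option values into one config object and hands it to the API.
  String execute(Map<String, String> options) {
    RegionConfigSketch config = new RegionConfigSketch();
    config.name = options.get("name");
    config.type = options.get("type");
    config.group = options.get("group"); // null when the option is absent
    return api.create(config);
  }
}
```

Because the facade holds no cluster logic of its own, it could live entirely on the gfsh client, with the API implementation behind it doing the actual work.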

The end architecture would look like this:

[Gliffy diagram: migration]


Project Milestones

  1. API is clearly defined
  2. All commands are converted using this API
  3. Command classes exist only on a gfsh client. The GfshHttpInvoker uses the REST API to call this ClusterConfigurationService with the configuration objects directly.