To be Reviewed By: August 20th, 2020

Authors: Patrick Johnson

Status: Draft Draft | Discussion | Active | Dropped | Superseded

...

Related: ClassLoader Isolation

Problem

Geode is composed of multiple sub-projects, many of which depend on other sub-projects at runtime to work. For example, geode-core may depend on geode-management and geode-membership, which may, in turn, depend on other sub-projects themselves. Normally, this is not a problem because all of Geode's sub-projects are present on the Java classpath at runtime, allowing sub-projects to access classes from other sub-projects as required. However, the changes proposed by the ClassLoader Isolation RFC will result in all sub-projects being loaded as separate modules with different ClassLoaders and no longer being on the system classpath, as they were before.

Once these modularizing changes go into effect, modules will no longer be able to access classes from other modules unless they explicitly create a dependency on them. In order to create dependencies between related modules, we need a way to determine at runtime which modules depend on which other modules, and we currently have no such mechanism. In addition to creating dependencies between modules, we may also need to load modules that are not already loaded when loading a module that depends on them. For example, imagine the scenario where module A depends on modules B and C, which have dependencies of their own, as shown in the graphic below.

[Diagram: dependency-hierarchy]

Normally, all of Geode's sub-projects, as well as any libraries they use, are put on the system classpath where they can be easily accessed at runtime. Below is the runtime classpath from the geode-dependencies jar's manifest file:

Class-Path: geode-common-1.14.0-build.0.jar geode-common-services-1.14.0-build.0.jar geode-connectors-1.14.0-build.0.jar geode-core-1.14.0-build.0.jar geode-cq-1.14.0-build.0.jar
geode-gfsh-1.14.0-build.0.jar geode-log4j-1.14.0-build.0.jar geode-logging-1.14.0-build.0.jar geode-lucene-1.14.0-build.0.jar geode-memcached-1.14.0-build.0.jar geode-modules-1.14.0-build.0.jar
geode-old-client-support-1.14.0-build.0.jar geode-protobuf-1.14.0-build.0.jar geode-protobuf-messages-1.14.0-build.0.jar geode-rebalancer-1.14.0-build.0.jar geode-redis-1.14.0-build.0.jar
geode-serialization-1.14.0-build.0.jar geode-tcp-server-1.14.0-build.0.jar geode-wan-1.14.0-build.0.jar geode-module-management-1.14.0-build.0.jar geode-module-bootstrapping-1.14.0-build.0.jar
geode-management-1.14.0-build.0.jar jackson-databind-2.10.1.jar commons-lang3-3.10.jar jackson-annotations-2.10.1.jar jackson-core-2.10.1.jar log4j-api-2.13.1.jar
geode-membership-1.14.0-build.0.jar geode-http-service-1.14.0-build.0.jar geode-unsafe-1.14.0-build.0.jar httpclient-4.5.12.jar httpcore-4.4.13.jar HikariCP-3.4.2.jar jaxb-api-2.3.1.jar
log4j-jcl-2.13.1.jar spring-shell-1.2.0.RELEASE.jar rmiio-2.1.2.jar antlr-2.7.7.jar javax.activation-1.2.0.jar istack-commons-runtime-3.0.11.jar jaxb-impl-2.3.2.jar commons-validator-1.6.jar
shiro-core-1.5.3.jar shiro-config-ogdl-1.5.3.jar commons-beanutils-1.9.4.jar commons-codec-1.14.jar commons-collections-3.2.2.jar commons-io-2.6.jar commons-logging-1.2.jar
classgraph-4.8.52.jar micrometer-core-1.4.1.jar swagger-annotations-1.5.23.jar fastutil-8.3.1.jar javax.resource-api-1.7.1.jar jetty-webapp-9.4.21.v20190926.jar
jetty-servlet-9.4.21.v20190926.jar jetty-security-9.4.21.v20190926.jar jetty-server-9.4.21.v20190926.jar javax.servlet-api-3.1.0.jar jna-platform-5.5.0.jar jna-5.5.0.jar jopt-simple-5.0.4.jar
snappy-0.4.jar jgroups-3.6.14.Final.jar shiro-cache-1.5.3.jar shiro-crypto-hash-1.5.3.jar shiro-crypto-cipher-1.5.3.jar shiro-config-core-1.5.3.jar shiro-event-1.5.3.jar
shiro-crypto-core-1.5.3.jar shiro-lang-1.5.3.jar slf4j-api-1.7.30.jar spring-core-5.2.5.RELEASE.jar javax.activation-api-1.2.0.jar jline-2.12.jar HdrHistogram-2.1.12.jar LatencyUtils-2.0.3.jar
javax.transaction-api-1.3.jar spring-jcl-5.2.5.RELEASE.jar jetty-http-9.4.21.v20190926.jar jetty-io-9.4.21.v20190926.jar jetty-xml-9.4.21.v20190926.jar jetty-util-9.4.21.v20190926.jar
log4j-slf4j-impl-2.13.1.jar log4j-core-2.13.1.jar log4j-jul-2.13.1.jar lucene-analyzers-phonetic-6.6.6.jar lucene-analyzers-common-6.6.6.jar lucene-queryparser-6.6.6.jar lucene-core-6.6.6.jar
lucene-queries-6.6.6.jar jboss-modules-1.10.1.Final.jar vavr-0.10.3.jar vavr-match-0.10.3.jar protobuf-java-3.11.4.jar geo-0.7.7.jar netty-all-4.1.48.Final.jar

The ClassLoader Isolation RFC proposes that each sub-project of Geode be loaded as a separate module using JBoss Modules. This means that the classpath shown above will become much shorter as the majority of sub-projects, along with their library dependencies, are removed from it and put into modules with separate ClassLoaders. Note that this is only the case for servers and only if you opt in; the current way of working will remain the default behavior and clients will be unaffected. The new classpath when opting for the new modular behavior will look something like this:

Class-Path: geode-common-1.14.0-build.0.jar geode-common-services-1.14.0-build.0.jar geode-modules-1.14.0-build.0.jar geode-module-management-1.14.0-build.0.jar geode-module-bootstrapping-1.14.0-build.0.jar

Everything from the first classpath that is missing from the second one will now be encapsulated inside the new modules instead. Without everything being on the classpath, modules will be unable to access other modules that they require to function correctly. To solve this, JBoss Modules allows modules to link themselves to other modules that they depend on at runtime, allowing them to access the classes within, but that requires knowing which modules depend on which other modules. We have that information at build-time because sub-projects declare their dependencies in their build.gradle files. Unfortunately, the Gradle file is not available to us at runtime when we need to load and link modules. We need a way to determine at runtime what other sub-projects each sub-project requires access to, so it can be linked correctly.

Loading module A would require all of the modules in its dependency hierarchy (shown above) to be loaded. If the modules required by module A are not already loaded, they would have to be loaded for module A to function correctly. Without knowledge of this hierarchy at runtime, we would be unable to ensure that all the modules required by module A are loaded when it is.


Anti-Goals

This proposal is intended only to solve the above problem of determining, at runtime, which sub-projects/modules depend on which other sub-projects/modules. It is not concerned with:

  • The implementation of modules in Geode
  • Refactoring Geode's sub-projects to break or change any existing dependencies between them
  • Adding or removing sub-projects

Solution

...

Since we have the necessary dependency information at build-time, we can write a Gradle task to process the runtime classpath of each sub-project and write it to a file that is included in the built jar file, where it can be read at runtime. The file could take the form {sub-project name}-dependencies.txt, where each file contains a list of sub-projects that are required by the associated sub-project, as per its build.gradle file. Then, at runtime when loading a module, the file could be read and used to determine which modules need to be loaded and linked for the module to function correctly.


While the above approach would work, it is not necessary to add a new file to the jar just for this information; we can instead add a new attribute to the existing manifest file, which is already built into each sub-project's jar file. This attribute can be called "Dependent-Modules" and contain a list of sub-projects required by the module, as defined by the module's build.gradle file. For example, these dependencies in a sub-project's build.gradle file:

compile(platform(project(':boms:geode-all-bom')))

api(project(':geode-core'))
api('org.apache.lucene:lucene-core')

compile(project(':geode-gfsh'))
implementation(project(':geode-logging'))
implementation(project(':geode-membership'))
implementation(project(':geode-serialization'))

implementation('org.apache.commons:commons-lang3')
implementation('org.apache.logging.log4j:log4j-api')

compileOnly(platform(project(':boms:geode-all-bom')))
compileOnly(project(':geode-common-services'))
compileOnly('com.fasterxml.jackson.core:jackson-annotations')

runtimeOnly('org.apache.lucene:lucene-analyzers-phonetic')

testImplementation(project(':geode-junit'))
testImplementation('org.apache.lucene:lucene-test-framework')

Would result in a "Dependent-Modules" attribute like this: 

Dependent-Modules: geode-gfsh-1.14.0-build.0 geode-core-1.14.0-build.0 geode-membership-1.14.0-build.0 geode-tcp-server-1.14.0-build.0 geode-http-service-1.14.0-build.0
geode-logging-1.14.0-build.0 geode-serialization-1.14.0-build.0 geode-common-1.14.0-build.0 geode-management-1.14.0-build.0 geode-unsafe-1.14.0-build.0
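As a rough illustration of the runtime side, an attribute like the one above could be read with the standard java.util.jar API. This is only a sketch: the class and method names (DependentModulesReader, dependentModules) are hypothetical, and it assumes the attribute lives in the manifest's main section with names separated by whitespace, as in the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Geode.
public class DependentModulesReader {

  // Attribute name proposed in this RFC.
  static final String DEPENDENT_MODULES = "Dependent-Modules";

  /** Returns the module names listed in the manifest, or an empty list if there are none. */
  public static List<String> dependentModules(Manifest manifest) {
    Attributes mainAttributes = manifest.getMainAttributes();
    String value = mainAttributes.getValue(DEPENDENT_MODULES);
    if (value == null || value.trim().isEmpty()) {
      return Collections.emptyList();
    }
    // Assumption: names are separated by one or more whitespace characters.
    List<String> modules = new ArrayList<>();
    for (String name : value.trim().split("\\s+")) {
      modules.add(name);
    }
    return modules;
  }

  public static void main(String[] args) {
    // Build a manifest in memory so the sketch runs without a jar file.
    Manifest manifest = new Manifest();
    manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
    manifest.getMainAttributes().putValue(DEPENDENT_MODULES,
        "geode-gfsh-1.14.0-build.0 geode-core-1.14.0-build.0");
    System.out.println(dependentModules(manifest));
  }
}
```

In a real server the Manifest would more likely be obtained from the module's jar, for example via JarFile.getManifest(), rather than built in memory.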

Notice that only project dependencies, and only those scoped to be required at runtime (runtime, runtimeOnly, compile, implementation, and api), are included in "Dependent-Modules". This is because the list is generated from the runtime classpath. Other library dependencies are excluded because they will not be loaded as separate modules (though that may change in future iterations), and dependencies scoped as compileOnly or test are excluded because they are not required in production. You may also notice that the "Dependent-Modules" list contains modules that are not explicitly declared in the build.gradle file, specifically geode-tcp-server, geode-http-service, geode-common, geode-management, and geode-unsafe. These modules are included in the list because they are required by other modules that are explicitly declared in Gradle and are therefore part of the module's runtime classpath. Generating the "Dependent-Modules" list like this results in the following relationships between modules:

[Image: module dependency graph]

While this diagram accurately represents the requirements of each module, it resembles a tangled ball of yarn, making it hard to determine any kind of meaningful hierarchy, or even which sub-projects are involved. A structure like this is difficult to read and unnecessarily complicated for creating dependencies between modules. Since a module can access other modules indirectly via its dependencies (e.g. if A depends on B and B depends on C, then A can access C via B), there only needs to be a single path to a module for it to be accessible, regardless of how many hops it takes. The above dependency graph can be simplified at build-time by removing, from each sub-project's dependency list, any dependency that is reachable via another dependency, using the following algorithm:

For each sub-project:
	Get the runtime classpath from the project
	Add all classpath dependencies that begin with "geode-" to a list
	For each dependency in the list:
		Get its runtime classpath
		Remove all elements in the classpath from the parent's list of dependencies
	Write the remaining list to the "Dependent-Modules" attribute in the manifest
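The steps above can be sketched in plain Java to show the effect of the reduction. This is an illustration under stated assumptions rather than the actual Gradle task: the class name DependencyReducer and the sample names geode-a through geode-d are hypothetical, each runtime classpath is modelled as a list of names without version suffixes, every runtime classpath is assumed to be transitive, and the dependency graph is assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the reduction algorithm; not the real build logic.
public class DependencyReducer {

  /**
   * Takes a map from sub-project name to its full (transitive) runtime classpath and
   * returns, for each sub-project, the reduced list to write to "Dependent-Modules".
   */
  public static Map<String, List<String>> reduce(Map<String, List<String>> runtimeClasspaths) {
    Map<String, List<String>> reduced = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> entry : runtimeClasspaths.entrySet()) {
      // Keep only Geode sub-projects; other libraries are not loaded as modules.
      List<String> candidates = new ArrayList<>();
      for (String element : entry.getValue()) {
        if (element.startsWith("geode-")) {
          candidates.add(element);
        }
      }
      // Iterate over the original candidates while removing from a separate copy.
      List<String> remaining = new ArrayList<>(candidates);
      for (String dependency : candidates) {
        List<String> reachable =
            runtimeClasspaths.getOrDefault(dependency, Collections.<String>emptyList());
        // Anything on the dependency's own classpath is reachable through it.
        remaining.removeAll(reachable);
      }
      reduced.put(entry.getKey(), remaining);
    }
    return reduced;
  }

  public static void main(String[] args) {
    // geode-a needs b, c and d directly; geode-b needs c.
    Map<String, List<String>> classpaths = new LinkedHashMap<>();
    classpaths.put("geode-a", Arrays.asList("geode-b", "geode-c", "geode-d", "commons-lang3"));
    classpaths.put("geode-b", Arrays.asList("geode-c"));
    classpaths.put("geode-c", Collections.<String>emptyList());
    classpaths.put("geode-d", Collections.<String>emptyList());
    System.out.println(reduce(classpaths));
  }
}
```

With this input, geode-c is dropped from geode-a's list because it is already reachable through geode-b, while geode-d stays because nothing else leads to it.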


By doing this, the above graph can be transformed to look more like the following:

[Image: simplified module dependency graph]




This simplified dependency structure has significantly fewer paths, making it easier to read and reducing the number of links required when loading a module. Despite the difference in appearance between the above graph and the original, both allow the same access between modules.  Using this method of simplifying dependencies, the sample "Dependent-Modules" attribute shown earlier would be reduced to this: 

Dependent-Modules: geode-gfsh-1.14.0-build.0

The reduced version of this attribute may be minimal, but it provides all the information necessary to load and link the module.
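To show why the reduced lists are sufficient, the sketch below walks them transitively and produces an order in which every module appears after the modules it depends on. It is a hypothetical illustration: the class ModuleLoadOrder, the in-memory map standing in for the per-jar "Dependent-Modules" values, and the module names A through D are assumptions, and the actual loading and linking through JBoss Modules is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; the real loader would read each list from a jar manifest.
public class ModuleLoadOrder {

  /** Returns the given module and everything it needs, dependencies first. */
  public static List<String> loadOrder(String module, Map<String, List<String>> dependentModules) {
    Set<String> ordered = new LinkedHashSet<>();
    visit(module, dependentModules, ordered, new HashSet<String>());
    return new ArrayList<>(ordered);
  }

  // Depth-first, post-order traversal: a module is added only after its dependencies.
  private static void visit(String module, Map<String, List<String>> dependentModules,
      Set<String> ordered, Set<String> inProgress) {
    if (ordered.contains(module)) {
      return; // already scheduled via another path
    }
    if (!inProgress.add(module)) {
      throw new IllegalStateException("Dependency cycle detected at " + module);
    }
    List<String> direct =
        dependentModules.getOrDefault(module, Collections.<String>emptyList());
    for (String dependency : direct) {
      visit(dependency, dependentModules, ordered, inProgress);
    }
    inProgress.remove(module);
    ordered.add(module);
  }

  public static void main(String[] args) {
    // Reduced lists: A lists B and D, B lists C, and C and D list nothing.
    Map<String, List<String>> dependentModules = new HashMap<>();
    dependentModules.put("A", Arrays.asList("B", "D"));
    dependentModules.put("B", Arrays.asList("C"));
    System.out.println(loadOrder("A", dependentModules));
  }
}
```

Even though module A's own list does not mention C, the traversal still reaches it through B, which is the property the reduction relies on.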

Use Cases

Scenario 1: Sub-project A has no dependencies.

Expected Behavior: The "Dependent-Modules" attribute in module A's manifest file is empty.


Scenario 2: Sub-project A depends on sub-project B in its build.gradle file. Sub-project B has no dependencies.

Expected Behavior: The "Dependent-Modules" attribute in module A's manifest file contains sub-project B. Sub-project B's "Dependent-Modules" attribute is empty.


Scenario 3: Sub-project A depends on sub-project B and sub-project B depends on sub-project C. Sub-project C has no dependencies.

Expected Behavior: The "Dependent-Modules" attribute in module A's manifest file contains sub-project B and sub-project B's "Dependent-Modules" attribute contains sub-project C. Sub-project C's "Dependent-Modules" attribute is empty.


Scenario 4: Sub-project A depends on sub-projects B and C. Sub-project B depends on sub-project C.  Sub-project C has no dependencies.

Expected Behavior: The "Dependent-Modules" attribute in module A's manifest file contains only sub-project B, because sub-project C is reachable via B. Sub-project B's "Dependent-Modules" attribute contains sub-project C. Sub-project C's "Dependent-Modules" attribute is empty.


Scenario 5: Sub-project A depends on sub-projects B, C, and D. Sub-project B depends on sub-project C. Sub-projects C and D have no dependencies.

Expected Behavior: The "Dependent-Modules" attribute in module A's manifest file contains sub-project B and sub-project D. Sub-project B's "Dependent-Modules" attribute contains sub-project C. Sub-project C's and sub-project D's "Dependent-Modules" attributes are empty.


Changes and Additions to Public Interfaces

No anticipated changes to public interfaces.

Performance Impact

No anticipated performance impact.

Backwards Compatibility and Upgrade Path

No backward compatibility impact.

Prior Art

...

There are two alternatives to this proposal.

  1. Load every sub-project as a module at startup and create dependencies between every module and every other module. While this may solve the problem, it would also largely defeat the purpose of modularizing Geode in the first place.
  2. Provide the dependency information in a separate file instead of the manifest. Other than the location, nothing else would differ from this proposal.

FAQ

Answers to questions you’ve commonly been asked after requesting comments for this proposal.

Errata

What are minor adjustments that had to be made to the proposal since it was approved?

...