Overview

The goal of this proposal is to define clear boundaries between the modules of Geode. Each module should:

  1. Be mockable, so that other modules can write tests that don't depend on that module.
  2. Be tested within the module. In other words, if WAN depends on the Region module, I should not need to run any WAN tests to test all of the code in LocalRegion.

To accomplish these goals, each module needs a clear API for the rest of the system to use, and tests within that module that cover all of the features of the API. Each module should have at least its own package(s), if not be in a separate Gradle module. Anything that is not part of the API for that module should not be accessible outside of the module.
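
As a rough illustration of the two goals above, the sketch below shows a module exposing a small interface and a dependent module being exercised against a hand-written fake. All names here (`LockModule`, `FakeLockModule`, `RegionCleaner`) are hypothetical and are not existing Geode classes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical API of a lock module: the only type other modules may reference.
interface LockModule {
  boolean tryLock(String name);

  void unlock(String name);
}

// A fake that another module's tests can use instead of the real implementation.
class FakeLockModule implements LockModule {
  private final Set<String> held = new HashSet<>();

  @Override
  public boolean tryLock(String name) {
    return held.add(name); // false if the lock is already held
  }

  @Override
  public void unlock(String name) {
    held.remove(name);
  }

  boolean isHeld(String name) {
    return held.contains(name);
  }
}

// Hypothetical client living in a different module; it depends only on the interface.
class RegionCleaner {
  private final LockModule locks;

  RegionCleaner(LockModule locks) {
    this.locks = locks;
  }

  // Runs the task under the named lock; returns false if the lock was unavailable.
  boolean cleanUnderLock(String regionName, Runnable task) {
    if (!locks.tryLock(regionName)) {
      return false;
    }
    try {
      task.run();
      return true;
    } finally {
      locks.unlock(regionName);
    }
  }
}
```

With this shape, the tests for `RegionCleaner` never start the real lock service, and the real lock implementation is tested only inside its own module.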

Here's a start at listing the proposed modules and their dependencies. These are proposed dependencies; there are currently many cyclic dependencies. Part of the required changes will be breaking some of these cycles by replacing hard-coded references to other modules with plugins and callbacks that are part of the well-defined API for each module.

  [Figure: Geode Package Dependency Graph. Diagram labels: Extensions, BasePackages, Core, Redis, MemcacheD, SpringDataGemfire, HttpSession, Hibernate, Spark, Lucene, Statistics, Logging, Serialization, Events, DUnit, Rest, Management, Gfsh.]

  [Figure: Geode Core Package Dependency Graph. Diagram labels: WAN, CacheServer, Messaging, Client, Locator, ClientSubscriptions, Regions, DLock, FunctionService, PDX, Persistence, Versioning, Querying, Indexing, OffHeap, ResourceManager, Eviction, Snapshots.]

Packages

 

Client

Package: internal.cache.client

API interfaces/classes: AbstractOp, ExecutablePool

Required Changes: Operations (the client-side code for client -> server messages) for other modules should be moved to their respective packages.

ClientSubscriptions

Package: internal.cache.ha (change this?)

API interface/classes: ?

Required changes: ?

DLock

Package: internal.locks

API interface/classes: InternalDistributedLockService (new interface)

Required changes: ?

 

Dunit

Package: dunit

API interface/classes: DistributedTestCase, CacheTestCase

Required changes: This code should be moved into its own Gradle module.

 


Events

Package: internal.cache.event (new package)

API interface/classes: InternalEntryEvent, RegionEntry

Required changes: ? These events are passed everywhere, which is why it would be nice to refactor this code into a separate package that other packages can depend on.

Eviction

Package: internal.cache.lru

API interface/classes: EnableLRU, LRUClockHand (new interface)

Required changes: ?

FunctionService

Package: internal.cache.execute

API interfaces/classes: FunctionService, Execution, Function

Required changes: Region code should be refactored to not have direct dependencies on this package. For example, LocalRegion should not have function execution code in it.

Indexing

Package: query.internal.index

API interface/classes: ?

Required changes: ?

Logging

Package: internal.logging

API interface/classes: ?

Required changes: ?

Locator

Package: distributed.internal.tcpserver

API interface/classes: TcpHandler, ?

Required changes: ?

Messaging

Package: distributed.internal, distributed.internal.advisor

API interfaces/classes: InternalDistributedMember, DM, InternalDistributedSystem, MembershipListener, DistributionAdvisor, DistributionAdvisee

Required Changes: InternalDistributedMember and InternalDistributedSystem should be interfaces, not concrete classes. They should only have the methods that are required by the rest of the system. The concrete classes like DistributionManager, the old InternalDistributedSystem class, etc. should not be referenced outside this package.

TBD - The proposal here is that membership is another module that is hidden behind the messaging layer as far as the rest of the system is concerned. The membership layer has its own interface that should hide its internals from the messaging layer. Advisors are lumped in here to reduce the complexity of the high-level graph.

  [Figure: Messaging Packages. Diagram labels: OtherComponents, Regions, WAN, Messaging, Advisors, Membership.]

OffHeap

Package: internal.offheap

API interface/classes: ?

Required changes: ?

PDX

Package: pdx.internal

API interface/classes: ?

Required changes: ?

Persistence

Package: internal.cache.persistence

API interfaces/classes: InternalDiskStore, DiskStore, DiskRegionView, DiskId

Required Changes: InternalDiskStore is a new interface for DiskStoreImpl.

Querying

Package: query.internal

API interface/classes: QueryService, ?

Required changes: ?

Regions

Package: internal.cache.region

API interfaces/classes: InternalRegionService, InternalRegion

Required Changes:

InternalRegion is a new interface for LocalRegion. InternalRegionService is a new interface that has the functionality from GemfireCacheImpl to manage regions. LocalRegion, etc. should not be used outside this package.

Direct references to other modules, for example LocalRegion.notifyGatewaySender, should be turned into callbacks that other modules plug into the Region interface. Those callbacks should be tested within the region module.
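
One way to picture the callback approach: the region module owns a listener interface, and other modules such as WAN register implementations instead of being called directly. This is only a minimal sketch; `RegionEventListener`, `SketchRegion`, `addListener`, and `RecordingGatewayListener` are hypothetical names and not the current Geode API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface owned by the region module.
interface RegionEventListener {
  void afterPut(String key, Object value);
}

// Hypothetical region; it knows nothing about WAN or any other plugged-in module.
class SketchRegion {
  private final Map<String, Object> entries = new ConcurrentHashMap<>();
  private final List<RegionEventListener> listeners = new CopyOnWriteArrayList<>();

  void addListener(RegionEventListener listener) {
    listeners.add(listener);
  }

  void put(String key, Object value) {
    entries.put(key, value);
    // Notify every registered listener instead of calling a specific module.
    for (RegionEventListener listener : listeners) {
      listener.afterPut(key, value);
    }
  }

  Object get(String key) {
    return entries.get(key);
  }
}

// Stand-in for what a WAN module might register; here it only records events.
class RecordingGatewayListener implements RegionEventListener {
  final List<String> queued = new ArrayList<>();

  @Override
  public void afterPut(String key, Object value) {
    queued.add(key + "=" + value);
  }
}
```

The region module's own tests can register a recording listener like the one above to verify that callbacks fire, without loading any WAN code.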

This is still the big ball of string that needs to get untangled further. We need to split out expiration, conflict detection, GII, transactions, partitioning, etc.

ResourceManager

Package: internal.cache.control

API interface/classes: InternalResourceManager, MemoryEvent, ResourceEvent, ResourceListener

Required changes: ? Move rebalancing related classes to the region package?

Serialization

Package: internal.serialization (new package)

API interfaces/classes: InternalDataSerializer (interface?), DataSerializableFixedID

Required Changes: ?

Server

Package: internal.cache.tier.sockets (change this?)

API interfaces/classes: CacheServer, CommandInitializer, BaseCommand

Required Changes: Commands (the server-side code for a client -> server message) for other modules (e.g. WAN) should be moved to their respective packages and registered with CommandInitializer.
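
The registration idea might look roughly like the following. This is a simplified sketch under assumptions: `ServerCommand`, `CommandRegistry`, and `WanBatchCommand` are hypothetical stand-ins and do not reproduce the actual signatures of BaseCommand or CommandInitializer, and the message type id is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified command contract (the real BaseCommand is richer).
interface ServerCommand {
  String execute(String payload);
}

// Hypothetical registry standing in for CommandInitializer.
class CommandRegistry {
  private final Map<Integer, ServerCommand> commands = new HashMap<>();

  // Each module registers its own commands; the server module never names them.
  void register(int messageType, ServerCommand command) {
    if (commands.putIfAbsent(messageType, command) != null) {
      throw new IllegalStateException("Message type already registered: " + messageType);
    }
  }

  String dispatch(int messageType, String payload) {
    ServerCommand command = commands.get(messageType);
    if (command == null) {
      throw new IllegalArgumentException("Unknown message type: " + messageType);
    }
    return command.execute(payload);
  }
}

// Would live in the WAN package rather than in the server package.
class WanBatchCommand implements ServerCommand {
  static final int MESSAGE_TYPE = 1001; // arbitrary id, for illustration only

  @Override
  public String execute(String payload) {
    return "wan-batch:" + payload;
  }
}
```

The server module then depends only on the registry and the command interface, and the WAN module registers `WanBatchCommand` when it is loaded.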

Snapshots

Package: internal.cache.snapshot

API interface/classes: SnapshotService

Required changes:

Statistics

Package: internal.statistics (new package)

API interfaces/classes: Statistics, StatisticsFactory, StatisticsManager

Required changes: Move into a separate package. Pull the code out of InternalDistributedSystem (it currently implements StatisticsFactory) into a separate class.
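
A possible shape for that extraction is composition instead of inheritance: the distributed system holds a factory rather than implementing one. The types below (`StatsFactory`, `StandaloneStatsFactory`, `SketchDistributedSystem`) are hypothetical and heavily simplified; they are not the real StatisticsFactory API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, heavily simplified stand-in for the statistics API.
interface StatsFactory {
  AtomicLong createCounter(String name);
}

// New standalone class that owns the statistics code.
class StandaloneStatsFactory implements StatsFactory {
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  @Override
  public AtomicLong createCounter(String name) {
    // Returns the existing counter if one with this name was already created.
    return counters.computeIfAbsent(name, n -> new AtomicLong());
  }
}

// Stand-in for InternalDistributedSystem: it holds a factory instead of being one.
class SketchDistributedSystem {
  private final StatsFactory statsFactory;

  SketchDistributedSystem(StatsFactory statsFactory) {
    this.statsFactory = statsFactory;
  }

  StatsFactory getStatsFactory() {
    return statsFactory;
  }
}
```

Code that only needs statistics can then take a `StatsFactory` directly and be tested without constructing a distributed system.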

Versioning

Package: internal.cache.versions

API interface/classes: RegionVersionVector (new interface), VersionTag, VersionStamp

Required changes: ?

WAN

Package: internal.cache.wan

API interfaces/classes: AsyncEventQueue, GatewaySender

Required Changes: Region code should be refactored to not have direct dependencies on this package. For example, AsyncEventQueues should be notified through a listener installed on the region. The listener interface will be part of the region package.

 

Questions/Issues

What to do with GemfireCacheImpl, Cache?

The Cache interface currently has dependencies on almost all of the modules of Geode because it has methods like getQueryService and getGatewaySenders (WAN). Unfortunately, Cache, InternalCache, or GemfireCacheImpl is also used as a context object that is passed to almost all modules of Geode.

We need to rework how we inject dependencies into all of these modules. If we want a context object, it should be something generic that does not pull in dependencies on all other services, something like Spring's BeanFactory. But it might be better if the specific dependencies for each module were passed into that module.

Modularity in the public API

We've already started creating a few separate modules at the external level - for example the lucene integration or the auto rebalancer. We need to nail down how these extensions are accessed by the user. One option might be to add a method to Cache like Cache.getService(Class<T> serviceInterface). Maybe we should remove methods like Cache.getQueryService, Cache.getGatewaySenders, etc. in favor of not hardcoding all of the services on Cache?
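
To make the `Cache.getService(Class<T> serviceInterface)` option more concrete, here is a minimal sketch of a type-keyed service lookup. The `ServiceRegistry` class, its `register` method, and the `TextSearchService` extension are assumptions made for illustration; none of them exist in Geode in this form.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical generic context: it has no compile-time dependency on any service.
class ServiceRegistry {
  private final Map<Class<?>, Object> services = new ConcurrentHashMap<>();

  // Extensions register themselves under their own interface.
  <T> void register(Class<T> serviceInterface, T implementation) {
    services.put(serviceInterface, implementation);
  }

  // The kind of lookup a Cache.getService(Class<T>) method could delegate to.
  <T> T getService(Class<T> serviceInterface) {
    Object service = services.get(serviceInterface);
    if (service == null) {
      throw new IllegalStateException(
          "No service registered for " + serviceInterface.getName());
    }
    return serviceInterface.cast(service);
  }
}

// Hypothetical extension interface, defined entirely outside the core.
interface TextSearchService {
  boolean matches(String text, String term);
}

class SimpleTextSearchService implements TextSearchService {
  @Override
  public boolean matches(String text, String term) {
    return text.contains(term);
  }
}
```

A caller would write something like `registry.getService(TextSearchService.class).matches(text, term)`, so adding a new extension does not require adding a new method to Cache.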

How to enforce dependencies/interface

Probably not every package listed here should be its own Gradle module. How will we enforce the dependencies and the use of the package interfaces?

Package naming scheme

We still have a mix of two different conventions for where to put internal classes. Some things are in gemfire.internal.cache, and some things are in packages like cache.asyncqueue.internal. We should settle on one convention.

Interface and concrete class naming scheme

We seem to have a few conventions that are sometimes used. We should agree on what conventions we want to stick to.

  • Having a public interface Cache and an internal interface InternalCache.
  • Naming the implementation of an interface *Impl

2 Comments

  1. +1 for this effort.

    For injection, something like https://github.com/google/guice makes sense.

    Are there any plans to make Region and its internal storage pluggable? Currently Geode stores everything in CustomEntryConcurrentHashMap.

    Maybe I want to store the data in a bucket sorted, which I cannot do right now. To change this map to, say, ConcurrentSkipListMap, I would have to go through a lot of code changes in the core.

     

  2. Hi Avinash Lakshman

    This proposal is about somewhat bigger pieces than that level. But having an extension point to override the underlying map used by the region sounds like a good idea.

    Using a DI container also sounds like a good idea for coupling all of these components together.