...

When Geode was first developed, storage was fairly straightforward: disk was the persistent store. Today, however, the storage environment is increasingly diverse. Distributed stores such as HDFS, HBase, and Cassandra are increasingly used to persist data for analysis and recovery, and new persistent memory technologies are adding another type of non-volatile storage to the mix. Having the core Geode team address every possible use case or store type does not scale, nor should the core team be a gating function on the user community for enabling a new persistence layer. Instead, we propose a properly defined pluggable interface and specification for persistent stores, so that the Geode persistence layer becomes easily extensible and users and the community can add and contribute use-case-specific stores. Whether the same applies to overflow stores is left as an open question for the community, because the requirements there are different.

...

Overflow has different potential requirements, so it is an open question whether it should be included in the pluggable interface. While one can see storage class memory (SCM) being used for overflow, this type of memory is likely to be exposed through a file system interface similar to any other disk. It is difficult to imagine other types of storage being used for overflow, given the performance and latency requirements. The main use case seems to be those using proprietary SCM interfaces to provide very fast, low-latency overflow. The open questions are: (1) Should overflow be included? (2) How much does including overflow increase the difficulty of implementing pluggable stores? (3) Are the requirements similar enough to combine persistence and overflow, or should overflow, if done, be a separate proposal?

Batch Operations

Support for batch operations should be provided, and the cache store interface should have a method to check whether the operations are supported natively (via the store's characteristics). A cache store may support batch writes by collecting operations and using a time- and count-based approach to batch them transparently to the underlying store for performance, and/or by supporting putAll(), getAll(), and deleteAll() operations.

Basic putAll(), getAll(), and deleteAll() functionality should be supported, and the store's characteristics should indicate whether these are supported natively by the underlying store.
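As a rough illustration of the two batching approaches described above, the following Java sketch shows a cache store interface with loop-based fallbacks for the bulk operations, plus a small write batcher that flushes on a count or age threshold. All names here (BatchCacheStore, InMemoryStore, WriteBatcher, supportsNativeBatch) are hypothetical and are not part of any existing Geode API. For simplicity, the age check runs only when a write arrives; a real implementation would presumably also flush from a timer.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical cache store interface; names are illustrative only.
interface BatchCacheStore<K, V> {
  void put(K key, V value);
  V get(K key);
  void delete(K key);

  // True when the underlying store handles bulk operations natively.
  boolean supportsNativeBatch();

  // Fallbacks: emulate bulk operations with single-entry calls.
  default void putAll(Map<K, V> entries) {
    entries.forEach(this::put);
  }

  default Map<K, V> getAll(Collection<K> keys) {
    Map<K, V> found = new LinkedHashMap<>();
    for (K key : keys) {
      V value = get(key);
      if (value != null) {
        found.put(key, value);
      }
    }
    return found;
  }

  default void deleteAll(Collection<K> keys) {
    keys.forEach(this::delete);
  }
}

// In-memory stand-in for a store with native bulk writes.
class InMemoryStore<K, V> implements BatchCacheStore<K, V> {
  private final Map<K, V> data = new HashMap<>();
  int putAllCalls = 0; // counts bulk writes, for demonstration

  @Override public void put(K key, V value) { data.put(key, value); }
  @Override public V get(K key) { return data.get(key); }
  @Override public void delete(K key) { data.remove(key); }
  @Override public boolean supportsNativeBatch() { return true; }

  @Override public void putAll(Map<K, V> entries) {
    putAllCalls++;
    data.putAll(entries);
  }
}

// Buffers writes and flushes them as one putAll() once either the
// entry-count or the age threshold is reached.
class WriteBatcher<K, V> {
  private final BatchCacheStore<K, V> store;
  private final int maxEntries;
  private final long maxAgeMillis;
  private final Map<K, V> pending = new LinkedHashMap<>();
  private long oldestPendingMillis;

  WriteBatcher(BatchCacheStore<K, V> store, int maxEntries, long maxAgeMillis) {
    this.store = store;
    this.maxEntries = maxEntries;
    this.maxAgeMillis = maxAgeMillis;
  }

  // The caller supplies the current time so the sketch stays deterministic.
  synchronized void write(K key, V value, long nowMillis) {
    if (pending.isEmpty()) {
      oldestPendingMillis = nowMillis;
    }
    pending.put(key, value);
    boolean full = pending.size() >= maxEntries;
    boolean stale = nowMillis - oldestPendingMillis >= maxAgeMillis;
    if (full || stale) {
      flush();
    }
  }

  synchronized void flush() {
    if (pending.isEmpty()) {
      return;
    }
    store.putAll(new LinkedHashMap<>(pending));
    pending.clear();
  }
}
```

With a batcher configured for two entries, the first write stays buffered and the second triggers a single putAll() against the store. A store that reports supportsNativeBatch() as false would still work through the default methods, just without the performance benefit.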

Store Characteristics

When multiple store types are supported, it is important that applications be able to walk the list of available options and select the store that most closely matches their needs. Cache stores should support:

Proposed required characteristics:

  • redundancy type (none, local-redundant, distributed-redundant)
    • effective redundancy (if type is not none)
  • available capacity
  • batch operations supported natively (boolean)

Proposed optional characteristics:

  • redundancy detail type (none, multi-disk, multi-machine, multi-rack, multi-datacenter)

Proposed measured characteristics:

  • average latency
  • average throughput
  • expected reliability

Measured characteristics should be maintained by the storage manager.
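To make the idea of walking the available stores more concrete, here is a minimal Java sketch of a characteristics descriptor and a selection helper. The class and field names (StoreCharacteristics, StoreSelector, and so on) are assumptions made for illustration, as is the choice to treat the redundancy types as ordered from weakest to strongest and to break ties on measured average latency. Only a subset of the proposed characteristics is modeled.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Redundancy types from the proposal, declared weakest to strongest
// (the ordering is an assumption of this sketch).
enum RedundancyType { NONE, LOCAL_REDUNDANT, DISTRIBUTED_REDUNDANT }

// Hypothetical descriptor covering some required and measured characteristics.
final class StoreCharacteristics {
  final String name;
  final RedundancyType redundancyType;
  final int effectiveRedundancy;       // meaningful only if type is not NONE
  final long availableCapacityBytes;
  final boolean nativeBatchOperations;
  final double averageLatencyMillis;   // measured by the storage manager

  StoreCharacteristics(String name, RedundancyType redundancyType,
      int effectiveRedundancy, long availableCapacityBytes,
      boolean nativeBatchOperations, double averageLatencyMillis) {
    this.name = name;
    this.redundancyType = redundancyType;
    this.effectiveRedundancy = effectiveRedundancy;
    this.availableCapacityBytes = availableCapacityBytes;
    this.nativeBatchOperations = nativeBatchOperations;
    this.averageLatencyMillis = averageLatencyMillis;
  }
}

final class StoreSelector {
  // Returns the lowest-latency store that meets the minimum redundancy
  // type and has enough available capacity, if any such store exists.
  static Optional<StoreCharacteristics> select(List<StoreCharacteristics> stores,
      RedundancyType minimumRedundancy, long requiredBytes) {
    return stores.stream()
        .filter(s -> s.redundancyType.compareTo(minimumRedundancy) >= 0)
        .filter(s -> s.availableCapacityBytes >= requiredBytes)
        .min(Comparator.comparingDouble(
            (StoreCharacteristics s) -> s.averageLatencyMillis));
  }
}
```

An application with different priorities could substitute its own comparator, for example preferring native batch support or higher effective redundancy over latency.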

Scan and Query support

It is possible to eventually support a scan of the data in a persistent store for all items matching a given filter. Query support would extend this to OQL.
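A filtered scan could look roughly like the following Java sketch, which walks every entry in a store and returns the keys whose entries satisfy a caller-supplied predicate. The ScannableStore class and its scanKeys method are hypothetical, and the in-memory TreeMap merely stands in for a real persistent store, which would presumably push the filter down where the backend allows it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiPredicate;

// Hypothetical store that supports a full scan with a filter.
class ScannableStore<K extends Comparable<K>, V> {
  // Sorted map used as a stand-in for persistent storage.
  private final Map<K, V> data = new TreeMap<>();

  void put(K key, V value) {
    data.put(key, value);
  }

  // Visits every entry and collects the keys that match the filter,
  // in key order.
  List<K> scanKeys(BiPredicate<K, V> filter) {
    List<K> matches = new ArrayList<>();
    for (Map.Entry<K, V> entry : data.entrySet()) {
      if (filter.test(entry.getKey(), entry.getValue())) {
        matches.add(entry.getKey());
      }
    }
    return matches;
  }
}
```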

Expiration and Eviction

It has been proposed in GEODE-1209 that expiration and eviction operations be propagated to underlying cache stores. In the case of an eviction or expiration that is intended only to conserve memory by removing the least used keys, we do not see a reason to propagate these events to the underlying cache store. Where data has a time to live (TTL) that applies globally (both in memory and in persistent storage) and that TTL expires, it is desirable to support a method for propagating the expiration event to the cache store.
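The distinction drawn above can be sketched as a small dispatch rule: only a global TTL expiration reaches the cache store, while memory-conserving evictions and expirations stay local. The RemovalCause enum, the ExpirationAwareStore interface, and the RemovalPropagator class below are hypothetical names chosen for this example and do not reflect the design discussed in GEODE-1209.

```java
// Why an entry left the in-memory cache (hypothetical classification).
enum RemovalCause { EVICTION, LOCAL_EXPIRATION, GLOBAL_TTL_EXPIRATION }

// Hypothetical callback a cache store would implement to receive
// propagated expirations.
interface ExpirationAwareStore<K> {
  void expire(K key);
}

class RemovalPropagator<K> {
  private final ExpirationAwareStore<K> store;

  RemovalPropagator(ExpirationAwareStore<K> store) {
    this.store = store;
  }

  // Forwards the event to the store only for a global TTL expiration.
  // Returns true if the event was propagated.
  boolean onRemoved(K key, RemovalCause cause) {
    if (cause == RemovalCause.GLOBAL_TTL_EXPIRATION) {
      store.expire(key);
      return true;
    }
    // Eviction and local expiration only free memory; the persisted
    // copy is left untouched.
    return false;
  }
}
```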

Technical Proposal

Technical proposal is added at 

...