...

  1. Secondary indexes on data archived on HDFS.
  2. Eviction logic will be based on LRU.
  3. Eviction may remove keys from memory, so features that depend on in-memory keys may not work.
  4. Replicated HdfsRegions

Design

Geode will provide a new type of persistence store, HdfsStore. Regions using HdfsStore are referred to as HdfsRegions in this document. Update operations on an HdfsRegion will be buffered in an asynchronous queue, referred to as the HdfsBuffer. Periodically, the buffered data will be written to HDFS. For reliability, HdfsBuffers can be persisted on local disks. Region data will also be cached in memory for fast access. Since memory is finite, records will be evicted from memory when needed. However, evicted records are never destroyed on HDFS, so a full record of all events remains on HDFS. On a cache miss, a lookup is executed on HDFS and the data is loaded into memory.
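The flow described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Geode code: the class `HdfsRegionSketch`, its methods, and the use of plain maps as stand-ins for the HdfsBuffer and for HDFS are all hypothetical. It assumes non-null values, a buffer that keeps only the latest pending value per key, and a manual `flush()` in place of the periodic asynchronous write.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of an HdfsRegion: LRU memory cache, write buffer, HDFS stand-in. */
public class HdfsRegionSketch<K, V> {
    // Stand-in for HDFS: records written here are never removed by eviction.
    private final Map<K, V> hdfs = new HashMap<>();
    // Stand-in for the HdfsBuffer: updates not yet written to HDFS.
    private final Map<K, V> pending = new LinkedHashMap<>();
    // In-memory cache; access order plus removeEldestEntry gives LRU eviction.
    private final LinkedHashMap<K, V> memory;

    public HdfsRegionSketch(final int maxInMemory) {
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxInMemory;
            }
        };
    }

    /** Buffers the update and returns the old in-memory value, without reading HDFS. */
    public V put(K key, V value) {
        pending.put(key, value);
        return memory.put(key, value); // null if the key was evicted or never cached
    }

    /** Reads from memory; on a miss, falls back to the buffer, then to HDFS. */
    public V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            value = pending.containsKey(key) ? pending.get(key) : hdfs.get(key);
            if (value != null) {
                memory.put(key, value); // load into memory; may evict the LRU entry
            }
        }
        return value;
    }

    /** Simulates the periodic write of buffered data to HDFS. */
    public void flush() {
        hdfs.putAll(pending);
        pending.clear();
    }

    public boolean isInMemory(K key) {
        return memory.containsKey(key);
    }
}
```

In this model, a `put` on a key that has been evicted returns `null` rather than the value stored on HDFS, which mirrors the design choice of not reading HDFS on the write path.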

...

PlantUML
participant User
participant HdfsRegion
participant HdfsBuffer
participant HDFS
 
activate HdfsRegion
User->HdfsRegion: Put KV
activate HdfsBuffer
HdfsRegion->HdfsBuffer: Get old value

HdfsBuffer->x HDFS: Skip HDFS read
HdfsBuffer->HdfsRegion: Return Old V*
HdfsRegion->HdfsBuffer: Write New V
HdfsBuffer->HdfsRegion:
HdfsBuffer-->HDFS: Asynchronous write
deactivate HdfsBuffer
HdfsRegion->User: Return v*
HdfsRegion-->HdfsRegion: Asynchronous LRU Eviction
 

...