...
- Secondary indexes on data archived on HDFS.
- Eviction logic will be based on LRU.
- Eviction may cause keys to be evicted from memory. Features that depend on in-memory keys may not work.
- Replicated HdfsRegions
Design
Geode will provide a new type of persistence store, HdfsStore. Regions using an HdfsStore will be referred to as HdfsRegions in this document. Data update operations on HdfsRegions will be buffered in an asynchronous queue, referred to as the HdfsBuffer. Periodically, the buffered data will be written to HDFS. For reliability, HdfsBuffers can be persisted on local disk. Region data will also be cached in memory for fast access. Since memory is finite, records will be evicted from memory when needed. However, these records are never deleted from HDFS, so a full record of all events remains on HDFS. On a cache miss, a lookup on HDFS is executed and the data is loaded into memory.
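The data flow described above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions, not Geode code: the class `HdfsRegionSketch` and its methods are hypothetical, plain in-memory maps stand in for the HdfsBuffer and for HDFS, and the flush is triggered by hand rather than by a periodic background task. It also assumes that a put looks for the old value only in memory and in the buffer, without reading HDFS.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified model of an HdfsRegion. Not part of the Geode API. */
final class HdfsRegionSketch {
    // In-memory cache with LRU eviction (access-ordered LinkedHashMap).
    private final Map<String, String> memory;
    // Stand-in for the HdfsBuffer: updates not yet written to HDFS.
    private final Map<String, String> buffer = new LinkedHashMap<>();
    // Stand-in for HDFS: entries are never removed from this map.
    private final Map<String, String> hdfs = new HashMap<>();
    private int hdfsReads = 0;

    HdfsRegionSketch(final int maxInMemory) {
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Evict the least recently used entry once the limit is exceeded.
                return size() > maxInMemory;
            }
        };
    }

    /** Buffers the update and caches it; the old value is looked up without an HDFS read. */
    String put(String key, String value) {
        String old = memory.containsKey(key) ? memory.get(key) : buffer.get(key);
        buffer.put(key, value);
        memory.put(key, value);
        return old;
    }

    /** Reads from memory first, then the buffer, and falls back to HDFS on a miss. */
    String get(String key) {
        String value = memory.get(key);
        if (value != null) {
            return value;
        }
        value = buffer.get(key);
        if (value == null) {
            hdfsReads++;
            value = hdfs.get(key);
        }
        if (value != null) {
            memory.put(key, value); // load the record back into memory
        }
        return value;
    }

    /** Writes all buffered updates to the HDFS stand-in (done by hand in this sketch). */
    void flush() {
        hdfs.putAll(buffer);
        buffer.clear();
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }

    int hdfsReads() {
        return hdfsReads;
    }
}
```

With a memory limit of two entries, putting three keys evicts the first one from memory. After a flush, a get for that key misses in memory and in the buffer, reads the value from the HDFS stand-in, and loads it back into memory.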
...
PlantUML |
---|
participant User |
participant HdfsRegion |
participant HdfsBuffer |
participant HDFS |
activate HdfsRegion |
User->HdfsRegion: Put KV |
activate HdfsBuffer |
HdfsRegion->HdfsBuffer: Get old value |
HdfsBuffer-->X HDFS: Skip HDFS read |
HdfsBuffer->HdfsRegion: Return Old V* |
HdfsRegion->HdfsBuffer: Write New V |
HdfsBuffer->HdfsRegion: |
HdfsBuffer-->HDFS: Asynchronous write |
deactivate HdfsBuffer |
HdfsRegion->User: Return v* |
HdfsRegion-->HdfsRegion: Asynchronous LRU Eviction |
...