...

 

PlantUML
[LuceneIndex] --> [RegionDirectory]
() "User"
node "Colocated PR or Replicated Region" {
  () User --> [User Data Region] : Puts
  [User Data Region] --> [Async Queue]
  [Async Queue] --> [LuceneIndex] : Batch Writes
  [RegionDirectory] --> [Lucene Regions]
}
 


Inside LuceneIndex

PlantUML
node "LuceneIndex" {
  [Reflective fields]
  [AEQ listener]
  [RegionDirectory array (one per bucket)]
  [Query objects]
}
 

A closer look at Partitioned region data flow

PlantUML
() User -down-> [User Data Region] : PUTs
node cluster {
 [User Data Region] ..> [Bucket 1]
 [Bucket 1] -down-> [Async Queue Bucket 1]
node LuceneIndex {
 [Async Queue Bucket 1] -down-> [RegionDirectory1] : Batch Write
 [RegionDirectory1] -down-> [file region bucket 1]
 [file region bucket 1] -down-> [chunk region bucket 1]
}

 [User Data Region] ..> [Bucket 2]
 [Bucket 2] -down-> [Async Queue Bucket 2]
node LuceneIndex {
 [Async Queue Bucket 2] -down-> [RegionDirectory2] : Batch Write
 [RegionDirectory2] -down-> [file region bucket 2]
 [file region bucket 2] -down-> [chunk region bucket 2]
}
}


In a partitioned region, every bucket in the region will have its own GeodeFSDirectory to store the Lucene indexes. The GeodeFSDirectory implements a file system using two regions:
  • FileRegion: holds the metadata about the index files.
  • ChunkRegion: holds the actual data chunks for a given index file.
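
As a rough illustration of the two-region layout described above, the sketch below uses two in-memory maps as stand-ins for the FileRegion and ChunkRegion. The class `TwoRegionFileStore`, the `FileMeta` fields, the `name#index` chunk key format, and the tiny chunk size are all assumptions made for the example; they are not taken from the actual GeodeFSDirectory code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: two maps stand in for the FileRegion and ChunkRegion. */
final class TwoRegionFileStore {
  // Deliberately tiny so that short example files span several chunks.
  static final int CHUNK_SIZE = 4;

  /** Metadata entry, as it might be kept in the FileRegion (assumed fields). */
  static final class FileMeta {
    final int length;
    final int chunkCount;

    FileMeta(int length, int chunkCount) {
      this.length = length;
      this.chunkCount = chunkCount;
    }
  }

  private final Map<String, FileMeta> fileRegion = new HashMap<>();
  private final Map<String, byte[]> chunkRegion = new HashMap<>();

  // Assumed key format: file name plus chunk index.
  private static String chunkKey(String fileName, int chunkIndex) {
    return fileName + "#" + chunkIndex;
  }

  /** Splits the data into chunks, stores them, then records the metadata. */
  void write(String fileName, byte[] data) {
    int chunks = (data.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
    for (int i = 0; i < chunks; i++) {
      int from = i * CHUNK_SIZE;
      int to = Math.min(from + CHUNK_SIZE, data.length);
      chunkRegion.put(chunkKey(fileName, i), Arrays.copyOfRange(data, from, to));
    }
    fileRegion.put(fileName, new FileMeta(data.length, chunks));
  }

  /** Looks up the metadata first, then reassembles the file from its chunks. */
  byte[] read(String fileName) {
    FileMeta meta = fileRegion.get(fileName);
    if (meta == null) {
      throw new IllegalArgumentException("no such file: " + fileName);
    }
    byte[] out = new byte[meta.length];
    int offset = 0;
    for (int i = 0; i < meta.chunkCount; i++) {
      byte[] chunk = chunkRegion.get(chunkKey(fileName, i));
      System.arraycopy(chunk, 0, out, offset, chunk.length);
      offset += chunk.length;
    }
    return out;
  }

  int chunkCount(String fileName) {
    return fileRegion.get(fileName).chunkCount;
  }
}
```

Keeping metadata and chunks in separate regions means a reader can learn a file's length and chunk count without touching the chunk data itself.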

The FileRegion and ChunkRegion will be colocated with the data region that is being indexed. The GeodeFSDirectory will use keys that contain the bucket id for both file metadata and chunks. The FileRegion and ChunkRegion will have a partition resolver that looks only at the bucket id part of the key.
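
The routing idea can be sketched in plain Java as follows. `BucketKey` and `BucketIdResolver` are hypothetical names, and the hash-modulo mapping from routing value to bucket is an assumption for illustration. In Geode itself, this logic would presumably live in an implementation of the `PartitionResolver` interface, whose `getRoutingObject` method returns the value used for routing.

```java
import java.util.Objects;

/** Hypothetical key type: the bucket id travels with the file name. */
final class BucketKey {
  final int bucketId;
  final String fileName;

  BucketKey(int bucketId, String fileName) {
    this.bucketId = bucketId;
    this.fileName = fileName;
  }

  @Override
  public boolean equals(Object other) {
    if (!(other instanceof BucketKey)) {
      return false;
    }
    BucketKey that = (BucketKey) other;
    return bucketId == that.bucketId && fileName.equals(that.fileName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(bucketId, fileName);
  }
}

/** Hypothetical resolver: routing depends only on the bucket id part of the key. */
final class BucketIdResolver {
  /** The routing value ignores the file name entirely. */
  static Integer routingObject(BucketKey key) {
    return key.bucketId;
  }

  /** Assumed mapping: hash of the routing value modulo the bucket count. */
  static int targetBucket(BucketKey key, int totalBuckets) {
    return Math.floorMod(routingObject(key).hashCode(), totalBuckets);
  }
}
```

Because `Integer.hashCode()` returns the integer itself, a key carrying bucket id 7 lands in bucket 7 under this mapping, so all index files for a data bucket stay in the matching bucket of the colocated regions.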
In the AsyncEventListener, when a data entry is processed:
  1. Determine the bucket id of the entry.
  2. Get the directory for that bucket and perform the indexing operation on that instance.
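
The two steps above can be sketched as a minimal listener loop. Everything here is a stand-in: `BucketIndexingListener`, its `Event` type, the hash-based `bucketIdOf`, and the list that plays the role of a per-bucket directory are assumptions for illustration, not Geode or Lucene APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a listener that indexes each entry into its bucket's directory. */
final class BucketIndexingListener {
  /** Minimal stand-in for a queued event. */
  static final class Event {
    final String key;
    final String value;

    Event(String key, String value) {
      this.key = key;
      this.value = value;
    }
  }

  private final int totalBuckets;
  // One "directory" per bucket; a list of indexed entries stands in for a real index.
  private final Map<Integer, List<String>> directories = new HashMap<>();

  BucketIndexingListener(int totalBuckets) {
    this.totalBuckets = totalBuckets;
  }

  /** Step 1: determine the bucket id of the entry (assumed: key hash modulo bucket count). */
  int bucketIdOf(String key) {
    return Math.floorMod(key.hashCode(), totalBuckets);
  }

  /** Processes one batch from the queue, routing each entry to its bucket's directory. */
  void processEvents(List<Event> batch) {
    for (Event event : batch) {
      int bucketId = bucketIdOf(event.key);
      // Step 2: get the directory for that bucket and index into that instance.
      List<String> directory = directories.computeIfAbsent(bucketId, id -> new ArrayList<>());
      directory.add(event.key + "=" + event.value);
    }
  }

  List<String> indexedIn(int bucketId) {
    return directories.getOrDefault(bucketId, new ArrayList<>());
  }
}
```

Creating directories lazily with `computeIfAbsent` is only a convenience in this sketch; a real listener would more likely look up a directory that already exists for each bucket hosted on the member.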

...