A place to record analysis and observations about the design represented by the Cas-obj prototype.

Columns: aspect | sub-aspect | picture | space/time tradeoffs, locality of reference (L1/L2/L3 memory caching) | backwards compatibility | alternatives | notes

Data Storage

Where: Each FS's storage is represented by values as part of 1 Java object

  • can be GC'd
  • No central CAS "Heap"
 

More space:

  • always have Java cover object (vs possibility of no Java objects)
  • Java cover object: 3 object overheads / FS (vs 1)
  • Java cover object has denormalized shared additional fields

Faster: locality of reference is high.

 

For reduced space per FS, could:

  • avoid Java object overheads for the Object and int arrays (but this gives up GC by individual object)
  • share the cas ref, type ref, and typesystem ref.

Denormalized: each FS has its own cas ref, type ref, and typesystem ref.
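
The sharing alternative noted above could be sketched roughly as follows. This is only an illustration: all class names are hypothetical (none come from the prototype), and `Object` stands in for the real CAS, type, and type system classes.

```java
/** Hypothetical holder for the three refs that every FS currently repeats. */
final class SharedRefs {
    final Object cas;        // placeholder for the CAS (view) reference
    final Object type;       // placeholder for the type reference
    final Object typeSystem; // placeholder for the type system reference

    SharedRefs(Object cas, Object type, Object typeSystem) {
        this.cas = cas;
        this.type = type;
        this.typeSystem = typeSystem;
    }
}

/** Denormalized layout: three reference fields in every FS instance. */
final class DenormalizedFs {
    final Object cas;
    final Object type;
    final Object typeSystem;

    DenormalizedFs(Object cas, Object type, Object typeSystem) {
        this.cas = cas;
        this.type = type;
        this.typeSystem = typeSystem;
    }
}

/** Shared layout: one reference field per FS, at the cost of one extra hop. */
final class SharedRefsFs {
    final SharedRefs refs;

    SharedRefsFs(SharedRefs refs) {
        this.refs = refs;
    }

    Object cas()        { return refs.cas; }
    Object type()       { return refs.type; }
    Object typeSystem() { return refs.typeSystem; }
}
```

Because the cas ref points to one view, a shared holder like this would presumably be needed per (view, type) pair. The extra indirection may also cost some of the locality of reference that the denormalized layout provides.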
Data Storage

"fs-id" - an int (dense) representing the unique ID of a FS.

  • assigned lazily; not all FSs might have one (question)
  • not reused (question) when an FS is garbage collected
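
A minimal sketch of lazy, non-reused id assignment, assuming a single counter and assuming that 0 can serve as the "not yet assigned" marker. The class and method names are hypothetical and are not taken from the prototype.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical source of dense ids; the counter only ever grows, so ids are never reused. */
final class FsIdSource {
    // Assumption: ids start at 1 so that 0 can mean "no id assigned yet".
    private final AtomicInteger next = new AtomicInteger(1);

    int allocate() {
        return next.getAndIncrement();
    }
}

/** Hypothetical FS that only receives an id when one is first requested. */
final class LazyIdFs {
    private final FsIdSource source;
    private int id; // 0 until getId() is first called

    LazyIdFs(FsIdSource source) {
        this.source = source;
    }

    boolean hasId() {
        return id != 0;
    }

    /** Assigns the id on first use. Not thread-safe for concurrent calls on the same FS. */
    int getId() {
        if (id == 0) {
            id = source.allocate();
        }
        return id;
    }
}
```

With this scheme an FS that is garbage collected before anyone asks for its id never consumes one, and an id that was handed out stays retired after its FS is collected.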
        
Data Storage

Feature Structure representation: as 3 Java objects:

  • array of Ints
  • array of Objects
  • container of above, with additional refs to
    • cas
    • typesystem
    • type
    • an int array representing offsets into the two arrays above, indexed by a sequentially generated number incrementing from 0
 

The offset array is an object that roughly corresponds to the _Type object in the JCas, in that it provides a way to get from a designated field to the offset. The JCas provides this as specially named fields that are part of the _Type object. The CasObj provides this as an int array object.
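
The three-object representation and the offset lookup described above might be sketched as follows. This is an illustration under assumptions, not the prototype's code: all class names are hypothetical, and it assumes the accessor (int or reference) decides which of the two arrays an offset applies to.

```java
/** Hypothetical stand-ins for the real CAS, type system, and type classes. */
final class ProtoCas {}
final class ProtoTypeSystem {}
final class ProtoType {
    final String name;
    ProtoType(String name) { this.name = name; }
}

/** Container object: the third of the three Java objects that make up one FS. */
final class ProtoFs {
    final int[] intData;              // Java object 1: int-valued features
    final Object[] refData;           // Java object 2: reference-valued features
    final ProtoCas cas;               // denormalized ref, repeated in every FS
    final ProtoTypeSystem typeSystem; // denormalized ref
    final ProtoType type;             // denormalized ref
    final int[] offsets;              // shared array: feature number -> slot in intData or refData

    ProtoFs(ProtoCas cas, ProtoTypeSystem typeSystem, ProtoType type,
            int[] offsets, int intSlots, int refSlots) {
        this.cas = cas;
        this.typeSystem = typeSystem;
        this.type = type;
        this.offsets = offsets;
        this.intData = new int[intSlots];
        this.refData = new Object[refSlots];
    }

    // Assumption: the caller knows whether a feature is int-valued or reference-valued.
    int getInt(int featureNumber)             { return intData[offsets[featureNumber]]; }
    void setInt(int featureNumber, int value) { intData[offsets[featureNumber]] = value; }
    Object getRef(int featureNumber)          { return refData[offsets[featureNumber]]; }
    void setRef(int featureNumber, Object v)  { refData[offsets[featureNumber]] = v; }
}
```

For example, with features numbered 0 (int), 1 (int), and 2 (reference), the shared offset array would be `{0, 1, 0}`: features 0 and 1 occupy slots 0 and 1 of the int array, and feature 2 occupies slot 0 of the Object array.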

The cas ref is used for "addToIndexes" to locate the view containing the indexes to be added to.

The offset array is shared among all FSs associated with a particular type system, with some exceptions (e.g. SourceDocumentInformation), but I think this is just a quick-fix anomaly (question)

  

The cas ref is to one view; it is used for add/remove-indexes, getView, and getting the "fs-id".
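
How the per-FS cas ref could drive index operations is sketched below. The names are hypothetical and the "index" is reduced to a plain set; the real index structures are not described on this page.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical view: here it simply owns one set standing in for its indexes. */
final class SketchView {
    final String name;
    final Set<IndexedFs> indexed = new LinkedHashSet<>();

    SketchView(String name) {
        this.name = name;
    }
}

/** Hypothetical FS holding a ref to exactly one view. */
final class IndexedFs {
    private final SketchView view;

    IndexedFs(SketchView view) {
        this.view = view;
    }

    /** The FS needs no extra argument: its own view ref says where to index it. */
    void addToIndexes() {
        view.indexed.add(this);
    }

    void removeFromIndexes() {
        view.indexed.remove(this);
    }

    SketchView getView() {
        return view;
    }
}
```

The point of the sketch is that the view ref stored in each FS is what lets `addToIndexes` work without a view parameter, which is part of what the denormalized cas ref buys.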

   
Data Storage

get set of features

  • some builtin hard-coded offsets
  • some (e.g. SourceDocumentInformation) have extra int[] offsets (in each FS instance - probably needs to be shared (question) )
        
Data Storage

JCAS _Type classes: These are not used, but are "supported" for backwards compatibility. Support includes their low-level APIs (question)

Data Storage

low level API support, including C++, binary (de)serialization: partially started, remainder TBD
Views         
Indexes         

 
