Feature Structures

Feature Structures are implemented as Java objects, one object per Feature Structure.  Like other Java objects, they can be garbage collected.

There is a generic Java class for these, plus (optional) specific classes for JCas style access. 


APIs for creating Feature Structures, and setting / getting Feature Values in them

There are several kinds of APIs for this.


  • Basic (called the "Plain" API below): this was the original API, and takes UIMA Feature and Type objects as arguments.
  • JCas: this is an API that uses common Java idioms for creating, getting, and setting. 
  • LowLevel: this was like Basic, but substituted an int-valued address for the Java Feature Structure object and, in general, avoided creating Java objects.
    • In V3, it is dangerous to create a FS using the low level API: the resulting FS is identified only by an int, so if the Java garbage collector runs before any reference to the newly created FS exists, the FS will be garbage collected and disappear.  For this reason, the low level APIs are deprecated in Version 3.


The API styles compared: description, create example, get a value, set a value.

Plain (Basic)
  • Description: uses Type and Feature objects.
  • Create example: XX was the type.
  • Get a value: fs.get(index) when fs is one of the built-in arrays.
  • Set a value: fs.setFloatValue(aFeature, value); fs.set(index, value) when fs is one of the built-in arrays.

JCas
  • Description: follows Java conventions. Types and features must be known at compile time.
  • Create example: new MyType(); can have additional constructors.
  • Get a value: when the value of myArrayFeature is one of the built-in arrays; fs.get(index) when fs is one of the built-in arrays.
  • Set a value: fs.setMyArrayFeature(index, value); fs.set(index, value).

Low level
  • Description: in version 2 this allowed CAS access without making any Java objects; there was much less "checking", and it was intended for high-performance cases. Feature Structures were referred to by their int address in the internal heap. API: LowLevelCAS.
  • Create example: these had the same names as the Plain API, except prefixed with "ll_". Instead of returning a Java object representing the FS, these return an int address.
  • Get a value: lowLvlCas.ll_getIntValue(addr, feat), where addr and feat are both ints.
  • Set a value: lowLvlCas.ll_setFloatValue(addr, feat, value); lowLvlCas.ll_setBooleanArrayValue(addr, index, value).

Getting and setting Feature values in V3

The JCas style of getting / setting feature values requires that the feature names be known at compile time, so you can write getXXXX where XXXX is the known-at-compile-time name of the feature.

The Plain style does not need this information; instead, the range of the feature must be known, and calls are made like getIntValue(aFeature), where the Feature object aFeature can be looked up dynamically at run time.

Plain style APIs bypass any JCas getter or setter customization

The plain style APIs do not invoke the JCas style getters and setters, even if those are present and perhaps customized.  This is a design decision made to follow the V2 implementation, and also for performance reasons.  So, if you have customized a getter or a setter in JCas, you must use the JCas APIs to run the customizations.
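The bypass described above can be illustrated with a minimal, self-contained sketch. It is a simplified model, not the UIMA implementation: the classes SketchFs and SketchToken, the "length" feature, and the clamping customization are all hypothetical, and features are keyed by name rather than by Feature objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a feature structure; not the UIMA implementation. */
class SketchFs {
    // Feature values keyed by feature name (a real CAS uses Feature objects).
    private final Map<String, Integer> intFeatures = new HashMap<>();

    /** Plain-style getter: the feature is chosen at run time. */
    int getIntValue(String feature) {
        return intFeatures.getOrDefault(feature, 0);
    }

    /** Plain-style setter: stores the value as given. */
    void setIntValue(String feature, int value) {
        intFeatures.put(feature, value);
    }
}

/** Hypothetical JCas-style cover class with a customized setter. */
class SketchToken extends SketchFs {
    int getLength() {
        return getIntValue("length");
    }

    /** Customization (an assumption for this sketch): clamp negative lengths to zero. */
    void setLength(int value) {
        setIntValue("length", Math.max(0, value));
    }
}

class PlainVsJCasDemo {
    public static void main(String[] args) {
        SketchToken token = new SketchToken();
        token.setLength(-5);                    // JCas style: the customization runs
        System.out.println(token.getLength());  // prints 0
        token.setIntValue("length", -5);        // plain style: the customization is bypassed
        System.out.println(token.getLength());  // prints -5
    }
}
```

In this sketch the plain setter never calls setLength, so any logic placed in the JCas-style setter only runs for callers that use the JCas-style method.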

xxx_Type JCas classes removed in V3

These are eliminated in V3.  They served two purposes:

  • They saved one slot per feature structure: instead of a casImpl ref and a typeImpl ref, there was just one ref to the _Type instance, which in turn held these two refs.
  • They provided a place for the low level accessors; these are accessors that take the "address" (now "id") of the FS as the way to designate which FS is being used.  There are 2 varieties of these low level accessors: those implemented in the CASImpl, and those implemented in the JCas _Type classes.  The latter have methods like "myShared_TypeInstance.setXXX(address, value)".  These are instance methods on the shared xxx_Type instance, and were intended to permit access without creating the Java cover object for the FS.

The performance reason for using the low level accessors is not present in V3; in fact, these, if implemented, would be slower than the other APIs.


JCas Class sharing


JCas cover classes now come as single classes, rather than in pairs.  These classes are either built-in or generated; the built-in ones cannot be generated.

JCas classes are associated with a class loader.  Except for the built-in types which always have JCas Classes, other JCas classes are optional. Furthermore, JCas classes may define only a subset of the features of the fully merged type system. So, even when a JCas class is present, it may not have getters and setters for some features of the corresponding UIMA type. These features can be accessed of course using the plain APIs (see above). 

When a UIMA type is instantiated in V3, the Java class used is the JCas class of the most specific type, at or above that type in the type hierarchy, for which a JCas class is found.  For example, suppose you have a type Foo with supertype Bar, which in turn is a subtype of Annotation, and no JCas classes are defined for Foo or Bar.  You cannot use the JCas style, new Foo, because there is no JCas class for type Foo, so you create the instance using the plain API: casView.createFS(fooType).  The implementing Java class of the new instance is then Annotation.
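The selection rule can be sketched as a walk up the supertype chain. This is a simplified model under stated assumptions: the class CoverClassLookup and its methods are hypothetical, types and classes are represented by plain strings, and the hierarchy is abbreviated to Foo, Bar, Annotation, TOP.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of choosing the implementing Java class for a UIMA type. */
final class CoverClassLookup {
    // type name -> supertype name (the root type maps to null)
    private final Map<String, String> superTypeOf = new HashMap<>();
    // type name -> name of the JCas class available for that type, if any
    private final Map<String, String> jcasClassFor = new HashMap<>();

    void addType(String type, String superType) {
        superTypeOf.put(type, superType);
    }

    void addJCasClass(String type, String className) {
        jcasClassFor.put(type, className);
    }

    /** Walks up the supertype chain and returns the first JCas class found. */
    String implementingClass(String type) {
        for (String t = type; t != null; t = superTypeOf.get(t)) {
            String cls = jcasClassFor.get(t);
            if (cls != null) {
                return cls;
            }
        }
        throw new IllegalStateException("no JCas class found for " + type);
    }
}
```

With JCas classes registered only for TOP and Annotation, implementingClass("Foo") yields "Annotation"; once a class is added for Bar, the same call yields "Bar".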

One set of JCas classes per class loader may be used (even simultaneously) for multiple different type systems.  This can occur sequentially, for example, in the use case where a sequence of CASs and their type systems are being deserialized and worked on, sequentially; it can also occur when running multiple different pipelines under one class loader. When committing a type system, a check is made for each type to see if there is a corresponding JCas class, and if found, that any defined features have the proper range.
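One way to picture the commit-time check is the sketch below. It is an illustration only, with several assumptions: the class CommitCheckSketch and its method are hypothetical, type systems and JCas classes are modeled as maps from feature name to range type name, and mismatches are collected into a list rather than reported the way UIMA reports them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the range check made when a type system is committed. */
final class CommitCheckSketch {
    /**
     * typeSystem:  type name -> (feature name -> declared range type name)
     * jcasClasses: type name -> (feature name -> range the JCas class expects)
     * Returns one message per mismatch; an empty list means the check passed.
     */
    static List<String> rangeMismatches(
            Map<String, Map<String, String>> typeSystem,
            Map<String, Map<String, String>> jcasClasses) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> type : typeSystem.entrySet()) {
            Map<String, String> jcasFeatures = jcasClasses.get(type.getKey());
            if (jcasFeatures == null) {
                continue; // JCas classes are optional, so nothing to check
            }
            for (Map.Entry<String, String> declared : type.getValue().entrySet()) {
                String jcasRange = jcasFeatures.get(declared.getKey());
                // A JCas class may define only a subset of the features.
                if (jcasRange != null && !jcasRange.equals(declared.getValue())) {
                    problems.add(type.getKey() + ":" + declared.getKey()
                            + " has range " + declared.getValue()
                            + " but the JCas class expects " + jcasRange);
                }
            }
        }
        return problems;
    }
}
```

Because the check runs per type system, the same set of JCas classes can pass for one type system and fail for another loaded under the same class loader.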

It is possible to run multiple pipelines with non-compatible type systems and JCas classes by running each one under its own class loader; in this scenario, each pipeline will load its own copy of JCas classes from its own classloader's classpath.

Connecting Instances with Type and Feature information

Information about types and features is stored in TypeImpl and FeatureImpl instances.  These are unique per type system.  However, multiple type system instances created using the same (merged) definition, and therefore "equal", are recognized at type system commit time, and the existing type system implementation is reused in this case.  This is different from V2, and may require updating code that gets references to types and features prior to type system commit: such code must re-acquire those references after the commit, because the Type and Feature instances may be replaced with a shared version if the type system is equal to one already committed.
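The reuse of equal type systems at commit time resembles interning, which the following sketch models. The names SketchTypeSystem and CommittedTypeSystems are hypothetical, and a type system is reduced to a set of type names; the real equality test covers the full merged definition.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Simplified type system: just a set of type names. */
final class SketchTypeSystem {
    private final Set<String> typeNames = new TreeSet<>();

    void addType(String name) {
        typeNames.add(name);
    }

    /** Snapshot of the definition, used to compare type systems for equality. */
    Set<String> definition() {
        return new TreeSet<>(typeNames);
    }
}

/** Keeps one shared instance per distinct committed definition. */
final class CommittedTypeSystems {
    private final Map<Set<String>, SketchTypeSystem> committed = new HashMap<>();

    /** Returns the candidate, or an equal type system that was committed earlier. */
    synchronized SketchTypeSystem commit(SketchTypeSystem candidate) {
        return committed.computeIfAbsent(candidate.definition(), key -> candidate);
    }
}
```

Callers in this sketch should continue with the value returned by commit rather than the instance they built, which mirrors the need to re-acquire Type and Feature references after commit.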

Locating the corresponding UIMA Type when creating a JCas type using the "new" operator

When a JCas instance is created using the "new" operator, it locates the type using information in a JCasRegistry.  The type cannot be statically kept in the JCas class definition, since one JCas class might be used by multiple different type systems.  Instead, each JCas class, when it is loaded, is assigned a unique incrementing number; this number is kept with the static (one per class loader) information for TypeSystemImpl.

At instance creation time, a lookup is done, using the instance of the type system, to get the actual type associated with the registry number.  This mechanism is encapsulated within the JCasRegistry.
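The registry-number indirection can be sketched as follows. This is a simplified model, not the JCasRegistry code: RegistrySketch, FooCover, BarCover, and TypeSystemSketch are hypothetical names, and a type is represented by a string that records which type system it belongs to.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hands out a unique, incrementing number to each JCas class as it loads. */
final class RegistrySketch {
    private static final List<Class<?>> LOADED = new ArrayList<>();

    static synchronized int register(Class<?> jcasClass) {
        LOADED.add(jcasClass);
        return LOADED.size() - 1;
    }
}

/** Hypothetical JCas classes; the number is assigned when the class initializes. */
final class FooCover {
    static final int TYPE_INDEX_ID = RegistrySketch.register(FooCover.class);
}

final class BarCover {
    static final int TYPE_INDEX_ID = RegistrySketch.register(BarCover.class);
}

/** Per-type-system table from registry number to that type system's own type. */
final class TypeSystemSketch {
    private final String name;
    private final Map<Integer, String> typeByRegistryNumber = new HashMap<>();

    TypeSystemSketch(String name) {
        this.name = name;
    }

    void bind(int registryNumber, String typeName) {
        typeByRegistryNumber.put(registryNumber, name + ":" + typeName);
    }

    /** The lookup done at instance creation time. */
    String typeFor(int registryNumber) {
        String type = typeByRegistryNumber.get(registryNumber);
        if (type == null) {
            throw new IllegalStateException("no type bound to number " + registryNumber);
        }
        return type;
    }
}
```

The class holds only the number, so the same FooCover class resolves to a different type object in each type system that binds it.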

Locating the corresponding UIMA Feature when accessing a feature

The generated getter or setter code for a JCas feature needs to access the feature instance associated with the feature being accessed.  This cannot be statically compiled into the JCas class for the same reasons as above. A similar registry is used, and a unique incrementing integer is assigned to each feature referenced in a JCas class; these int values are kept as static final int values in the loaded class, and are similarly looked up when needed.

These are used in the trampoline methods of the generated getters and setters.  A getBegin() gets converted into a getIntValue(_getFeat(_FI_begin)), where getIntValue is the plain API call and _getFeat is a method taking the registered unique identifier for this feature and looking up the FeatureImpl instance for it.
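The trampoline pattern can be modeled in a few lines. The sketch below borrows the names _FI_begin, _getFeat, getIntValue, and getBegin from the description above, but everything else is an assumption: FeatureIdSketch, FeatureTableSketch, and AnnotationSketch are hypothetical, and a feature is represented by its name instead of a FeatureImpl instance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical registry: one incrementing int per feature referenced by a JCas class. */
final class FeatureIdSketch {
    private static final AtomicInteger NEXT = new AtomicInteger();

    static int register() {
        return NEXT.getAndIncrement();
    }
}

/** Per-type-system table from registered id to that type system's feature (here, a name). */
final class FeatureTableSketch {
    private final Map<Integer, String> featureById = new HashMap<>();

    void bind(int id, String feature) {
        featureById.put(id, feature);
    }

    String lookup(int id) {
        String feature = featureById.get(id);
        if (feature == null) {
            throw new IllegalStateException("no feature bound to id " + id);
        }
        return feature;
    }
}

/** Simplified cover class with trampoline accessors for the "begin" feature. */
final class AnnotationSketch {
    // Assigned once, when this class is initialized.
    static final int _FI_begin = FeatureIdSketch.register();

    private final FeatureTableSketch typeSystem;
    private final Map<String, Integer> intValues = new HashMap<>();

    AnnotationSketch(FeatureTableSketch typeSystem) {
        this.typeSystem = typeSystem;
    }

    // Plain-style accessors, keyed by the looked-up feature.
    int getIntValue(String feature) { return intValues.getOrDefault(feature, 0); }
    void setIntValue(String feature, int value) { intValues.put(feature, value); }

    // Resolves the registered id against this instance's type system.
    String _getFeat(int id) { return typeSystem.lookup(id); }

    // Trampolines: JCas-style names that forward to the plain accessors.
    int getBegin() { return getIntValue(_getFeat(_FI_begin)); }
    void setBegin(int value) { setIntValue(_getFeat(_FI_begin), value); }
}
```

A value written with setBegin is visible through getIntValue with the looked-up feature, because both paths end at the same plain accessor.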




UIMA v2 supports specially-named arrays of primitives (+ string), e.g. BooleanArray. 

UIMA v2 supports arrays of Feature Structures, using FSArray (JCas) or ArrayFS (Generic).  

For v3, proposed support (not yet done, TBD?):

  • new notation (arrays):  aligned with Java: TOP[] or Annotation[] or MyType[] or short[]
  • new notation (collections): aligned with Java generics: List<TOP> or ArrayList<Annotation> or HashSet<MyType>

Use Java fully qualified names as the UIMA type name. 

Extend idea of "component type" to include multiple generics.

  • limit (initially) generic spec to only simple type names, no support for extends, ?, etc.  Use TOP for "Object".




Keep special UIMA String type for compatibility and subtyping.


