...

The performance reason for using the low-level accessors no longer applies in V3; in fact, if they were implemented, they would be slower than the other APIs.

 

JCas Class sharing

...

JCas cover classes now come as single classes, rather than in pairs.  These classes are either built-in or generated; built-in ones cannot be generated.  JCas classes are associated with a class loader.  Except for the built-in types, which always have JCas classes, JCas classes are optional. Furthermore, a JCas class may define only a subset of the features of the fully merged type system. So, even when a JCas class is present, it may lack getters and setters for some features of the corresponding UIMA type. These features can still be accessed using the plain APIs (see above).

When a UIMA type is instantiated in V3, the Java class used is the most specific JCas class found for that type.  For example, suppose you have a type Foo with supertype Bar, which in turn is a subtype of Annotation, and no JCas classes are defined for Foo or Bar.  When you create an instance of Foo (using the plain API casView.createFS(fooType), since the JCas style of new Foo is not available without a JCas class for type Foo), an instance of Annotation is created as the implementing Java class.
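The lookup described above can be pictured with a small toy model. This is only a sketch under stated assumptions: ToyType and ToyJCasClassLookup are hypothetical names invented for illustration, they are not UIMA classes, and the real built-in hierarchy has more levels than shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical classes, not part of UIMA.
final class ToyType {
    final String name;
    final ToyType superType; // null for the root type

    ToyType(String name, ToyType superType) {
        this.name = name;
        this.superType = superType;
    }
}

final class ToyJCasClassLookup {
    // UIMA type name -> name of the JCas class available for that type
    private final Map<String, String> jcasClassByTypeName = new HashMap<>();

    void register(String typeName, String javaClassName) {
        jcasClassByTypeName.put(typeName, javaClassName);
    }

    // Walk up the supertype chain and return the first JCas class found,
    // that is, the most specific one available for the given type.
    String implementingClassFor(ToyType type) {
        for (ToyType t = type; t != null; t = t.superType) {
            String javaClass = jcasClassByTypeName.get(t.name);
            if (javaClass != null) {
                return javaClass;
            }
        }
        throw new IllegalStateException("no JCas class found for " + type.name);
    }
}
```

With TOP and Annotation registered and nothing registered for Bar or Foo, implementingClassFor(foo) yields the Annotation class, mirroring the Foo/Bar example in the text.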

One set of JCas classes per class loader may be used (even simultaneously) for multiple different type systems.  This can occur sequentially, for example, in the use case where a sequence of CASs and their type systems are deserialized and worked on one after another.  The type hierarchy defined by the JCas classes must match the UIMA type system(s) where it is being used.  To run using multiple different JCas definitions, class loader isolation must be used.

When generated, they are specific to one (merged) type system, except for shared, common, built-in class definitions.  To allow for multiple type systems within one JVM simultaneously, class loader isolation is used.

 

Using one set of JCas classes with multiple type systems can also occur when running multiple different pipelines under one class loader. When committing a type system, each type is checked for a corresponding JCas class and, if one is found, any features that class defines are checked for the proper range.

It is possible to run multiple pipelines with non-compatible type systems and JCas classes by running each one under its own class loader; in this scenario, each pipeline will load its own copy of JCas classes from its own classloader's classpath.

JCas Class and UIMA Type conformance

JCas Classes have static final fields computed at load time. Each type system commit loads corresponding JCas classes (the load only happens the first time, per class loader).

A particular type system instance is being committed when a JCas class is loaded.  At load time, these rules are checked:

  • Construct the supertype chain of the class being loaded.  It must be the case that, scanning upwards, there is a supertype that has a corresponding UIMA type.
    • It is OK if there are UIMA types between this and the found corresponding supertype - that just means there were no JCas types defined for those.
    • It is OK for the supertype chain to pass through supertypes which are not UIMA types, as long as the JCas supertypes are abstract (can't be instantiated)
  • For each feature
    • the feature offset held in the class's static final value must match the offset assigned to that feature by the type system
    • the feature's range must match
    • JCas-defined features which do not exist in the first type system to load this JCas class have invalid getters and setters; this matters only if some code attempts to get or set those features.

How JCas feature offsets are computed or validated at type-system-commit time

The type system is walked in subsumption order, and offsets are assigned to all features.  Then the JCas classes are loaded; if a class is actually loaded at this point, the corresponding features are used to set the static final int offset values in that JCas class.  If the class was already loaded, the existing values are checked to ensure that they match the values assigned by the type system. A mismatch can occur if multiple different type systems are being used. Mismatches (which cannot happen if only one type system is in use) result in a fatal error.
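As a rough illustration of the set-once-then-validate behavior described above, the following toy model records offsets the first time a feature is seen and rejects a later commit that assigns a different offset. ToyOffsetTable and the "Type:feature" key format are assumptions made for this sketch, not UIMA APIs; a map stands in for the static final int fields of loaded JCas classes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical class, not the UIMA implementation.
final class ToyOffsetTable {
    // Stands in for the static final int offsets held by already loaded JCas classes.
    private final Map<String, Integer> loadedOffsets = new HashMap<>();

    // Called at each simulated type system commit with the offsets that commit assigned.
    void commit(Map<String, Integer> assignedOffsets) {
        // First pass: offsets fixed by an earlier load must match the new assignment.
        for (Map.Entry<String, Integer> e : assignedOffsets.entrySet()) {
            Integer existing = loadedOffsets.get(e.getKey());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new IllegalStateException("offset mismatch for " + e.getKey()
                        + ": loaded " + existing + ", type system assigned " + e.getValue());
            }
        }
        // Second pass: record offsets for features seen for the first time.
        for (Map.Entry<String, Integer> e : assignedOffsets.entrySet()) {
            loadedOffsets.putIfAbsent(e.getKey(), e.getValue());
        }
    }

    int offsetOf(String featureName) {
        Integer offset = loadedOffsets.get(featureName);
        if (offset == null) {
            throw new IllegalArgumentException("unknown feature " + featureName);
        }
        return offset;
    }
}
```

Committing the same assignment twice succeeds; committing a second type system that places an already recorded feature at a different offset throws, which corresponds to the fatal error mentioned in the text.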

Connecting Instances with Type and Feature information

...

Information about types and features is stored in TypeImpl and FeatureImpl instances.  These are unique per type system.  However, multiple type system instances created using the same (merged) definition, and therefore "equal", are recognized at type system commit time, and the existing type system implementation is reused in this case.

When creating a new instance of a UIMA Type, the JCas class for that type is loaded (if not already, and if available).

The reuse of an existing type system implementation is different from V2, and may require updating code which gets references to types and features prior to type system commit; that code needs to re-acquire those references after type system commit, because the Type and Feature instances may be replaced with a shared version if the type system is equal to one already committed.

Locating the corresponding UIMA Type when creating a JCas type using the "new" operator

...

When a JCas instance is created using the "new" operator, it locates the type using information in a JCasRegistry.  The type cannot be statically kept in the JCas class definition, since one JCas class might be used by multiple different type systems.  Instead, each JCas class, when it is loaded, is assigned a unique incrementing number; this number is kept with the static (one per class loader) information for TypeSystemImpl.

At instance creation time, a lookup is done, using the instance of the type system, to get the actual type associated with the registry number.  This mechanism is encapsulated within the JCasRegistry class.
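The indirection described above can be sketched as follows. This is a toy model under stated assumptions: ToyJCasRegistry and ToyTypeSystemView are hypothetical stand-ins invented for illustration, type names are plain strings, and the sketch does not reproduce the actual JCasRegistry implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not the UIMA JCasRegistry.
final class ToyJCasRegistry {
    private static final List<String> registeredClassNames = new ArrayList<>();

    // Called once when a (simulated) JCas class is loaded;
    // returns a unique, incrementing number for that class.
    static synchronized int register(String jcasClassName) {
        registeredClassNames.add(jcasClassName);
        return registeredClassNames.size() - 1;
    }
}

final class ToyTypeSystemView {
    // Per type system: registry number -> the type this type system uses for that JCas class.
    private final Map<Integer, String> typeNameByRegistryIndex = new HashMap<>();

    void bind(int registryIndex, String typeName) {
        typeNameByRegistryIndex.put(registryIndex, typeName);
    }

    // Performed at instance creation time: resolve the registry number
    // against this particular type system.
    String typeFor(int registryIndex) {
        String typeName = typeNameByRegistryIndex.get(registryIndex);
        if (typeName == null) {
            throw new IllegalStateException("no type bound for registry index " + registryIndex);
        }
        return typeName;
    }
}
```

Because the registry number is the only thing stored with the class, two type systems can bind the same number to their own type, which is why the type itself does not need to be kept statically in the JCas class.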

Locating the corresponding UIMA Feature when accessing a feature using JCas APIs

The generated getter or setter code for a JCas feature needs the feature instance associated with the stored feature-offset-index information for the feature being accessed.  This cannot be statically compiled into the JCas class, for the same reasons as above. A similar registry is used, and a unique incrementing integer is assigned to each feature referenced in a JCas class; these int values are kept as static final int values in the loaded class, and are similarly looked up when needed.

These are used in the trampoline methods of the generated getters and setters.  A getBegin() gets converted into a getIntValue(_getFeat(_FI_begin)), where getIntValue is the plain API call and _getFeat is a method taking the registered unique identifier for this feature and looking up the FeatureImpl instance for it.
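The shape of such a trampoline can be sketched with a toy model. The names getIntValue, _getFeat, and _FI_begin follow the text; everything else (ToyFeature, ToyFS, ToyAnnotation, the array-based storage, and the hard-coded identifier value) is an assumption made for illustration and is not the generated UIMA code.

```java
// Toy model only: hypothetical classes sketching the trampoline shape.
final class ToyFeature {
    final String name;
    final int offset; // slot assigned by one particular type system

    ToyFeature(String name, int offset) {
        this.name = name;
        this.offset = offset;
    }
}

class ToyFS {
    // Per type system: registered feature identifier -> feature instance.
    private final ToyFeature[] featuresByRegistryId;
    private final int[] intSlots;

    ToyFS(ToyFeature[] featuresByRegistryId, int slotCount) {
        this.featuresByRegistryId = featuresByRegistryId;
        this.intSlots = new int[slotCount];
    }

    // Looks up the feature instance for a registered unique identifier.
    ToyFeature _getFeat(int registryId) {
        return featuresByRegistryId[registryId];
    }

    // Plain-API style accessors working on a feature instance.
    int getIntValue(ToyFeature feature) {
        return intSlots[feature.offset];
    }

    void setIntValue(ToyFeature feature, int value) {
        intSlots[feature.offset] = value;
    }
}

final class ToyAnnotation extends ToyFS {
    // In the design described above this value comes from a registry when the
    // class is loaded; it is hard-coded here to keep the sketch small.
    static final int _FI_begin = 0;

    ToyAnnotation(ToyFeature[] featuresByRegistryId, int slotCount) {
        super(featuresByRegistryId, slotCount);
    }

    // Trampolines: the JCas accessor forwards to the plain API call.
    int getBegin() {
        return getIntValue(_getFeat(_FI_begin));
    }

    void setBegin(int value) {
        setIntValue(_getFeat(_FI_begin), value);
    }
}
```

The getter never names a slot directly; it only carries the registered identifier, so the same class can serve type systems that place the feature at different offsets.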

 

Collections

...

accessed.  In the use case of having multiple type systems for one JCas class set loaded under one class loader, each type system might have a different number for this; this design would make it necessary for all accesses to go through one level of indirection to get the particular type system's offset for a feature.

This is avoided using the following technique that assigns the offsets to match already assigned ones:

  • The first time a JCas class is loaded at type system commit time, it defines a final static int constant of the pre-computed offset.
  • The 2nd time a JCas class is accessed at type system commit time, the first value stored is read and is used for the offset.

This requires that no JCas class access is done prior to type system commit, since the static final value can only be assigned once at resolution time.  This is normally the case, since it would be invalid to do something with a JCas class before the pipeline is set up.

Collections

UIMA v2 supports specially-named arrays of primitives (+ string), e.g. BooleanArray. 

...

  • limit (initially) generic spec to only simple type names, no support for extends, ?, etc.  Use TOP for "Object".

 

Strings

 

...


...