Comment: add ideas about other frameworks, collections/maps

...

Currently, users may customize their JCas cover classes. PEAR classpath isolation allows different customizations of the same cover class to be present in one pipeline. The current implementation supports this by switching the set of JCas cover classes as PEAR boundaries are crossed. However, the idea of a Feature Structure being an instance of its cover class breaks down when multiple definitions of that class exist. Some ideas for fixing this follow.

Consider ideas from other popular big-data frameworks: Hadoop, Spark

These frameworks typically build their type systems on user-defined Java types and allow any kind of Java object in the fields. Newer serialization / deserialization libraries work for all kinds of Java objects but are much more efficient than Java reflection-based approaches (e.g., Kryo, used by Spark).

Add support for Collections and Maps

Users have wanted these kinds of objects. Some implementations I've seen tried to implement Sets using a combination of a HashSet and UIMA FSLists, duplicating the data and keeping the two in sync, which was very inefficient.
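
The cost of that workaround can be sketched in plain Java. This is only an illustration under stated assumptions: `Node` is a hypothetical stand-in for a linked FSList cell and `SyncedListSet` is an invented name; neither is part of the UIMA API. Every element is stored twice, and removal needs a linear scan of the list to keep both structures consistent.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a linked FSList cell; not the UIMA API. */
final class Node {
    final String head;
    Node tail;

    Node(String head, Node tail) {
        this.head = head;
        this.tail = tail;
    }
}

/** A Set emulated by keeping a HashSet index in sync with a linked list. */
final class SyncedListSet {
    private final Set<String> index = new HashSet<>(); // fast membership test
    private Node first;                                 // second copy of the same data

    /** Adds the value to both structures; returns false if it was already present. */
    boolean add(String value) {
        if (!index.add(value)) {
            return false;
        }
        first = new Node(value, first); // prepend to the list copy
        return true;
    }

    /** Removes the value from both structures; the list side needs an O(n) scan. */
    boolean remove(String value) {
        if (!index.remove(value)) {
            return false;
        }
        Node prev = null;
        for (Node n = first; n != null; prev = n, n = n.tail) {
            if (n.head.equals(value)) {
                if (prev == null) {
                    first = n.tail;
                } else {
                    prev.tail = n.tail;
                }
                break;
            }
        }
        return true;
    }

    boolean contains(String value) {
        return index.contains(value);
    }

    /** Walks the list copy to count its cells. */
    int listLength() {
        int count = 0;
        for (Node n = first; n != null; n = n.tail) {
            count++;
        }
        return count;
    }
}
```

Native Collection and Map support would let a single structure serve both purposes, so there would be no second copy to keep in sync.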

More concurrency

Support parallel running of pipeline components.
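
As a rough sketch of what this could look like, the following plain Java runs several independent components concurrently over the same input and collects their results in component order. It assumes the components do not depend on each other and treat the input as read-only; `ParallelStage` and `runAll` are hypothetical names for illustration, not UIMA APIs, and components are modeled simply as `java.util.function.Function` instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper: runs independent components in parallel on one input. */
final class ParallelStage {

    /** Applies every component to the same input; results follow component order. */
    static <I, O> List<O> runAll(List<Function<I, O>> components, I input) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, components.size()));
        try {
            // Wrap each component as a task over the shared, read-only input.
            List<Callable<O>> tasks = new ArrayList<>();
            for (Function<I, O> component : components) {
                tasks.add(() -> component.apply(input));
            }
            // invokeAll blocks until all tasks finish and keeps the task order.
            List<O> results = new ArrayList<>();
            for (Future<O> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Components that update shared state would need additional coordination, which this sketch deliberately leaves out.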

...