...
Currently, users may customize their JCas cover classes. PEAR classpath isolation allows different customizations of the same cover classes to be present in one pipeline. The current implementation supports this by switching the set of JCas cover classes as PEAR boundaries are crossed. However, the idea of a Feature Structure being an instance of its cover class breaks down when multiple definitions of that cover class exist. Some ideas for fixing this.
Consider ideas from other popular big-data frameworks: Hadoop, Spark
These frameworks typically build their type systems on user-defined Java types and allow any kind of Java object in the fields. There are newer serialization / deserialization libraries (e.g. Kryo, used by Spark) that work for all kinds of Java objects but are much more efficient than Java's reflection-based approaches.
Add support for Collections and Maps
Users have asked for these kinds of objects. Some implementations I've seen tried to implement Sets by combining a HashSet with UIMA FSLists, which duplicated the data and required keeping the two copies in sync; this was very inefficient.
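To make the cost of that workaround concrete, here is a minimal sketch of the pattern in plain Java. It is an illustration only: `ListNode` and `MirroredSet` are hypothetical names, and `ListNode` is a hand-written stand-in for a linked list stored in the CAS, not the UIMA FSList API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Hypothetical stand-in for a CAS-resident linked list (NOT the UIMA FSList API). */
final class ListNode<T> {
    final T head;
    ListNode<T> tail; // null marks the end of the list

    ListNode(T head, ListNode<T> tail) {
        this.head = head;
        this.tail = tail;
    }
}

/** A "set" that keeps a HashSet for lookups plus a linked list as the stored form. */
final class MirroredSet<T> {
    private final Set<T> index = new HashSet<>(); // fast membership test, Java heap only
    private ListNode<T> stored;                    // the copy that would live in the CAS

    boolean add(T value) {
        if (!index.add(value)) {
            return false;                          // already present, nothing to sync
        }
        stored = new ListNode<>(value, stored);    // second write keeps the copies in sync
        return true;
    }

    boolean remove(T value) {
        if (!index.remove(value)) {
            return false;
        }
        // The list has no index, so every removal walks it: O(n).
        ListNode<T> prev = null;
        for (ListNode<T> n = stored; n != null; prev = n, n = n.tail) {
            if (Objects.equals(n.head, value)) {
                if (prev == null) {
                    stored = n.tail;
                } else {
                    prev.tail = n.tail;
                }
                break;
            }
        }
        return true;
    }

    boolean contains(T value) {
        return index.contains(value);
    }

    /** Copies the stored list into a Java List, most recently added element first. */
    List<T> storedElements() {
        List<T> out = new ArrayList<>();
        for (ListNode<T> n = stored; n != null; n = n.tail) {
            out.add(n.head);
        }
        return out;
    }
}
```

Every element is held twice, every update touches both structures, and removal is linear in the size of the list. A built-in Set or Map type would remove the need for this double bookkeeping.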
More concurrency
Support parallel running of pipeline components.
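As a rough illustration of what running components in parallel could look like, the sketch below fans one input out to several independent components using only `java.util.concurrent`. It is not a UIMA API: `ParallelStage` and `runAll` are hypothetical names, components are modeled as plain `Function`s, and the sketch assumes they only read the shared input. How a mutable CAS would be shared safely between components is not addressed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper: runs independent components side by side on one input. */
final class ParallelStage {

    /** Applies every component to the same read-only input; results come back in component order. */
    static <I, O> List<O> runAll(I input, List<Function<I, O>> components) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, components.size()));
        try {
            List<Callable<O>> tasks = new ArrayList<>();
            for (Function<I, O> component : components) {
                tasks.add(() -> component.apply(input));
            }
            List<O> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and preserves task order.
            for (Future<O> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for components", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a component failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

This only works as written when the components do not depend on each other's output; components with dependencies would still have to run in sequence.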
...