This page collects value propositions of UIMA for version 3 (or 4 or...), described from a data-centric point of view.

Keep UIMA worthy of having people invest in it

When users spend time and energy creating reusable components, those components should have long and useful lives. If migration to something newer is needed, migration tools that preserve this investment are valuable.

Data focus

Driven by the environment in 2015 for big data analytics:

  • multiple languages in use, including Scala, Python, and Go
  • a desire for easy interoperability among languages
  • simple for simple things, while still supporting more complexity for more complex things

The OASIS UIMA specification is based on XMI serialization. XMI is not popular today; JSON-like representations are more widely used.
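To make the JSON-like alternative concrete, here is a minimal sketch of a document with stand-off annotations round-tripped through JSON. The field names (`text`, `annotations`, `begin`, `end`) and the `covered_text` helper are assumptions made for illustration; this is not a format defined by UIMA or by the OASIS spec.

```python
import json

# Hypothetical layout: the text is stored once, and each annotation
# points into it by character offsets (stand-off style).
document = {
    "text": "UIMA analyzes unstructured text.",
    "annotations": [
        {"id": 1, "type": "Token", "begin": 0, "end": 4},
        {"id": 2, "type": "Token", "begin": 5, "end": 13},
    ],
}

# Any language with a JSON library can read and write this form.
serialized = json.dumps(document, sort_keys=True)
restored = json.loads(serialized)


def covered_text(doc, annotation):
    """Return the span of text that an annotation refers to."""
    return doc["text"][annotation["begin"]:annotation["end"]]


first_token = covered_text(restored, restored["annotations"][0])
second_token = covered_text(restored, restored["annotations"][1])
```

Because the representation is plain JSON, the same data could be consumed from Scala, Go, or any other language without a UIMA-specific parser.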

What UIMA is / is-not trying to do

UIMA can't be all things to all people without diluting the value it has.

More focus on:

  • component reuse, component combinations 
  • significantly complex data representations - inputs and outputs
    • single-inheritance type system
    • stand-off style of annotations
    • references (ability to construct graphs)
    • collections (arrays, lists, maybe others)
  • data representation in transit, data representation in permanent storage (DBs, etc.)
  • interoperability with other frameworks / systems (apps (Lucene), scaleout frameworks (Spark), databases)
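The data-representation points above (single inheritance, references, collections) can be sketched in a few lines. This is only an illustration under assumed names: `TypeDef`, `TypeSystem`, and the example types are hypothetical and are not the UIMA API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TypeDef:
    name: str
    parent: Optional[str] = None  # single inheritance: at most one supertype
    # Feature name -> range type. A range may be a primitive, a reference
    # to another type, or a collection such as "array<Token>".
    features: Dict[str, str] = field(default_factory=dict)


class TypeSystem:
    def __init__(self):
        self.types = {}

    def add(self, type_def):
        if type_def.parent is not None and type_def.parent not in self.types:
            raise ValueError(f"unknown parent type: {type_def.parent}")
        self.types[type_def.name] = type_def

    def is_subtype(self, name, ancestor):
        """Walk the single parent chain from name up to the root."""
        current = name
        while current is not None:
            if current == ancestor:
                return True
            current = self.types[current].parent
        return False

    def all_features(self, name):
        """Merge inherited features, root first, so subtypes can add more."""
        chain = []
        current = name
        while current is not None:
            chain.append(self.types[current])
            current = self.types[current].parent
        merged = {}
        for type_def in reversed(chain):
            merged.update(type_def.features)
        return merged


ts = TypeSystem()
ts.add(TypeDef("Annotation", features={"begin": "int", "end": "int"}))
ts.add(TypeDef("Token", parent="Annotation", features={"pos": "string"}))
ts.add(TypeDef("Sentence", parent="Annotation",
               features={"tokens": "array<Token>"}))
```

Here `Sentence.tokens` is a collection of references to `Token` instances, which is what allows graphs to be built on top of the stand-off offsets.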

Less suited for:

  • simple computations
  • little reuse
  • trivial data representations

Using the web better for data typing

Type systems could be instantiated as web objects, and an ecosystem built around these.

  • Imagine a Maven-style repository of type systems, with versioning
  • Visiting the website in a browser could return HTML documentation for the type system
  • Querying the website through a REST API could return the type system metadata
  • This could be a strong element for enabling data reuse
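The repository idea can be sketched as an in-memory registry that stores immutable versions and returns either HTML or JSON metadata for the same resource. Everything here is hypothetical: the `TypeSystemRepository` class, the `accept` parameter (standing in for an HTTP Accept header), and the example names. No such service is implied to exist.

```python
import json


class TypeSystemRepository:
    """Toy stand-in for a versioned, web-hosted type system repository."""

    def __init__(self):
        self._store = {}  # (name, version) -> metadata dict

    def publish(self, name, version, metadata):
        key = (name, version)
        if key in self._store:
            # Like released Maven artifacts, a published version never changes.
            raise ValueError(f"{name} {version} is already published")
        self._store[key] = metadata

    def versions(self, name):
        """List versions in numeric order, so that 1.10.0 follows 1.2.0."""
        found = [v for (n, v) in self._store if n == name]
        return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))

    def get(self, name, version, accept="text/html"):
        metadata = self._store[(name, version)]
        if accept == "application/json":
            # Machine-readable metadata, as a REST client would request it.
            return json.dumps(metadata, sort_keys=True)
        # Human-readable documentation, as a browser would request it.
        items = "".join(f"<li>{t}</li>" for t in sorted(metadata["types"]))
        return f"<h1>{name} {version}</h1><ul>{items}</ul>"


repo = TypeSystemRepository()
repo.publish("example-types", "1.0.0", {"types": ["Token"]})
repo.publish("example-types", "1.10.0", {"types": ["Token", "Sentence", "Entity"]})
repo.publish("example-types", "1.2.0", {"types": ["Token", "Sentence"]})

html_docs = repo.get("example-types", "1.2.0")
metadata_json = repo.get("example-types", "1.2.0", accept="application/json")
```

Serving both forms from one address is the design choice of interest: people and tools refer to the same versioned type system by the same name.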

Existing type systems could be used, e.g. schema.org.

 
