This page was created to gather UIMA requirements from users. Feel free to add your topics here.

Deployment support for uima-as services and pipelines over clusters for processing large amounts of work

Although we have a deployment descriptor, setting it up and tuning it for varying workloads while optimizing for different targets (throughput, latency, recoverability, etc.) is a difficult, manual process.

Improving transparency of UIMA pipeline operations

A lot of statistical information about the operation of UIMA is currently available but difficult to access. This could be addressed by developing a "console" application, perhaps web-based, with just-in-time tutorials, an overview, and drill-down capabilities that would make operations, bottlenecks, trade-offs, etc. more obvious to interested parties.

UIMA Class Loading Extension

This page discusses a suggestion for adding classpath information to a descriptor.
An alternative might be to use other standard and widely adopted approaches for this; I'm thinking that OSGi provides this capability, along with specifying "versions" and enabling the use of repositories.

General API improvements

Improvements of FSList/FSArray management

  • make it easier to add elements
  • make it easier to iterate FSList
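
As a rough illustration of what easier appending and iteration could look like, the sketch below models a head/tail linked list (the shape FSList has) in plain Java and hides the cell handling behind `add()` and a for-each loop. Everything here (`Cell`, `ListView`, `toList`) is hypothetical and not part of the UIMA API; it only shows the kind of convenience being asked for.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a head/tail list; NOT the UIMA FSList class.
final class FsListSketch {

    /** One cell of the list; tail == null marks the end (an assumption of this sketch). */
    static final class Cell<T> {
        final T head;
        Cell<T> tail;

        Cell(T head, Cell<T> tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    /** Wrapper that hides cell handling behind add() and for-each iteration. */
    static final class ListView<T> implements Iterable<T> {
        private Cell<T> first;
        private Cell<T> last;

        /** Appends in constant time by remembering the last cell. */
        void add(T element) {
            Cell<T> cell = new Cell<>(element, null);
            if (first == null) {
                first = cell;
            } else {
                last.tail = cell;
            }
            last = cell;
        }

        @Override
        public Iterator<T> iterator() {
            return new Iterator<T>() {
                private Cell<T> next = first;

                @Override
                public boolean hasNext() {
                    return next != null;
                }

                @Override
                public T next() {
                    if (next == null) {
                        throw new NoSuchElementException();
                    }
                    T value = next.head;
                    next = next.tail;
                    return value;
                }
            };
        }
    }

    /** Copies the elements, in order, into a java.util.List. */
    static <T> List<T> toList(ListView<T> view) {
        List<T> out = new ArrayList<>();
        for (T item : view) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        ListView<String> tokens = new ListView<>();
        tokens.add("UIMA");
        tokens.add("CAS");
        for (String token : tokens) {
            System.out.println(token);
        }
    }
}
```

The design point is that callers never touch cells directly: appending and iterating look the same as for any Java collection.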

More support for collections of CASs

Additional Class for Collection of CASs called "CCAS"

  • CCAS will have a common index for all CASs. Faster techniques for regular-expression-based
    annotation over a collection of documents, using an inverted index, could then be applied to a CCAS.
  • CCAS could have some kind of integration with the Hadoop Distributed File System, making it
    easier to write MapReduce tasks in Hadoop. This could be a step towards integrating UIMA
    with Hadoop.
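
The inverted-index idea in the first bullet can be sketched in plain Java. The `SharedIndex` class and its methods are hypothetical, documents are plain strings rather than CASs, and lookup is by exact lowercase token rather than by regular expression; the point is only that one index over the whole collection narrows down which documents an annotator has to scan.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a common index over a collection of documents; not UIMA API.
final class CcasIndexSketch {

    static final class SharedIndex {
        // token -> ids of the documents that contain it (the "postings")
        private final Map<String, Set<Integer>> postings = new HashMap<>();

        /** Splits the text on non-word characters and records each token for docId. */
        void addDocument(int docId, String text) {
            for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.isEmpty()) {
                    continue; // split() yields an empty first token if the text starts with punctuation
                }
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }

        /** Returns the ids of documents containing the token, or an empty set. */
        Set<Integer> documentsContaining(String token) {
            return postings.getOrDefault(token.toLowerCase(Locale.ROOT), Collections.emptySet());
        }
    }

    public static void main(String[] args) {
        SharedIndex index = new SharedIndex();
        index.addDocument(0, "UIMA processes text.");
        index.addDocument(1, "Hadoop stores text; UIMA annotates it.");
        // Only the documents returned here would need to be scanned by an annotator.
        System.out.println(index.documentsContaining("uima"));
    }
}
```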

Supporting more modularity / interoperability

Conforming to widely adopted standards (e.g., OSGi, Maven)

Versioning of Annotators, TypeSystems

Dependency specifications (including versioning)

Packaging of classpath dependencies (already in PEAR; extend to non-PEAR environments?)

Using repositories of artifacts

  • e.g. Maven or P2 repositories
  • If an artifact is referenced via its "name" and "version", be able to retrieve it from a repository if it is not available locally
  • use a Maven or Maven-like local cache
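
A minimal sketch of the lookup order described above: check a Maven-style local cache first and go to a remote repository only on a miss. The path layout follows the usual Maven repository convention (group id as directories, then artifact id, version, and `artifact-version.jar`). The `ArtifactResolverSketch` class, the `RemoteFetcher` interface, and the example coordinates are hypothetical, and a group id is assumed in addition to the name and version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of "local cache first, repository second"; not an existing UIMA or Maven API.
final class ArtifactResolverSketch {

    /** Hypothetical download hook; a real one would talk to a Maven or P2 repository. */
    interface RemoteFetcher {
        void fetch(String groupId, String artifactId, String version, Path target) throws IOException;
    }

    /** Computes the Maven-style location of the artifact below cacheRoot. */
    static Path cachePath(Path cacheRoot, String groupId, String artifactId, String version) {
        Path dir = cacheRoot;
        for (String part : groupId.split("\\.")) {
            dir = dir.resolve(part);
        }
        return dir.resolve(artifactId).resolve(version).resolve(artifactId + "-" + version + ".jar");
    }

    /** Returns the cached file, calling the fetcher only when the file is not there yet. */
    static Path resolve(Path cacheRoot, String groupId, String artifactId, String version,
                        RemoteFetcher remote) throws IOException {
        Path target = cachePath(cacheRoot, groupId, artifactId, version);
        if (!Files.exists(target)) {
            Files.createDirectories(target.getParent());
            remote.fetch(groupId, artifactId, version, target);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path cacheRoot = Files.createTempDirectory("artifact-cache");
        // Stand-in fetcher that writes a placeholder file instead of downloading anything.
        RemoteFetcher fake = (g, a, v, target) -> Files.write(target, new byte[] {0});
        Path jar = resolve(cacheRoot, "org.example", "demo-annotator", "1.0", fake);
        System.out.println(jar);
    }
}
```

A second call with the same coordinates finds the file in the cache and does not invoke the fetcher again.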

Security: signing of artifacts

Efficient CAS persistent store and loading

Currently we can serialize/deserialize CASes in XMI, XCAS (old), or binary formats.

  • need to search collections of CASes with various kinds of queries
  • might be good to persist in a relational database or in RDF-style tables
  • need to load a subset of a CAS efficiently (e.g., a small subset of a large CAS)