Jira: ATLAS-1757 (ASF JIRA)

Apache Atlas uses the JanusGraph graph database at the heart of its metadata repository. The graph captures the relationships between data sources; the data sets they host; the business meaning of the data elements within each data set; the classification of these elements in terms of quality, confidentiality and retention; and who (people and processes) is using them and for which purposes.

The current implementation of the graph database is Titan 0.5. This is a fairly old version of Titan, and there has been some work to support Titan 1.0 by adding a graph abstraction layer. There is still work to do to complete this abstraction layer, particularly in the catalog service, which uses an older version of TinkerPop/Gremlin that is not supported by Titan 1.0.

In the meantime, a new graph initiative called JanusGraph has been spawned from Titan to take the code base forward.

So, what should our graph strategy be? Do we focus on a single graph database and, if so, which one? Or do we allow a range of graph databases that can be chosen depending on the deployment? If we support a range of graph databases, can standard abstraction layers such as Apache TinkerPop be used?

JanusGraph uses a pluggable persistence store to save the metadata content and a pluggable search index to back its search API. Apache Atlas can take advantage of this configurability to support a range of size, scalability and performance requirements.
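
To make the pluggable store and index concrete, the following is a minimal sketch of how a deployment's choice could be read from configuration. It assumes the property names `atlas.graph.storage.backend` and `atlas.graph.index.search.backend` as they appear in Atlas's `atlas-application.properties`; the values `hbase2` and `solr` are only examples, and the class `GraphBackendConfig` is hypothetical and not part of Atlas. Check the names and values against the Atlas version you deploy.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Apache Atlas: reads which storage and
// index backends a deployment has selected for the graph.
public class GraphBackendConfig {

    // Example configuration fragment. The keys are assumed to match
    // atlas-application.properties; the values are illustrative only.
    static final String SAMPLE =
            "atlas.graph.storage.backend=hbase2\n"
          + "atlas.graph.index.search.backend=solr\n";

    // Parses properties text and reports the selected backends.
    // A missing key is reported as "unset" rather than guessed.
    static String describe(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String storage = props.getProperty("atlas.graph.storage.backend", "unset");
        String index = props.getProperty("atlas.graph.index.search.backend", "unset");
        return "storage=" + storage + ", index=" + index;
    }

    public static void main(String[] args) {
        System.out.println(describe(SAMPLE));
    }
}
```

Swapping the persistence store or the search index is then a configuration change rather than a code change, which is what allows one Atlas code base to serve deployments of different sizes.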

ATLAS-1757 (ASF JIRA) is where the discussion about our graph strategy is taking place, and it will be used to coordinate the implementation of whatever is decided. All are welcome.
