Comment: Asset Apache Atlas as reference implementation of Egeria

...

Delivering this capability as open source is a critical part of the project since multiple vendors must buy into this ecosystem.  They are not going to do this if one organization dominates the technology base.  Thus the open metadata and governance technology must be freely available with an open source governance model that allows a community of organizations and practitioners to develop and evolve the base and then use it in their offerings and deployments.

The proposal is to use Apache Atlas as the open source reference implementation for open metadata and governance.  Apache Atlas would support an open metadata and governance compliant repository and provide the adapters and interchange formats that allow other metadata repositories to connect into the ecosystem.

This presentation provides an overview of the project: https://ibm.box.com/s/ofr6eelz39up4wkj9qlihsf55fsh2hft.

 

The open metadata and governance project is divided into the following pieces:

  • Common types for open metadata - these types are built from the Apache Atlas type system and define the types stored in the graph database as well as payloads for notifications and APIs
  • Open Metadata Repository Services (OMRS) - Open metadata repository APIs and notifications to enable metadata repositories to exchange metadata in a peer-to-peer metadata repository cluster.  This capability is referred to as the "metadata highway".
  • Open Metadata Access Services (OMAS) - Consumer-centric APIs and notifications for specific classes of tools and applications.  The OMAS services call the OMRS to access metadata from any open metadata repository.
  • New frameworks to complement the Atlas Hooks and Bridges
    • Open Connector Framework (OCF) - provides factories for connectors with access APIs for data resources and metadata together.  The OMRS is also built as a set of metadata repository connectors and the OMAS services use the OCF to connect to the appropriate OMRS connector.
    • Open Discovery Framework (ODF) - provides management for automated processes and analytics to analyze the content of data resources and update the metadata about them.
    • Governance Action Framework (GAF) - provides audit logging and governance enforcement services for implementing enforcement points in data engines, security managers such as Apache Ranger, and APIs.  It also adds stewardship services for analyzing audit logs and resolving issues identified in exceptions raised by the enforcement services.
  • Open Metadata Graph Repository - A set of stores linked together with a graph database.  These stores provide linkage between business, technical and operational metadata along with logs for auditing, operational lineage, metering and exception management.
  • Open Lineage Services - Services for collecting and querying lineage information across multiple heterogeneous metadata repositories
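
The layering described above (an OMAS calling the OMRS through connectors obtained from the OCF) can be illustrated with a minimal sketch. This is not the Atlas or Egeria API: every class, method and repository name below is hypothetical and chosen only to show the roles, and the in-memory dictionaries stand in for real metadata repositories.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class RepositoryConnector(ABC):
    """Hypothetical stand-in for an OMRS metadata repository connector."""

    @abstractmethod
    def get_entity(self, guid: str) -> Optional[dict]:
        """Return the metadata entity with this identifier, or None."""


class InMemoryRepositoryConnector(RepositoryConnector):
    """Toy repository backed by a dict (assumption: a real connector
    would wrap an actual metadata repository)."""

    def __init__(self, entities: Dict[str, dict]) -> None:
        self._entities = entities

    def get_entity(self, guid: str) -> Optional[dict]:
        return self._entities.get(guid)


class ConnectorFactory:
    """Hypothetical stand-in for the OCF role: creates connectors by name."""

    def __init__(self) -> None:
        self._providers: Dict[str, Callable[[], RepositoryConnector]] = {}

    def register(self, name: str, provider: Callable[[], RepositoryConnector]) -> None:
        self._providers[name] = provider

    def create(self, name: str) -> RepositoryConnector:
        return self._providers[name]()


class AssetLookupService:
    """Hypothetical stand-in for an OMAS: a consumer-centric API that
    reaches repositories only through connectors from the factory."""

    def __init__(self, factory: ConnectorFactory, repository_names: List[str]) -> None:
        self._connectors = [factory.create(name) for name in repository_names]

    def find_asset(self, guid: str) -> Optional[dict]:
        # Ask each connected repository in turn; return the first match.
        for connector in self._connectors:
            entity = connector.get_entity(guid)
            if entity is not None:
                return entity
        return None


def build_demo_service() -> AssetLookupService:
    """Wire two toy repositories behind one access service."""
    factory = ConnectorFactory()
    factory.register("repo-a", lambda: InMemoryRepositoryConnector(
        {"g1": {"name": "customers", "type": "Table"}}))
    factory.register("repo-b", lambda: InMemoryRepositoryConnector(
        {"g2": {"name": "orders", "type": "Table"}}))
    return AssetLookupService(factory, ["repo-a", "repo-b"])
```

The caller of `find_asset` does not know which repository holds the entity. That is the point of the split: tools work against the access service, while the repository-specific behaviour stays behind the connector interface.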

Figure 1 shows how this could look if it were implemented in Apache Atlas.

 

 


Figure 1: Open Metadata and Governance Components implemented in Apache Atlas

 

Currently, there is substantial investment in Apache Atlas, both to add the open metadata and governance features and to drive adoption across the data industry.  The wiki pages about this project represent the current state of the architecture and design.  The linked JIRAs are used to manage code delivery.  The wiki pages are typically ahead of the code because they are used to communicate the design as part of the code development cycle.

 

Integrating into the Open Metadata and Governance Ecosystem

With these frameworks and APIs in place, Apache Atlas becomes the reference implementation for the open metadata and governance APIs and also offers the integration capability for the metadata cluster.  Its function is divided into different packages to allow technology partners to connect into the open metadata and governance ecosystem.  The integration options are described as four patterns: native, caller, adapter and plug-in.

To this end, the open metadata standards and core implementation are being developed through the ODPi Egeria open source project (https://github.com/odpi/egeria).  Egeria provides the APIs, adapters and interchange formats that allow metadata repositories to connect into the ecosystem.  Apache Atlas provides the open source reference implementation for a native open metadata repository.

To understand more about the aims of the Egeria project see: https://github.com/odpi/egeria/blob/master/open-metadata-publication/website/README.md 


...