...

  • Network and security parameters for accessing the data resources are managed in the open metadata repository as part of a named connection.  The application need only supply the name of the connection and, provided it has the appropriate security credentials, a connector is returned to it for use.
    • There is no need to hard code user ids and passwords in the application code, nor to manage keystores for this sensitive information, since the open metadata and governance server handles this.
    • If the location of the data changes, then the named connection configuration is changed in the open metadata repository and the application will be connected to the new location the next time it requests a connector.
  • The OCF connector provides two sets of APIs.  The first set provides access to the data resource and the second set provides access to the metadata that the open metadata repository has about the data resource.  This provides applications and tools with a simple mechanism to make use of metadata as they process a data resource.  This is particularly useful for data science tools where the metadata can help guide the end user in the use of the data resource.
  • OCF connectors are not limited to representing data resources as they are physically implemented.  An OCF connector can represent a simplified logical (virtual) data resource that is designed for the needs of a specific application or tool.  This type of connector delegates the requests it receives to one or more physical data resources.  The virtual data connector is an example of this type of connector.
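
The delegation described in the last bullet, together with the idea of one connector exposing both data and metadata, can be sketched in Java as follows.  This is a minimal illustration under stated assumptions, not the OCF API: the `Connector` interface, its `readRecords` and `assetProperties` methods, and the `PhysicalConnector` and `VirtualConnector` classes are hypothetical names invented for this example, and the physical data resources are simulated with in-memory lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are the real OCF interfaces.
final class VirtualConnectorSketch {

    /** Illustrative connector with one data method and one metadata method. */
    interface Connector {
        List<String> readRecords();             // access to the data resource
        Map<String, String> assetProperties();  // access to metadata about the data resource
    }

    /** Connector for a single physical data resource, simulated by an in-memory list. */
    static final class PhysicalConnector implements Connector {
        private final String assetName;
        private final List<String> records;

        PhysicalConnector(String assetName, List<String> records) {
            this.assetName = assetName;
            this.records = List.copyOf(records);
        }

        @Override
        public List<String> readRecords() {
            return records;
        }

        @Override
        public Map<String, String> assetProperties() {
            return Map.of("name", assetName);
        }
    }

    /** Virtual connector that delegates each read to one or more physical connectors. */
    static final class VirtualConnector implements Connector {
        private final String assetName;
        private final List<Connector> delegates;

        VirtualConnector(String assetName, List<Connector> delegates) {
            this.assetName = assetName;
            this.delegates = List.copyOf(delegates);
        }

        @Override
        public List<String> readRecords() {
            // Concatenate the records of every delegate, in the order they were supplied.
            List<String> combined = new ArrayList<>();
            for (Connector delegate : delegates) {
                combined.addAll(delegate.readRecords());
            }
            return combined;
        }

        @Override
        public Map<String, String> assetProperties() {
            return Map.of("name", assetName,
                          "delegateCount", String.valueOf(delegates.size()));
        }
    }

    public static void main(String[] args) {
        Connector customers = new PhysicalConnector("customers", List.of("alice", "bob"));
        Connector suppliers = new PhysicalConnector("suppliers", List.of("acme"));
        Connector parties = new VirtualConnector("parties", List.of(customers, suppliers));

        System.out.println(parties.readRecords());
        System.out.println(parties.assetProperties().get("name"));
    }
}
```

Because the virtual connector implements the same interface as the physical ones, an application written against `Connector` does not need to know whether it is talking to one data resource or several.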

...

  • There are many existing connectors and connector frameworks in the industry today.  It is important that these connectors can be incorporated into the OCF.  Thus the OCF includes placeholders for interface definitions that can be used as adapters to external connector providers and connectors.
  • Application developers will only adopt a connector framework if it is easy to use.  Thus the connector interfaces allow for the use of native data APIs to minimize the effort an application developer has to take in order to use the OCF connectors.
  • Governance enforcement is a complex topic, typically managed externally to the application development team.   As a result, a separate framework called the Governance Action Framework (GAF) manages governance enforcement and capabilities such as audit logging.  The role of the OCF is to bridge from the data resource access requests to the GAF.
  • Access to the metadata about a connector and its associated data resource should benefit from the breadth of metadata about the data resource in the open metadata repositories.  Thus there is an Open Metadata Access Service (OMAS) called Connected Asset OMAS that integrates with the OCF and provides metadata to all connectors.


Key Concepts

Connection

The connection is a metadata entity that defines the set of parameters needed to access a specific data resource.  Each connection has a unique name.  An application can request a connector instance from the OCF using the name of a connection.  (See model

...

A connector directory provides a list of related connections.  Connections can belong to multiple connector directories and are not deleted when a connector directory they are linked to is deleted.  A tool may create a connector directory to manage the list of connections it is using.  Administrators set up connector directories to group related connections together for different groups of users.  The connector directories are managed in Apache Atlas through the Connector Directory OMAS.
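
The relationship between connections and connector directories can be illustrated with a small in-memory model.  This is only a sketch under stated assumptions: `DirectoryStore` and all of its methods are hypothetical names and do not represent the Connector Directory OMAS API.  It shows the two rules stated above: a connection may be linked to several directories, and deleting a directory removes the links but not the connections.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: this is not the Connector Directory OMAS API.
final class ConnectorDirectorySketch {

    /** In-memory stand-in for the store of connections and connector directories. */
    static final class DirectoryStore {
        private final Set<String> connectionNames = new LinkedHashSet<>();
        private final Map<String, Set<String>> directories = new LinkedHashMap<>();

        void addConnection(String connectionName) {
            connectionNames.add(connectionName);
        }

        void createDirectory(String directoryName) {
            directories.putIfAbsent(directoryName, new LinkedHashSet<>());
        }

        /** Links an existing connection into a directory; a connection may be linked to many. */
        void link(String directoryName, String connectionName) {
            Set<String> members = directories.get(directoryName);
            if (members == null) {
                throw new IllegalArgumentException("Unknown directory: " + directoryName);
            }
            if (!connectionNames.contains(connectionName)) {
                throw new IllegalArgumentException("Unknown connection: " + connectionName);
            }
            members.add(connectionName);
        }

        /** Removes the directory and its links only; the connections themselves remain. */
        void deleteDirectory(String directoryName) {
            directories.remove(directoryName);
        }

        /** Returns the connections linked to a directory, or an empty list if it does not exist. */
        List<String> listConnections(String directoryName) {
            Set<String> members = directories.get(directoryName);
            if (members == null) {
                return new ArrayList<>();
            }
            return new ArrayList<>(members);
        }

        boolean hasConnection(String connectionName) {
            return connectionNames.contains(connectionName);
        }
    }

    public static void main(String[] args) {
        DirectoryStore store = new DirectoryStore();
        store.addConnection("sales-db");
        store.createDirectory("analytics-team");
        store.createDirectory("reporting-tool");
        store.link("analytics-team", "sales-db");
        store.link("reporting-tool", "sales-db");

        store.deleteDirectory("reporting-tool");
        System.out.println(store.hasConnection("sales-db"));
        System.out.println(store.listConnections("analytics-team"));
    }
}
```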

Connector Directory OMAS

The Connector Directory OMAS is one of the Open Metadata Access Services (OMAS) that manages the configuration for connections and connector directories.

Connector Provider

A connector provider is the factory for a particular type of connector.  The connection stored in Apache Atlas will identify the connector provider. 

...


 

  1. An application requests a connector to the data store by calling the Connector Broker and passing the name of the connection.
  2. The connector broker looks up the connection details in the Open Metadata Repository.
  3. The connection details identify the Connector Provider and the parameters it needs to create a Connector.
  4. A connector is a Java object.  It is returned to the application by the connector broker.
  5. The application is able to access data, metadata and an audit log through the connector.
  6. The connector extracts data from the data store.
  7. The connector extracts metadata from the open metadata repository.  This is managed by the Connected Asset OMAS, which is plugged into the OCF if the OCF is accessed through another OMAS such as the Asset Consumer OMAS.  Connected Asset OMAS knows which asset metadata to return because it is linked to the connection details in the metadata repository (see model 0205 in the Area 2 model).
Figure 1: Open Connector Framework - Overview of Operation
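
Steps 1 to 4 of the flow above can be sketched in Java as follows.  This is a simplified illustration, not the OCF implementation: the `Connection`, `Connector`, `ConnectorProvider`, `ConnectorBroker` and `EchoProvider` types, their methods, and the `location` parameter are all hypothetical.  The open metadata repository is replaced by an in-memory map, and the security credential check and the metadata and audit log access of steps 5 to 7 are left out.

```java
import java.util.Map;

// Hypothetical sketch only: names and signatures are illustrative, not the real OCF API.
final class ConnectorBrokerSketch {

    /** Connection metadata: a unique name, the provider to use, and the provider's parameters. */
    static final class Connection {
        final String name;
        final String providerId;
        final Map<String, String> parameters;

        Connection(String name, String providerId, Map<String, String> parameters) {
            this.name = name;
            this.providerId = providerId;
            this.parameters = Map.copyOf(parameters);
        }
    }

    /** The object handed back to the application. */
    interface Connector {
        String readData();
        String connectionName();
    }

    /** Factory for one type of connector. */
    interface ConnectorProvider {
        Connector createConnector(Connection connection);
    }

    /** Looks up a connection by name and asks the identified provider for a connector. */
    static final class ConnectorBroker {
        private final Map<String, Connection> connectionsByName;       // stands in for the repository
        private final Map<String, ConnectorProvider> providersById;

        ConnectorBroker(Map<String, Connection> connectionsByName,
                        Map<String, ConnectorProvider> providersById) {
            this.connectionsByName = Map.copyOf(connectionsByName);
            this.providersById = Map.copyOf(providersById);
        }

        Connector getConnector(String connectionName) {
            // Step 2: look up the connection details.
            Connection connection = connectionsByName.get(connectionName);
            if (connection == null) {
                throw new IllegalArgumentException("Unknown connection: " + connectionName);
            }
            // Step 3: the connection identifies the connector provider.
            ConnectorProvider provider = providersById.get(connection.providerId);
            if (provider == null) {
                throw new IllegalStateException("No provider registered for: " + connection.providerId);
            }
            // Step 4: the provider creates the connector that is returned to the application.
            return provider.createConnector(connection);
        }
    }

    /** Example provider whose connectors report the location named in the connection. */
    static final class EchoProvider implements ConnectorProvider {
        @Override
        public Connector createConnector(Connection connection) {
            String location = connection.parameters.getOrDefault("location", "unknown");
            return new Connector() {
                @Override
                public String readData() {
                    return "data from " + location;
                }

                @Override
                public String connectionName() {
                    return connection.name;
                }
            };
        }
    }

    public static void main(String[] args) {
        Connection sales = new Connection("sales", "echo", Map.of("location", "host-a"));
        ConnectorBroker broker = new ConnectorBroker(
                Map.of("sales", sales),
                Map.of("echo", new EchoProvider()));

        // Step 1: the application supplies only the connection name.
        Connector connector = broker.getConnector("sales");
        System.out.println(connector.readData());
    }
}
```

The application code only ever mentions the connection name; the location and the choice of provider live in the connection metadata, which is why a change of location does not require a change to the application.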

 

The OCF provides:

  • A Java implementation of the connector broker - there is both a Java and a RESTful API for the connector broker
  • Java APIs for a connector provider and connector instances
  • Java base classes for a connector provider adapter and connector instance adapter.

The OCF is dependent on:

  • Java POJO implementations of the properties about a connected asset
  • the metadata model for the connector directory and connection metadata in the Apache Atlas metadata repository - see Area 2 - Assets and Connectors
  • the APIs to enable the metadata to be maintained and accessed:
    • Connector Directory Open Metadata Access Service (OMAS) API - for managing connection metadata and organizing it into connector directories for convenience and access management
    • Connected Asset OMAS API - for accessing and maintaining metadata about an asset.

Once the OCF is in place, Apache Atlas will provide support for JDBC connectors (see Virtual Data Connector (VDC)) and open metadata repositories (see OMRS Connectors).  Other vendors or open source projects may supply connector providers that are able to create connectors for different types of data assets.

 

...

Scope and value of the OCF to Apache Atlas (and Open Metadata)

The OCF offers a simple but powerful mechanism to intercept requests to access data resources and inject metadata and governance into these requests.  It is designed to embrace existing connector frameworks and support new connector implementations that connect to new types of data resources.  This includes connectors to composite or virtual data resources that delegate to existing physical data resources.

...

  • Local Atlas OMRS Connector – this is the connector to a local Apache Atlas metadata repository.
  • OMRS REST Connector – this is a connector to a remote Apache Atlas repository (or any other metadata repository that supports the OMRS REST APIs).
  • IGC OMRS Connector – this is the connector for IBM’s Information Governance Catalog.
  • Enterprise OMRS Connector – this connector can federate multiple metadata repositories by aggregating the results of calls to their OMRS connectors.
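
The federation performed by the Enterprise OMRS Connector can be pictured with the following Java sketch.  It is an assumption-laden illustration rather than the OMRS API: `RepositoryConnector`, `findEntityNames`, `InMemoryRepositoryConnector` and `EnterpriseConnector` are hypothetical names, each repository is simulated by a list of entity names, and the choice to drop duplicate results while preserving order is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: these types are not the real OMRS connector interfaces.
final class EnterpriseConnectorSketch {

    /** Illustrative repository connector with a single search operation. */
    interface RepositoryConnector {
        List<String> findEntityNames(String searchText);
    }

    /** Connector to one metadata repository, simulated by an in-memory list of entity names. */
    static final class InMemoryRepositoryConnector implements RepositoryConnector {
        private final List<String> entityNames;

        InMemoryRepositoryConnector(List<String> entityNames) {
            this.entityNames = List.copyOf(entityNames);
        }

        @Override
        public List<String> findEntityNames(String searchText) {
            List<String> matches = new ArrayList<>();
            for (String name : entityNames) {
                if (name.contains(searchText)) {
                    matches.add(name);
                }
            }
            return matches;
        }
    }

    /** Federating connector: calls every member connector and aggregates the results. */
    static final class EnterpriseConnector implements RepositoryConnector {
        private final List<RepositoryConnector> members;

        EnterpriseConnector(List<RepositoryConnector> members) {
            this.members = List.copyOf(members);
        }

        @Override
        public List<String> findEntityNames(String searchText) {
            // A LinkedHashSet keeps first-seen order and removes duplicates (an assumption here).
            Set<String> merged = new LinkedHashSet<>();
            for (RepositoryConnector member : members) {
                merged.addAll(member.findEntityNames(searchText));
            }
            return new ArrayList<>(merged);
        }
    }

    public static void main(String[] args) {
        RepositoryConnector local = new InMemoryRepositoryConnector(List.of("customer-table", "order-table"));
        RepositoryConnector remote = new InMemoryRepositoryConnector(List.of("customer-table", "customer-file"));
        RepositoryConnector enterprise = new EnterpriseConnector(List.of(local, remote));

        System.out.println(enterprise.findEntityNames("customer"));
    }
}
```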

...

...

... connection they use (set in the OMAS Scope) will determine which implementation of the connector is used and hence which metadata repository or repositories are called.

In the longer term, we will extend this approach to all system resources such as:

  • A TinkerPop connector for the graph database
  • A log connector for the exception, operational lineage, meters and audit logs
  • A keystore connector for the keystore

...

The value of this approach is that it becomes easy to support different types of data stores for Atlas, and many of the connectors developed for the Open Metadata and Governance reference implementation will be useful for other applications.

...