Versions Compared

Comment: More descriptions, removed UserIdentities model

...

Figure 4: Adding custom properties to any referenceable entity

 

 

User Identities

Most metadata repositories run in a secure mode that requires incoming requests to include the requester’s security credentials.  There is therefore an identifier for each unique logged-on security identity (the userId).  The UserIdentity entity can capture this identity.

 


Figure 5: Understanding the actors working on the metadata and data assets

 

 

Locations

It is important to understand where assets are located to ensure they are properly protected and comply with data sovereignty laws.  The open metadata model allows location information to be captured at many levels of granularity.


Figure 5: Understanding where data assets and services are located

The NestedLocation relationship allows hierarchical grouping of locations to be represented.  Notice that locations can be organized into multiple hierarchies.

The AdjacentLocation relationship links locations that touch one another.

The notion of a location is variable and the classifications PhysicalLocation, SecureLocation and MobileLocation help to clarify the nature of the location.

  • PhysicalLocation means that the location represents a physical place where, for example, Hosts (see 0030 below), servers (see 0040 below) and hence data may be located.  This could be an area of a data center, the building the data center is located in, or even the country where the server/data is located.
  • SecureLocation indicates that there is restricted access to the location.
  • MobileLocation means that the Host (see 0030 below) is mobile.  An example of such a host would be a smartphone or an IoT-enabled vehicle.
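The location relationships and classifications described above can be sketched in a few lines of Python.  This is only an illustration, not the open metadata API: the Location class, its field names, the helper functions and the example place names are all hypothetical.  It shows one location belonging to two hierarchies through NestedLocation, and two neighboring locations linked through AdjacentLocation.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Location:
    """Hypothetical stand-in for a Location entity."""
    name: str
    classifications: set = field(default_factory=set)  # e.g. {"PhysicalLocation"}
    parents: list = field(default_factory=list)        # NestedLocation (may be several)
    adjacent: list = field(default_factory=list)       # AdjacentLocation


def nest(parent: Location, child: Location) -> None:
    """Record a NestedLocation relationship: child sits inside parent."""
    child.parents.append(parent)


def link_adjacent(a: Location, b: Location) -> None:
    """Record an AdjacentLocation relationship in both directions."""
    a.adjacent.append(b)
    b.adjacent.append(a)


def enclosing(location: Location) -> set:
    """Names of every location that contains `location`, across all hierarchies."""
    found = set()
    stack = list(location.parents)
    while stack:
        current = stack.pop()
        if current.name not in found:
            found.add(current.name)
            stack.extend(current.parents)
    return found


# Illustrative data only.
country = Location("Netherlands", {"PhysicalLocation"})
region = Location("EMEA sales region")  # a second, organizational hierarchy
data_center = Location("Amsterdam data center", {"PhysicalLocation", "SecureLocation"})
hall_a = Location("Hall A", {"PhysicalLocation", "SecureLocation"})
hall_b = Location("Hall B", {"PhysicalLocation", "SecureLocation"})

nest(country, data_center)
nest(region, data_center)
nest(data_center, hall_a)
nest(data_center, hall_b)
link_adjacent(hall_a, hall_b)

print(sorted(enclosing(hall_a)))
```

Walking the NestedLocation links upward from Hall A reaches both the country and the sales region, which is the kind of query needed when checking data sovereignty rules for an asset.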

 

Hosts and Platforms

The host and platform metadata entities provide a simple model for the system infrastructure (nodes, computers, etc.) on which data resources are hosted.

Figure 6: Defining the platform that the data assets and services run on

The host can be linked to its location through the HostLocation relationship.

 

Complex Hosts

The complex hosts model handles environments where many nodes act together as a cluster, and where virtualized containers (such as Docker) are used.

Figure 7: Supporting server clusters and server virtualization (server containers)

A HostCluster describes a collection of hosts that together are providing a service.  Clusters are often used to provide horizontal scaling of services.

A VirtualContainer provides the services of a host to the servers deployed on it (see 0040 below).  When the server makes requests for storage, network access, etc., the VirtualContainer delegates the requests to the equivalent services of the actual host it is deployed on.

VirtualContainers can be hosted on other VirtualContainers, but to actually run they must ultimately be deployed onto a real physical Host.
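The rule that a chain of VirtualContainers must end at a physical Host can be sketched as a simple walk down the deployment chain.  The classes below are hypothetical and much simpler than the real entities; treating VirtualContainer as a kind of Host, and the deployed_on field name, are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Host:
    """Hypothetical stand-in for a physical Host entity."""
    name: str


@dataclass(eq=False)
class VirtualContainer(Host):
    """Assumed to be a kind of Host that is itself deployed on another host."""
    deployed_on: Optional[Host] = None


def physical_host(host: Host) -> Host:
    """Follow the deployment chain until a non-virtual Host is reached."""
    seen = set()
    current = host
    while isinstance(current, VirtualContainer):
        if id(current) in seen:
            raise ValueError("deployment chain contains a cycle")
        seen.add(id(current))
        if current.deployed_on is None:
            raise ValueError(f"{current.name} is not deployed on any host")
        current = current.deployed_on
    return current


# Illustrative data only: a container running in a virtual machine on real hardware.
node = Host("rack-12-node-3")
vm = VirtualContainer("vm-1", deployed_on=node)
container = VirtualContainer("docker-1", deployed_on=vm)

print(physical_host(container).name)
```

A container that is not ultimately deployed on a physical Host raises an error, which mirrors the statement that it could not actually run.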

 

 

Servers

Servers describe the middleware servers (such as application servers, data movement engines and database servers) that run on the Hosts.  The server model captures the server's userId.  Most metadata repositories run in a secure mode that requires incoming requests to include the requester’s security credentials, so there is an identifier for each unique logged-on security identity (the userId).  This identity is recorded within specific entities and relationships when they are created or updated.  By storing the user identifier for the server, it is possible to correlate the server with the changes it makes to the metadata (and related data assets).

See model 0310 Actors in Area 3 for details of how user identifiers are correlated with people and teams.

Figure 8: Servers and their connectivity and capabilities

Open metadata may also capture the network endpoint(s) that the server is connected to and the host it is deployed to.

The endpoint defines the parameters needed to connect to the server.  It also features in the Connection model used by applications and tools to call the server.  Thus through the endpoint entity it is possible to link the connection to the underlying server.

Within the server are many capabilities.  These range from full applications (see 0060 below) to security plugins to logging and encryption libraries.  Different organizations and tools can choose the granularity at which the capabilities are captured in order to provide appropriate context to data assets and the decisions made around them.
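As a rough sketch of how the stored userId lets a server be correlated with the metadata changes it makes, consider the following Python.  None of these names come from the open metadata type definitions: Server, Endpoint, MetadataChange and their fields are hypothetical simplifications, and the data is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical: the parameters needed to connect to a server."""
    network_address: str
    protocol: str


@dataclass
class Server:
    """Hypothetical stand-in for a server entity."""
    name: str
    user_id: str                                      # the identity the server runs under
    endpoints: list = field(default_factory=list)
    capabilities: list = field(default_factory=list)  # granularity is up to the organization


@dataclass
class MetadataChange:
    """Hypothetical audit record: who created or updated an entity."""
    entity: str
    updated_by: str


def changes_made_by(server: Server, changes: list) -> list:
    """Select the changes whose recorded userId matches the server's userId."""
    return [change for change in changes if change.updated_by == server.user_id]


# Illustrative data only.
etl = Server(
    name="nightly-etl",
    user_id="etl_svc",
    endpoints=[Endpoint("etl.example.org:9443", "https")],
    capabilities=["data movement engine", "logging library"],
)
audit_log = [
    MetadataChange("CustomerTable", "etl_svc"),
    MetadataChange("GlossaryTerm:Customer", "jane.steward"),
    MetadataChange("OrdersTable", "etl_svc"),
]

print([change.entity for change in changes_made_by(etl, audit_log)])
```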

 

Data Stores and Data Sets

The base model introduced the concept of a data set.  The data store definition shows how the data set relates to the server that it is hosted on.  In addition, some data sets are virtual; that is, they are built up by calling other data sets.  Figure 9 shows the data stores and virtual data sets linking to the data set.

 

Figure 9: Data stores hosting data sets
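The relationship between data stores, data sets and virtual data sets can be sketched as follows.  This is a minimal illustration under stated assumptions: the class and field names are hypothetical, a server is represented by its name only, and virtual data sets are assumed not to reference themselves.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class DataSet:
    """Hypothetical stand-in for a data set."""
    name: str


@dataclass(eq=False)
class VirtualDataSet(DataSet):
    """A data set built up by calling other data sets."""
    sources: list = field(default_factory=list)


@dataclass(eq=False)
class DataStore:
    """Hypothetical: a store on a server that hosts one or more data sets."""
    name: str
    server: str
    data_sets: list = field(default_factory=list)


def stores_behind(data_set: DataSet, stores: list) -> set:
    """Names of the data stores that ultimately supply `data_set`."""
    if isinstance(data_set, VirtualDataSet):
        names = set()
        for source in data_set.sources:
            names |= stores_behind(source, stores)
        return names
    return {s.name for s in stores if any(d is data_set for d in s.data_sets)}


# Illustrative data only.
orders = DataSet("orders")
customers = DataSet("customers")
customer_orders = VirtualDataSet("customer_orders", sources=[orders, customers])
stores = [
    DataStore("sales_db", server="db-server-1", data_sets=[orders]),
    DataStore("crm_db", server="db-server-2", data_sets=[customers]),
]

print(sorted(stores_behind(customer_orders, stores)))
```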

 

...

Applications provide business or management logic.  They are often custom built but may also be bought as a package.  They are deployed onto a server.  Some applications are written to support specific processes.  Figure 10 shows how applications relate to processes and the servers that host them.

 

Figure 10: Applications and the servers they run on

 

...

The network model for open metadata is very simple; it allows hosts to be grouped into the networks they are connected to.  This can show details such as where hosts are isolated in private networks and where the gateways onto the Internet are.

 

Figure 11: The networks that specific hosts connect to

 

...

The cloud platforms and services model shows that cloud computing is not so different from what we have been doing before.  Cloud infrastructure and services are classified as such to show that the organization is not completely in control of the technology supporting its data and processes.

 

Figure 12: Cloud platforms and services

The cloud provider is the organization that provides and runs the infrastructure for a cloud service.  Typically the host it offers is actually a HostCluster. 

The cloud provider may offer infrastructure as a service (IaaS), in which case, an organization can deploy VirtualContainers onto the cloud provider's HostCluster (see model 0035 above).

If the cloud provider is offering platform as a service (PaaS), an application may deploy server capability onto the cloud platform.

If the cloud provider is offering software as a service (SaaS), then it has provided APIs backed by pre-deployed server capability that an organization can call as a cloud service.
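The three service models differ mainly in what the consuming organization deploys onto the provider's infrastructure.  The lookup below restates the paragraphs above as data; the enum and dictionary names are hypothetical and are not part of the open metadata types.

```python
from enum import Enum
from typing import Optional


class ServiceModel(Enum):
    IAAS = "infrastructure as a service"
    PAAS = "platform as a service"
    SAAS = "software as a service"


# What the consumer deploys under each model, as described in the text above.
CONSUMER_DEPLOYS = {
    ServiceModel.IAAS: "VirtualContainer",   # deployed onto the provider's HostCluster
    ServiceModel.PAAS: "server capability",  # deployed onto the cloud platform
    ServiceModel.SAAS: None,                 # nothing deployed; the provider's APIs are called
}


def consumer_deploys(model: ServiceModel) -> Optional[str]:
    """Return what the consuming organization deploys, or None if it only calls APIs."""
    return CONSUMER_DEPLOYS[model]


for model in ServiceModel:
    print(model.name, "->", consumer_deploys(model))
```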

 

 

 

 

 

...