Area 2 provides definitions for the glossary. It enables the definition of meanings and of the relationships between different types of terminology. Most definitions are created manually; however, a definition may be created in a different tool and replicated automatically into other metadata repositories. There can be multiple glossaries in the metadata repositories. Each glossary owns a set of glossary terms and (optionally) a category hierarchy. Glossary terms can be linked into none, one or many categories, from any glossary.
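The ownership and linking rules above can be illustrated with a minimal Python sketch. The class and method names (Glossary, Term, Category, new_term, new_category) and the sample glossary names are made up for illustration and are not the Atlas type definitions; the sketch only assumes that each term and category has one owning glossary and that a term may be linked into categories from any glossary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Category:
    name: str
    home_glossary: str  # name of the glossary that owns this category


@dataclass
class Term:
    name: str
    home_glossary: str  # name of the glossary that owns this term
    # A term may be linked into none, one or many categories.
    categories: List[Category] = field(default_factory=list)


@dataclass
class Glossary:
    name: str
    terms: List[Term] = field(default_factory=list)
    categories: List[Category] = field(default_factory=list)  # optional

    def new_term(self, name: str) -> Term:
        """Create a term owned by this glossary."""
        term = Term(name, self.name)
        self.terms.append(term)
        return term

    def new_category(self, name: str) -> Category:
        """Create a category owned by this glossary."""
        category = Category(name, self.name)
        self.categories.append(category)
        return category


# Hypothetical sample data: two glossaries in the same repository.
finance = Glossary("Finance")
marketing = Glossary("Marketing")

revenue = finance.new_term("Revenue")
kpis = marketing.new_category("KPIs")

# The term stays owned by Finance but is linked into a Marketing category.
revenue.categories.append(kpis)
```

Storing the home glossary as a name rather than an object reference keeps the sketch free of circular references; a real repository would use relationships between entities instead.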
Figure 1 shows the packages for the glossary.
Figure 1: Packages for area 2 - the glossary
Each package will be defined in its own model file, <package-name>.json, and added to the addons/model directory in the Atlas build tree.
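As a rough illustration of the one-file-per-package convention, the following Python sketch writes a skeleton model file. The top-level keys (enumDefs, structDefs, classificationDefs, entityDefs, relationshipDefs) and the entity fields reflect the general shape of Atlas type definition files as an assumption and should be checked against the existing model files in the build tree. The package name "glossary_sketch" is hypothetical, the attribute list is deliberately left empty, and a temporary directory stands in for the real model directory.

```python
import json
import tempfile
from pathlib import Path


def build_model(entity_names):
    """Return a typedef-style document with one bare entity definition per name."""
    return {
        "enumDefs": [],
        "structDefs": [],
        "classificationDefs": [],
        "entityDefs": [
            {
                "name": name,
                "superTypes": [],
                "typeVersion": "1.0",
                "attributeDefs": [],  # attributes intentionally omitted
            }
            for name in entity_names
        ],
        "relationshipDefs": [],
    }


def write_model(package_name, model, model_dir):
    """Write the model to <model_dir>/<package_name>.json and return the path."""
    path = Path(model_dir) / f"{package_name}.json"
    path.write_text(json.dumps(model, indent=2))
    return path


# "glossary_sketch" is a made-up package name; the temporary directory
# stands in for the model directory of a real build tree.
model_dir = tempfile.mkdtemp()
model_path = write_model("glossary_sketch", build_model(["Glossary"]), model_dir)
```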
Glossary Object
An Apache Atlas repository may contain many glossaries, particularly when it is part of a larger enterprise cluster of repositories. Each glossary may come from a specific team or external organization, or it may focus on a particular topic or set of use cases. Figure 2 shows how a glossary is defined.
Figure 2: The glossary object provides the anchor point for the glossary content
The anchor for each glossary is the Glossary object. The classifications associated with the glossary object are used to document the type of vocabulary it contains and its purpose:
These classifications are independent of one another, so a Glossary object may have none, one or all of these classifications attached.
Category Hierarchies
The vocabulary for the glossary is organized into a hierarchy of categories. These categories effectively provide a folder structure for the glossary. Figure 3 shows the definition for a glossary category.
Figure 3: The glossary category and its hierarchy
GlossaryCategory represents a category in a glossary.
CategoryAnchor links each category to exactly one Glossary object, which is its home glossary. If the Glossary object is deleted then so are all of the categories linked to it.
CategoryHierarchyLink is a relationship used to organize categories into a hierarchy, for example to create a structure for a taxonomy. A category may have at most one super-category; however, this super-category may be in a different glossary.
SubjectArea is a classification for a category that indicates that the category represents a subject area.
LibraryCategoryReference provides reference information for how this category corresponds to a category in an external glossary.
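The anchoring and hierarchy rules for categories can be sketched in Python as follows. This is an illustrative model only, not the Atlas implementation: the class layout, the delete_glossary helper and the sample names are hypothetical. The sketch also assumes that when a super-category disappears with its glossary, the link from a surviving sub-category is simply dropped, which the text above does not specify.

```python
class Glossary:
    def __init__(self, name):
        self.name = name


class GlossaryCategory:
    def __init__(self, name, glossary):
        self.name = name
        self.anchor = glossary       # CategoryAnchor: exactly one home glossary
        self.super_category = None   # CategoryHierarchyLink: at most one parent

    def set_super_category(self, parent):
        """Link this category under a parent, which may be in any glossary."""
        self.super_category = parent


def delete_glossary(glossary, all_categories):
    """Return the categories that survive the deletion of `glossary`."""
    # Categories anchored to the deleted glossary are deleted with it.
    survivors = [c for c in all_categories if c.anchor is not glossary]
    for category in survivors:
        parent = category.super_category
        if parent is not None and parent.anchor is glossary:
            # Assumption: the hierarchy link goes away with the deleted parent.
            category.super_category = None
    return survivors


# Hypothetical sample data.
core = Glossary("Core")
sales = Glossary("Sales")

customer = GlossaryCategory("Customer", core)
leads = GlossaryCategory("Leads", sales)
leads.set_super_category(customer)   # super-category in a different glossary

parent_before = leads.super_category.name
remaining = delete_glossary(core, [customer, leads])
```

Holding the parent in a single attribute is what enforces the "at most one super-category" rule here; setting a new parent replaces the old one.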
Terms
The vocabulary for the glossary is documented using terms. Each term represents a concept or short phrase in the vocabulary. Just like a category, a term is owned by a glossary but can be linked into a category from any glossary. Figure 4 shows the glossary term.
Figure 4: Terms
Dictionary
The dictionary model adds some basic term relationships that show how the meanings of different terms are related to one another. Figure 5 shows the dictionary model.
Figure 5: The dictionary model
Spine Objects
The spine object model adds the relationships that enable a glossary to contain the definitions of spine objects, which can be used to control access to data and to guide the design of new data stores and APIs. Figure 6 shows the relationships and classifications used to describe spine objects.
Figure 6: Spine Object Model