State
[progress record] :
Proposed time : 2022/03/01
Discussion time : 2022/04/01 (preliminary discussion of the proposal in the community development WeChat group)
Acceptance/Rejection time : 2022/04/30
Completion time : 2022/05/21
[issue] :
[email] : At present, discussions must be initiated in the WeChat group [Apache Linkis Community Development group]; the discussion minutes can then be sent to the official Linkis dev mailing list
[release] : linkis 1.1.0
[proposer]:
Motivation & Background
As a computing middleware, Linkis provides a unified entry point for data computation and is responsible for connecting to the underlying data. In some application scenarios, upper-layer applications need basic metadata, such as databases and tables, before they can perform subsequent operations. Currently, Linkis supports metadata queries only for Hive, and only through the single MySQL database configured to hold the Hive metadata. Metadata queries against multiple databases, or against non-MySQL databases, are not supported.
To support metadata queries across multiple data sources, this proposal adds management of the configuration needed to connect to each database, together with metadata queries for different types of data sources.
Basic concepts
● Data source: a database service that provides data storage, such as mysql/hive/kafka. A data source definition holds the configuration needed to connect to the actual database, mainly the connection address, user authentication information, and connection parameters.
● Metadata: here, specifically the metadata of a database, that is, the data that describes the structure of the data and of the various objects in the database. Examples are database names, table names, column names, field lengths, and types.
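The two concepts can be pictured as plain records. The following is a minimal sketch for illustration only: the class and field names (`DataSource`, `ColumnMeta`, `TableMeta`, `connect_params`) and all values are hypothetical and are not taken from the Linkis code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataSource:
    """Connection configuration for one database service (hypothetical shape)."""
    name: str
    type: str                      # e.g. "mysql", "hive", "kafka"
    address: str                   # connection address, e.g. "host:port"
    username: str                  # user authentication information
    password: str
    connect_params: Dict[str, str] = field(default_factory=dict)


@dataclass
class ColumnMeta:
    """Metadata for one column: name, type, and field length."""
    name: str
    type: str
    length: int


@dataclass
class TableMeta:
    """Metadata for one table inside a database."""
    database: str
    table: str
    columns: List[ColumnMeta] = field(default_factory=list)


# Illustrative values only.
source = DataSource("orders-db", "mysql", "127.0.0.1:3306", "reader", "secret",
                    {"useSSL": "false"})
orders = TableMeta("shop", "orders", [ColumnMeta("id", "bigint", 20),
                                      ColumnMeta("status", "varchar", 32)])
print(source.type, [column.name for column in orders.columns])
```

The data source record carries only what is needed to connect; the metadata records describe structure and hold no row data.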
Expected goals
● Manage the configuration of different types of data sources through the Linkis management console (create / modify / switch version / set expired)
● Version-control data source configuration and test basic connectivity through the Linkis management console
● Provide an HTTP interface to query the basic metadata of a data source, given the data source identifier and other parameters
● Provide a Java SDK to query the basic metadata of a data source, given the data source identifier and other parameters
● Support only queries of the basic metadata of the database behind a data source; modifying metadata and other change operations are out of scope
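The management operations in the goals above (create, modify, switch version, set expired) can be sketched with a small in-memory registry. This is a sketch under stated assumptions, not the Linkis implementation: `DataSourceRegistry` and its method names are hypothetical, and the rule that a modification is stored as a new version without being published automatically is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VersionedDataSource:
    """One data source with a history of connection configurations."""
    name: str
    versions: List[Dict[str, str]] = field(default_factory=list)
    published: int = 0      # 1-based version currently in use
    expired: bool = False


class DataSourceRegistry:
    """Hypothetical in-memory stand-in for the management operations."""

    def __init__(self) -> None:
        self._sources: Dict[str, VersionedDataSource] = {}

    def create(self, name: str, params: Dict[str, str]) -> int:
        if name in self._sources:
            raise ValueError(f"data source {name!r} already exists")
        self._sources[name] = VersionedDataSource(name, [dict(params)], 1)
        return 1

    def modify(self, name: str, params: Dict[str, str]) -> int:
        """Store params as a new version; the published version is unchanged."""
        source = self._sources[name]
        source.versions.append(dict(params))
        return len(source.versions)

    def switch_version(self, name: str, version: int) -> None:
        source = self._sources[name]
        if not 1 <= version <= len(source.versions):
            raise ValueError(f"unknown version {version} for {name!r}")
        source.published = version

    def set_expired(self, name: str) -> None:
        self._sources[name].expired = True

    def current_params(self, name: str) -> Dict[str, str]:
        source = self._sources[name]
        if source.expired:
            raise LookupError(f"data source {name!r} is expired")
        return dict(source.versions[source.published - 1])


registry = DataSourceRegistry()
registry.create("orders-db", {"host": "10.0.0.1"})
v2 = registry.modify("orders-db", {"host": "10.0.0.2"})
registry.switch_version("orders-db", v2)
print(registry.current_params("orders-db"))
```

Keeping every version makes it possible to switch back to an earlier configuration, and an expired data source stays in the registry but can no longer be used for connections.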
Implementation plan
【New】 Data source management service: linkis-datasource-manager-server, the data source management module, service name ps-data-source-manager. It handles the basic management of data sources and exposes HTTP interfaces for creating, querying, modifying, and connection-testing data sources. It also provides an internal RPC service so that the metadata query module can fetch, through RPC calls, the information needed to establish a connection to a database.
【New】 Metadata query service: linkis-metadata-query-server, service name ps-metadata-query. It provides basic query functions for database metadata through an external HTTP interface, plus an internal RPC service that the data source management service calls to run data source connection tests. (In version 1.1.0 the module is named linkis-metadata-manager-server, which is not appropriate; it is to be renamed linkis-metadata-query-server.)
1. The service registers with the Linkis-eureka-Service and is managed together with the other Linkis microservices. Clients reach the data source management service through linkis-gateway-service, using the service name data-source-manager.
2. The interface layer exposes RESTful interfaces that let other applications create, delete, query, and modify data sources and data source environments, test data source connections, manage data source versions, and mark data sources as expired.
3. The service layer mainly manages the database and material library services and persists data-source-related information.
4. Connection tests for data sources are carried out through the linkis metastore server service, which now also provides the corresponding metadata query service.
Changes
No. | Modification | Detail
---|---|---
1 | Modification of maven module | |
2 | Modification of HTTP interface | |
3 | Modification of the client interface | LinkisDataSourceRemoteClient interface, LinkisMetaDataRemoteClient interface |
4 | Modification of database table structure | |
5 | Modification of configuration item | |
6 | Modification of error code | |
7 | Modification of third-party dependencies | |
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
- If we are changing behavior, how will we phase out the older behavior?
- If we require special migration tools, describe them here.
- When will we remove the existing behavior?