[Progress record] :

Proposed time : 2022/03/01

...

Completion time : 2022/05/21

[Issue] : https://github.com/apache/incubator-linkis/issues/xxxxx

[Email] : At present, discussions must be initiated in the WeChat group [Apache Linkis Community Development group]; the discussion minutes can then be sent to the official Linkis dev mailing list.

[Release] : Linkis 1.1.0

...

Document the state by adding a label to the LKIP page with one of “discussion”, “accepted”, “released”, “rejected”.

[proposer]: 

...

Motivation & Background

As computing middleware, Linkis provides a unified entry point for data computing and is responsible for connecting to the underlying data. In some scenarios, upper-layer applications need basic metadata, such as databases and tables, before they can perform subsequent operations. Currently, Linkis supports metadata queries only for Hive, and only through a connection to the single MySQL database configured as the Hive metadata store. Metadata queries across multiple databases, or against non-MySQL databases, are not supported.

To support metadata queries across multiple data sources, this proposal adds to Linkis the management of the configuration information needed to connect to databases, together with metadata queries for different types of data sources.

Basic concept

● Data source: a database service that provides data storage, such as mysql/hive/kafka, is referred to here as a database. A data source definition holds the configuration information for connecting to an actual database, mainly the connection address, user authentication information, and connection parameters.

● Metadata: here this refers only to the metadata of a database, that is, the data that defines the structure of the data and of the various objects in the database. Examples are database names, table names, column names, field lengths, and types.
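To make the two concepts concrete, the following is a minimal Java sketch of how they might be modelled. The class names `DataSourceDefinition` and `ColumnMetadata` and all of their fields are hypothetical and chosen only for illustration; they are not taken from the Linkis code base.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these classes are illustrative and are not the Linkis data model.

/** A data source: the connection configuration for one actual database service. */
final class DataSourceDefinition {
    private final String name;                        // user-chosen name
    private final String type;                        // e.g. "mysql", "hive", "kafka"
    private final Map<String, String> connectParams;  // address, credentials, extra parameters

    DataSourceDefinition(String name, String type, Map<String, String> connectParams) {
        this.name = name;
        this.type = type;
        // Defensive copy so later changes to the caller's map do not leak in.
        this.connectParams = Collections.unmodifiableMap(new LinkedHashMap<>(connectParams));
    }

    String getName() { return name; }
    String getType() { return type; }
    Map<String, String> getConnectParams() { return connectParams; }
}

/** Metadata: describes structure (here, one column), never the stored rows themselves. */
final class ColumnMetadata {
    private final String database;
    private final String table;
    private final String column;
    private final String dataType;
    private final int length;

    ColumnMetadata(String database, String table, String column, String dataType, int length) {
        this.database = database;
        this.table = table;
        this.column = column;
        this.dataType = dataType;
        this.length = length;
    }

    String getDataType() { return dataType; }
    int getLength() { return length; }

    /** Fully qualified name in the form database.table.column. */
    String qualifiedName() { return database + "." + table + "." + column; }
}
```

The split mirrors the definitions above: the data source carries only what is needed to connect, while metadata objects describe what is found after connecting.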

Expected goals

● Manage the configuration information of different types of data sources through the Linkis management console (create/modify/switch version/set expired)

...

● Only queries of basic metadata for the database behind a data source are provided; metadata modification and other change functions are not provided

Implementation plan

【New】 Data source management service: linkis-datasource-manager-server, the data source management module (ps-data-source-manager). It handles the basic management of data sources and exposes HTTP interfaces for creating, querying, modifying, and connection-testing data sources. It also provides an internal RPC service so that the metadata query module can obtain, through RPC calls, the information needed to establish a connection to the database.

...

4. Connection tests of data sources are performed through the linkis metastore server service, which now provides the corresponding metadata query service
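The interaction between the two services can be sketched as follows: the metadata side first asks the data source management side for connection information, then hands the query to an implementation that matches the data source type. This is a minimal sketch under stated assumptions. `ConnectParamsProvider`, `MetadataService`, and `MetadataQueryRouter` are hypothetical names, and the RPC call is represented by a plain Java interface rather than the actual Linkis RPC mechanism.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: all names below are illustrative, not Linkis classes.

/** Stand-in for the internal RPC call to the data source management service. */
interface ConnectParamsProvider {
    String getType(long dataSourceId);
    Map<String, String> getConnectParams(long dataSourceId);
}

/** One implementation per data source type (mysql, hive, kafka, ...). */
interface MetadataService {
    List<String> getDatabases(Map<String, String> connectParams);
}

/** Routes a metadata query to the service registered for the data source's type. */
final class MetadataQueryRouter {
    private final ConnectParamsProvider provider;
    private final Map<String, MetadataService> servicesByType = new HashMap<>();

    MetadataQueryRouter(ConnectParamsProvider provider) {
        this.provider = provider;
    }

    void register(String type, MetadataService service) {
        servicesByType.put(type, service);
    }

    List<String> getDatabases(long dataSourceId) {
        // Step 1: look up the type and connection parameters (the "RPC" step).
        String type = provider.getType(dataSourceId);
        MetadataService service = servicesByType.get(type);
        if (service == null) {
            throw new IllegalArgumentException("No metadata service for type: " + type);
        }
        // Step 2: delegate the query to the type-specific implementation.
        return service.getDatabases(provider.getConnectParams(dataSourceId));
    }
}
```

Keeping the lookup behind an interface means the metadata side never stores credentials itself, which matches the division of responsibilities described in the plan.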

Changes

1. Change of code module

...


Modification Detail

1. Modification of maven module

  • A new module, linkis-datasource-query-common, is added, containing the new datasource data structure, exception class, and tool class
  • A new module, linkis-datasource-query-server, is added to manage data sources. It provides functions such as adding, deleting, querying, modifying, and testing data sources through RESTful interfaces
  • A new module, linkis-metadata-manager-common, is added, containing the metadata data structure, exception class, and tool class
  • A new module, linkis-metadata-manager-server, is added to provide metadata management services and to query metadata for databases, tables, and columns through RESTful interfaces
  • A new module, linkis-metadata-manager-service-es, is added to provide the elasticsearch metadata management service
  • A new module, linkis-metadata-manager-service-hive, is added to provide the metadata query service for hive
  • A new module, linkis-metadata-manager-service-kafka, is added to provide the metadata query service for kafka
  • A new module, linkis-metadata-manager-service-mysql, is added to provide the metadata query service for mysql
  • A new datasource management Java client module, linkis-datasource-client, is added to facilitate datasource management through the SDK

2. Modification of HTTP interface

  • Added the interfaces for querying metadata
  • Added the interfaces for creating, deleting, updating, and querying data sources

3. Modification of the client interface

LinkisDataSourceRemoteClient interface

  • GetAllDataSourceTypesResult getAllDataSourceTypes(GetAllDataSourceTypesAction): queries all data source types
  • QueryDataSourceEnvResult queryDataSourceEnv(QueryDataSourceEnvAction): queries the cluster configurations that can be used by a data source
  • GetInfoByDataSourceIdResult getInfoByDataSourceId(GetInfoByDataSourceIdAction): queries data source information by data source id
  • QueryDataSourceResult queryDataSource(QueryDataSourceAction): queries data sources
  • GetConnectParamsByDataSourceIdResult getConnectParams(GetConnectParamsByDataSourceIdAction): gets the connection configuration parameters
  • CreateDataSourceResult createDataSource(CreateDataSourceAction): creates a data source
  • DataSourceTestConnectResult getDataSourceTestConnect(DataSourceTestConnectAction): tests whether a connection to the data source can be established
  • DeleteDataSourceResult deleteDataSource(DeleteDataSourceAction): deletes a data source
  • ExpireDataSourceResult expireDataSource(ExpireDataSourceAction): sets the data source to the expired state
  • GetDataSourceVersionsResult getDataSourceVersions(GetDataSourceVersionsAction): lists the configuration versions of a data source
  • PublishDataSourceVersionResult publishDataSourceVersion(PublishDataSourceVersionAction): publishes a data source configuration version
  • UpdateDataSourceResult updateDataSource(UpdateDataSourceAction): updates a data source
  • UpdateDataSourceParameterResult updateDataSourceParameter(UpdateDataSourceParameterAction): updates the data source configuration parameters
  • GetKeyTypeDatasourceResult getKeyDefinitionsByType(GetKeyTypeDatasourceAction): queries the configuration properties required by a data source type
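Several of the methods above deal with configuration versions and expiry. The sketch below models one plausible reading of that lifecycle in memory, so the relationship between updating parameters, publishing a version, and expiring a data source is easier to follow. It is an assumption-laden illustration: the class `VersionedDataSource` is hypothetical, it does not call the Linkis client, and the rule that each parameter update creates a new version is assumed rather than stated in this proposal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an in-memory model of the lifecycle, not the Linkis client.
final class VersionedDataSource {
    private final String name;
    private final List<Map<String, String>> versions = new ArrayList<>(); // index 0 holds version 1
    private int publishedVersion = 0;  // 0 means nothing has been published yet
    private boolean expired = false;

    VersionedDataSource(String name) {
        this.name = name;
    }

    String getName() { return name; }
    int versionCount() { return versions.size(); }
    int getPublishedVersion() { return publishedVersion; }

    /** Stores a new parameter set and returns its 1-based version number (assumed behavior). */
    int updateParameter(Map<String, String> connectParams) {
        versions.add(Collections.unmodifiableMap(new HashMap<>(connectParams)));
        return versions.size();
    }

    /** Makes an existing version the one that consumers will read. */
    void publishVersion(int version) {
        if (version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("Unknown version: " + version);
        }
        publishedVersion = version;
    }

    /** Marks the data source as expired; it can no longer be used for connections. */
    void expire() {
        expired = true;
    }

    /** Returns the published parameters; fails if expired or if nothing is published. */
    Map<String, String> getConnectParams() {
        if (expired) {
            throw new IllegalStateException("Data source is expired: " + name);
        }
        if (publishedVersion == 0) {
            throw new IllegalStateException("No published version for: " + name);
        }
        return versions.get(publishedVersion - 1);
    }
}
```

Under this reading, switching versions is just publishing an older version number again, so earlier configurations never need to be re-entered.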

LinkisMetaDataRemoteClient interface

  • MetadataGetDatabasesResult getDatabases(MetadataGetDatabasesAction): queries the database list
  • MetadataGetTablesResult getTables(MetadataGetTablesAction): queries the tables
  • MetadataGetTablePropsResult getTableProps(MetadataGetTablePropsAction): queries the table properties
  • MetadataGetPartitionsResult getPartitions(MetadataGetPartitionsAction): queries the partitions of a table
  • MetadataGetColumnsResult getColumns(MetadataGetColumnsAction): queries the columns of a table

4. Modification of database table structure

  • No existing tables are modified
  • The new table structure is as follows:

[Image: new table structure]

5. Modification of configuration item
6. Modification of error code
7. Modification of third-party dependencies

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?
  • If we are changing behavior, how will we phase out the older behavior?
  • If we require special migration tools, describe them here.
  • When will we remove the existing behavior?