- Motivation
- Public Interfaces
- Proposed Changes
- Compatibility, Deprecation, and Migration Plan
- Test Plan
- Rejected Alternatives
- Risk
Motivation
The current JDBC connector unsupport reading multiple tables, but user cases indicate that this feature is needed. Missing this feature leads to the following problems:
- a table requires single read task, and management and maintenance are very complex.
- resource waste
The purpose of this project is to solve these two main problems.
Public Interfaces
Rely on the public interface provided by Catalog API to publish metadata of multiple tables.
Proposed Changes
Single table reading supports table_path
option replacement/enhancement query
Single table reading supports table_path
option replacement/enhancement query
Added table_path
optional option, used to specify the table path to be read, and use table_path
to automatically generate query
.
Of course, the query
parameter is still retained. On the one hand, it is compatible with the use of older versions, and on the other hand, the query can be customized in special situations such as complex query scenarios.
The combination of table_path
and query
is as follows:
- Only configure
query
: keep the current logic unchanged, automatically generatetable_path
tonull.null.default
- Only configure
table_path
: automatically generatequery
asselect * from xxx
, automatically obtain primary key/unique key as partition_column - Configure
table_path
andquery
: check the query is a complex sql. If it is not a complex sql, usetable_path
to get metadata (primary key/unique key/partition_column)
After the STIP:
source{
Jdbc {
url = "jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2b8"
driver = "com.mysql.cj.jdbc.Driver"
connection_check_timeout_sec = 100
user = "root"
password = "******"
table_path = 'test.test-1'
partition_column = "id"
partition_num = 10
partition_lower_bound = 1
partition_upper_bound = 500
}
}
Supports multi-table configuration reading
Add the configuration parameter table-list
, the configuration structure is list, in which each element is the configuration of a table, and the configuration items of each table are consistent with those of a single table.
example
source{
Jdbc {
url = "jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2b8"
driver = "com.mysql.cj.jdbc.Driver"
connection_check_timeout_sec = 100
user = "root"
password = "******"
table-list = [
{
table_path = 'test-1'
query = "select * from test-1"
partition_column = "id"
},
{
table_path = 'test-2'
query = "select * from test-2"
partition_column = "id"
}
]
}
}
JDBC multiple database adaptation
Since JDBC internally contains processing of multiple databases, table_path parsing needs to be adapted by the database dialect.
Compatibility, Deprecation, and Migration Plan
Compatibility
- It does not affect single table reading and remains compatible with users of older versions.
Deprecation
Migration Plan
- Compatibility upgrade, users can upgrade as needed
Test Plan
Test coverage scenarios
- Connector:JDBC single table reading
- Connector:JDBC multiple table reading
Rejected Alternatives
Risk
Missing metadata in complex query scenarios
If you use complex query SQL, such as Join/Union, the primary key information may be missing when Sink automatically creates tables.
The current JDBC connector unsupport reading multiple tables, but user cases indicate that this feature is needed. Missing this feature leads to the following problems: