Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Markdown
[TOC]

## Motivation

The current JDBC connector unsupport reading multiple tables, but user cases indicate that this feature is needed. Missing this feature leads to the following problems:

- a table requires single read task, and management and maintenance are very complex.
- resource waste

The purpose of this project is to solve these two main problems.

## Public Interfaces

Rely on the public interface provided by Catalog API to publish metadata of multiple tables.

## Proposed Changes

### Single table reading supports `table_path` option replacement/enhancement `query` 

Single table reading supports `table_path` option replacement/enhancement `query` 

Added `table_path` optional option, used to specify the table path to be read, and use `table_path` to automatically generate `query`.

Of course, the `query` parameter is still retained. On the one hand, it is compatible with the use of older versions, and on the other hand, the query can be customized in special situations such as complex query scenarios.

The combination of `table_path` and `query` is as follows:
- Only configure `query`: keep the current logic unchanged, automatically generate `table_path` to `null.null.default`
- Only configure `table_path`: automatically generate `query` as `select * from xxx`, automatically obtain primary key/unique key as partition_column
- Configure `table_path` and `query`: check the query is a complex sql. If it is not a complex sql, use `table_path` to get metadata (primary key/unique key/partition_column)


AfterBefore the STIP:STIP:

```
source{
    Jdbc {
        url = "jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2b8"
        driver = "com.mysql.cj.jdbc.Driver"
        connection_check_timeout_sec = 100
        user = "root"
        password = "******123456"

        table_pathquery = 'test.test"select * from test-1'"

        partition_column = "id"
    }
}
```


After the    partition_num = 10STIP:

```
source{
    Jdbc {
        url = "jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2b8"
        driver = "com.mysql.cj.jdbc.Driver"
        partitionconnection_check_lower_bound = 1timeout_sec = 100
        user = "root"
        password = "******"

        partitiontable_upper_boundpath = 500'test.test-1'
    }
}
```

### Supports multi-table configuration reading

Add the configuration parameter `table-list`, the configuration structure is list, in which each element is the configuration of a table, and the configuration items of each table are consistent with those of a single table.

example
```
source{
    Jdbc {
        url = "jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2b8"
        driver = "com.mysql.cj.jdbc.Driver"
        connection_check_timeout_sec = 100
        user = "root"
        password = "******"

        table-list = [
            {
                table_path = 'test-1'

                query = "select * from test-1"
                partition_column = "id"
            },
            {
                table_path = 'test-2'

                query = "select * from test-2"
                partition_column = "id"
            }
        ]
    }
}
```

### JDBC multiple database adaptation

Since JDBC internally contains processing of multiple databases, table_path parsing needs to be adapted by the database dialect.

## Compatibility, Deprecation, and Migration Plan

### Compatibility

- It does not affect single table reading and remains compatible with users of older versions.

### Deprecation

### Migration Plan

- Compatibility upgrade, users can upgrade as needed

## Test Plan

### Test coverage scenarios

- Connector:JDBC single table reading
- Connector:JDBC multiple table reading

## Rejected Alternatives

## Risk

### Missing metadata in complex query scenarios

If you use complex query SQL, such as Join/Union, the primary key information may be missing when Sink automatically creates tables.

...

tables

...

.

...