Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).


Motivation


When a Table API program calls table.execute().print(), columns holding long string values are truncated to 30 characters.

Setting the max width with tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width", 100) has no effect.

Here is the example code:

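import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.bridge.scala._

// Field names are assumed; the original snippet does not show the definition of Order.
case class Order(user: Long, product: String, amount: Int)

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)
tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width", 100)
val orderA = env.fromCollection(Seq(Order(1L, "beer", 3), Order(1L, "diaper--.diaper-.diaper-.diaper--.", 4), Order(3L, "rubber", 2))).toTable(tEnv)
orderA.execute().print()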

 

Users want to configure the max display column width when:

  • using the CLI
  • using the Table API
  • running a job in batch execution mode
  • running a job in streaming execution mode

Current Status

Currently, only one ConfigOption, SqlClientOptions.DISPLAY_MAX_COLUMN_WIDTH ('sql-client.display.max-column-width'), can be used to configure the value, and it only works in one very specific case. Developers need to understand exactly how it behaves before they can use it correctly.


Job running in batch execution mode with SqlClient

It is not configurable. PrintStyle.DEFAULT_MAX_COLUMN_WIDTH is always used. [1]

The existing test does not cover text that is longer than PrintStyle.DEFAULT_MAX_COLUMN_WIDTH, i.e. 30 characters. [2]
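To make the truncation behaviour concrete, here is a minimal, self-contained sketch. It is not Flink's actual implementation: the class ColumnTruncation and the method truncate are hypothetical, and the sketch assumes that the "..." suffix counts towards the maximum width and that every character occupies one display cell.

```java
// Hypothetical helper for illustration only; not part of Flink.
final class ColumnTruncation {
    // The default width of 30 is taken from the text above.
    static final int DEFAULT_MAX_COLUMN_WIDTH = 30;

    /**
     * Returns the text unchanged if it fits into maxWidth characters.
     * Otherwise cuts it so that the result, including the trailing "...",
     * is exactly maxWidth characters long (assumption: the ellipsis is
     * part of the width budget).
     */
    static String truncate(String text, int maxWidth) {
        if (maxWidth < 3) {
            throw new IllegalArgumentException("maxWidth must be at least 3");
        }
        if (text.length() <= maxWidth) {
            return text;
        }
        return text.substring(0, maxWidth - 3) + "...";
    }

    public static void main(String[] args) {
        String product = "diaper--.diaper-.diaper-.diaper--.";
        // With the hard-coded default the 34-character value is cut.
        System.out.println(truncate(product, DEFAULT_MAX_COLUMN_WIDTH));
        // With a larger limit the value is printed in full.
        System.out.println(truncate(product, 100));
    }
}
```

A test for the uncovered case would use an input longer than 30 characters, such as the 34-character product name from the motivating example.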


Job running in streaming execution mode with SqlClient

Currently, the max column display width can be configured in the CLI only for jobs running in streaming execution mode:

SET 'sql-client.display.max-column-width' = '40';

The setting has no effect on jobs running in batch execution mode. There is a bug in the documentation [3].


Job running in batch execution mode with Table API

It is not configurable. By default, PrintStyle.DEFAULT_MAX_COLUMN_WIDTH is used. [4]

 

Job running in streaming execution mode with Table API

Same as in batch execution mode, it is not configurable. By default, PrintStyle.DEFAULT_MAX_COLUMN_WIDTH is used. [4]

Conceptually, there is no difference between printing the table results of jobs running in batch and in streaming execution mode: in both cases the snapshot at the time of the call is printed. [5]

Summary

The cases described above combine two orthogonal dimensions: the interface (SqlClient or Table API) and the execution mode (batch or streaming). The following table summarizes the current behaviour for each combination:




| Interface | Execution mode | sql-client.display.max-column-width, default value is 30 |
| sqlclient | Streaming | Text longer than the value will be truncated and replaced with “...” |
| sqlclient | Batch | No effect. The default value 30 is hard coded. Text longer than 30 will be truncated and replaced with “...” |
| Table API | Streaming | No effect. The default value 30 is hard coded. Text longer than 30 will be truncated and replaced with “...” |
| Table API | Batch | No effect. The default value 30 is hard coded. Text longer than 30 will be truncated and replaced with “...” |



Proposed Changes


New Display ConfigOption for Table API and SqlClient in both batch and streaming execution modes


Introduce a new ConfigOption DISPLAY_MAX_COLUMN_WIDTH (table.display.max-column-width) in the TableConfigOptions class and use it when printing table results via both the Table API and SqlClient.

Deprecate sql-client.display.max-column-width

SqlClient calls the Table API underneath, and the two are used in separate scenarios, independently of each other. It is therefore reasonable to keep one central configuration, which is also easier for users, who then only need to take care of a single ConfigOption for display management.

During the migration phase, while sql-client.display.max-column-width is deprecated, any change made through sql-client.display.max-column-width will be delegated to table.display.max-column-width.
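The delegation during the migration phase can be pictured as a key lookup with a fallback. The sketch below is self-contained and does not use Flink classes: MaxColumnWidthResolver and resolve are hypothetical names, the configuration is modelled as a plain Map, and the precedence (the new key wins when both keys are set) is an assumption that this proposal does not spell out. In Flink itself, ConfigOption offers withDeprecatedKeys for this kind of fallback, which may be one way to realize it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver for illustration only; not part of Flink.
final class MaxColumnWidthResolver {
    static final String NEW_KEY = "table.display.max-column-width";
    static final String DEPRECATED_KEY = "sql-client.display.max-column-width";
    static final int DEFAULT_WIDTH = 30;

    /**
     * Resolves the effective max column width.
     * Assumed precedence: new key, then deprecated key, then the default.
     */
    static int resolve(Map<String, String> conf) {
        String raw = conf.get(NEW_KEY);
        if (raw == null) {
            // Values set through the deprecated key are still honoured.
            raw = conf.get(DEPRECATED_KEY);
        }
        return raw == null ? DEFAULT_WIDTH : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolve(conf)); // nothing set: default applies

        conf.put(DEPRECATED_KEY, "40");
        System.out.println(resolve(conf)); // deprecated key is delegated

        conf.put(NEW_KEY, "100");
        System.out.println(resolve(conf)); // new key takes precedence
    }
}
```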

Enable Display ConfigOption for Job running in Batch execution mode with SqlClient

Changing the value of “table.display.max-column-width” will apply to jobs running in batch as well as in streaming execution mode.

In batch execution mode the actual maximum length of each column could be derived from the data, because all rows are available when the result is printed. However, since the text might still be too long to display, a configurable max column width for display is required in batch mode as well.
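The interplay between the data-derived length and the configured limit in batch mode can be sketched as follows. This is a simplified illustration under stated assumptions, not Flink's implementation: BatchColumnWidth and displayWidth are hypothetical names, only the cell values are considered (a real printer would also account for the column header), and each character is assumed to occupy one display cell.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Flink.
final class BatchColumnWidth {
    /**
     * In batch mode all rows are known, so the longest value of a column
     * can be measured. The display width is that length, capped by the
     * configured maximum.
     */
    static int displayWidth(List<String> columnValues, int maxColumnWidth) {
        int longest = 0;
        for (String value : columnValues) {
            longest = Math.max(longest, value.length());
        }
        return Math.min(longest, maxColumnWidth);
    }

    public static void main(String[] args) {
        List<String> products =
                Arrays.asList("beer", "diaper--.diaper-.diaper-.diaper--.", "rubber");
        // The longest value has 34 characters, so a limit of 30 caps the column.
        System.out.println(displayWidth(products, 30));
        // A limit of 100 leaves room for the full value.
        System.out.println(displayWidth(products, 100));
    }
}
```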

Summary


After the proposed changes, the big picture looks like the following tables. During the migration phase, while sql-client.display.max-column-width is marked as deprecated:



| Interface | Execution mode | Deprecated: sql-client.display.max-column-width, default value is 30 | table.display.max-column-width, default value is 30 |
| sqlclient | Streaming | Changes will be forwarded to table.display.max-column-width | Text longer than the value will be truncated and replaced with “...” |
| sqlclient | Batch | Changes will be forwarded to table.display.max-column-width | Text longer than the value will be truncated and replaced with “...” |
| Table API | Streaming | No effect. table.display.max-column-width with the default value 30 will be used | Text longer than the value will be truncated and replaced with “...” |
| Table API | Batch | No effect. table.display.max-column-width with the default value 30 will be used | Text longer than the value will be truncated and replaced with “...” |


After removing sql-client.display.max-column-width, there will be only one configuration for users:



| Interface | Execution mode | table.display.max-column-width, default value is 30 |
| sqlclient | Streaming | Text longer than the value will be truncated and replaced with “...” |
| sqlclient | Batch | Text longer than the value will be truncated and replaced with “...” |
| Table API | Streaming | Text longer than the value will be truncated and replaced with “...” |
| Table API | Batch | Text longer than the value will be truncated and replaced with “...” |



Compatibility, Deprecation, and Migration Plan


sql-client.display.max-column-width will be deprecated but will continue to work during the deprecation phase. Users are advised to migrate by replacing sql-client.display.max-column-width with table.display.max-column-width.

Test Plan

New tests will be added to the related ITCases.

Rejected Alternatives

Different Display ConfigOption for Table API and SqlClient


In this alternative, sql-client.display.max-column-width would not be deprecated. Two configurations would be used in two scenarios: sql-client.display.max-column-width in the CLI and table.display.max-column-width with the Table API. There are several options for keeping the two configurations in sync.

Option 1, no synchronization at all. The two configurations are used independently of each other.

Option 2, unidirectional sync. One configuration has higher priority. Since SqlClient wraps the Table API, the Table API setting table.display.max-column-width should be used as the fallback for the SqlClient ConfigOption sql-client.display.max-column-width. Changes made to table.display.max-column-width will be delegated to sql-client.display.max-column-width, i.e. to SqlClient. Conversely, an on-the-fly update of sql-client.display.max-column-width will NOT be forwarded back to table.display.max-column-width, because the Table API sits at a lower level than SqlClient, and changes in a high-level module should have no effect on a low-level module.

Option 3, bidirectional sync, i.e. both configurations always hold the same value. Any change to one configuration is forwarded to the other.

This alternative is rejected for two reasons. First, it would confuse users, who would need extra effort to understand when to use which configuration. Second, the synchronization logic between the two configurations is tricky to implement, and it is hard to justify: because SqlClient and the Table API are used independently of each other, no use case requires such functionality.


[1] https://github.com/apache/flink/blob/06ba29458536a75f4c8d78f41452b22bf6cc7fe7/flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliTableauResultView.java#L129

[2] https://github.com/apache/flink/blob/06ba29458536a75f4c8d78f41452b22bf6cc7fe7/flink-table/flink-sql-client/src/test/java/org/apache/flink/table/client/cli/CliTableauResultViewTest.java#L149

[3] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sqlclient/#sql-client-display-max-column-width

[4] https://github.com/apache/flink/blob/06ba29458536a75f4c8d78f41452b22bf6cc7fe7/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java#L929

[5] https://github.com/apache/flink/blob/8b4cb7583d509898b09782bbd4fb06388a219efe/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableResultImpl.java#L152
