...

The goal of this FLIP is to address the shortcomings mentioned above and make the APIs in TableEnvironment & Table clearer and more stable. This FLIP won't support multiline statements, which need more discussion in a further FLIP. (There have been some conclusions, please see the appendix.)

...

TableEnvironment (Java):
interface TableEnvironment {
    /**
     * Executes multiline statements separated by semicolons and returns an Iterator
     * over the TableResults that correspond to the individual statements.
     * Each call to Iterator.next() triggers the execution of the next statement.
     * This allows a caller to decide whether to execute the statements
     * synchronously or asynchronously.
     *
     * @param statements multiline statements separated by semicolons
     * @return an Iterator over the TableResult of each statement
     */
    Iterator<TableResult> executeMultilineSql(String statements);
}

Introduce the `executeMultilineSql` method in TableEnvironment, which returns an `Iterator<TableResult>`; each call to `next()` triggers the submission of the next statement. This allows a caller to decide synchronously when to submit statements asynchronously to the cluster. Thus, a service such as the SQL Client can handle the result of each statement individually and process the script statement by statement, sequentially.
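The lazy-submission contract described above can be illustrated with a minimal, Flink-independent sketch. Everything here is an assumption made for illustration: `LazyStatementIterator` is a hypothetical class, results are plain strings rather than `TableResult`, the `submitter` function stands in for whatever actually executes a statement, and the split on `;` is deliberately naive (it ignores semicolons inside string literals or comments). It is not the proposed implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical stand-in for the Iterator returned by executeMultilineSql. */
final class LazyStatementIterator implements Iterator<String> {
    private final List<String> pending = new ArrayList<>();
    private final Function<String, String> submitter;
    private int position = 0;

    LazyStatementIterator(String statements, Function<String, String> submitter) {
        this.submitter = submitter;
        // Assumption: naive split; semicolons inside literals or comments are not handled.
        for (String part : statements.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                pending.add(trimmed);
            }
        }
    }

    @Override
    public boolean hasNext() {
        return position < pending.size();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // The statement is only submitted here, when the caller asks for it.
        return submitter.apply(pending.get(position++));
    }
}
```

Because nothing is submitted until `next()` is called, a caller such as the SQL Client could inspect each result, and stop on the first failure, before the following statement ever reaches the cluster.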

...

  1. How can the SQL CLI leverage the StatementSet class to obtain optimization?

    We can refer to the design of other systems, such as the Sqlline batch command [9], and introduce a similar command, but we should note that the SQL statements in a batch can only be `insert into`.

  2. How does the SQL CLI parse and execute multiple statements?

    Currently, TableEnvironment does not support multiple statements, but this feature is needed in the SQL CLI because it is natural to execute an external script. I considered providing a parse method like `List<String> parse(String stmt)`, but it is not intuitive and such a method should not belong to the TableEnvironment API. As discussed in the pull requests [5][6], Calcite provides the `SqlNodeList parseSqlStmtList()` method, which parses a list of SQL statements separated by semicolons and constructs a parse tree. I think the SQL CLI can use this method to parse multiple statements and execute each statement one by one through TableEnvironment#executeSql(String statement). One thing we should take care of is that the SQL CLI has some special commands like `help/set/quit` that control the environment’s lifecycle and change the variables of the context. IMO, there are several ways to deal with these commands in multiple statements:
    1. Support these special control commands in flink-sql-parser. The shortcoming is that TableEnvironment would have to take care of those noisy commands, and flink-sql-parser would lose its broader extensibility to other external systems. For example, the SQL CLI may need to support `source xx` to execute an external script; it is not proper for the TableEnvironment parser to see such syntax.
      1.  pros: 
      2.  cons: 
        • many commands are only used by sql-client, e.g. help, quit, source
        • it is unclear how to meet the requirements of non-builtin commands, e.g. commands from flink-sql-gateway
        • not easy to extend; it is more difficult to implement a client-specific command in sql-parser than in the specific client 
    2. The SQL CLI parses those control commands on its own and pre-splits the multiple statements at the control commands. The SQL CLI can then pass each remaining part of the multiple statements to SqlParser and obtain a SqlNodeList. 
      1. pros:
        • sql-parser is cleaner
        • easier to extend for sql-client
      2.  cons: 
    3. Flink already has a `Parser` interface which is exposed by `Planner`. We can add one more method to `Parser`, such as `List<String> splitStatement(String)`, and borrow Calcite to achieve this functionality. However, special client commands (e.g. help, quit, source) are not supported in sql-parser now: SqlParser#parseStmtList returns a SqlNodeList, not a string list, and those special commands are not defined as SqlNodes. So I think this approach is only a complement to the first one.
    4. Provide a utility class that parses a string separated by semicolons into multiple statements.
      1. pros:
        • easier to extend for sql-client
        • can handle corner cases in a unified place
      2. cons:
        • multiple parsers: sql-parser and a utility parser
    5. Use TableEnvironment#executeMultilineSql to support this.

      We will open another FLIP to discuss this.
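As a rough illustration of options 2 and 4 above, a client-side utility could split a script on semicolons and separate client commands from SQL before anything reaches the parser. The sketch below is only that: `StatementSplitter`, `split` and `isClientCommand` are hypothetical names, the command set (`help`, `set`, `quit`, `source`) is taken from the examples in this section, and only single-quoted string literals are handled (comments and backtick-quoted identifiers are not).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical client-side utility; not part of any Flink API. */
final class StatementSplitter {
    private StatementSplitter() {}

    /** Splits a script on semicolons that are outside single-quoted literals. */
    static List<String> split(String script) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < script.length(); i++) {
            char c = script.charAt(i);
            if (c == '\'') {
                // An escaped quote ('') toggles twice, so the state is unchanged.
                inString = !inString;
            }
            if (c == ';' && !inString) {
                addIfNotBlank(result, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotBlank(result, current);
        return result;
    }

    /** Returns true if the statement starts with a client-only command keyword. */
    static boolean isClientCommand(String statement) {
        String first = statement.trim().split("\\s+", 2)[0].toLowerCase(Locale.ROOT);
        return first.equals("help") || first.equals("set")
                || first.equals("quit") || first.equals("source");
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String s = sb.toString().trim();
        if (!s.isEmpty()) {
            out.add(s);
        }
    }
}
```

With such a utility, the client would handle statements for which `isClientCommand` returns true itself and forward the rest, one by one, to TableEnvironment#executeSql. The cost named under option 4 still applies: this is a second, simpler parser living next to sql-parser.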
    Other open questions?