Status
Current state: "Under Discussion"
Discussion thread: here (https://mail-archives.apache.org/mod_mbox/flink-dev/202110.mbox/%3cCAKQOUBh1DoW3pgJLLbqpbiqwfF2wFBAj1Gj-z0Feq=_bvEiUZg@mail.gmail.com%3e)
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
Released: <Flink Version>
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Flink SQL is a great tool however sometimes it takes time lots of time to understand what is the problem with the current query.
E.g. here from error message below it is not clear that the reason is wrong first word.
Flink SQL> ELECT 1; [ERROR] Could not execute SQL statement. Reason: org.apache.calcite.runtime.CalciteException: Non-query expression encountered in illegal context
For larger queries it could be more time consuming to find out the root cause.
There is a number of corner cases when sql client submits a query instead of continuing editing with current parser behavior
e.g. there is a set of some simple queries which fail to execute via sql client
SELECT 'a; '; SELECT 1; -- comment SELECT 1; /* my comment; end comment */
During typing a query in sql client it is easy to have a misprint, miss a quote, bracket and do not notice it, e.g. like here
Flink SQL> SELECT COUNT(1) as `field FROM (VALUES(1));
Proposed Changes
Parsing improvements
Parser should have a state in order to answer if a query ends ok or still need to close a quote, a bracket or a comment.
Information about what is missing could be shown in prompt e.g.
Flink SQL> SELECT word, SUM(frequency) AS `count` // Imagine back tick was missed here ;> FROM ( )> VALUES ('Hello', 1), ('Ciao', 1), ('Hello', 2) )> ) ;> AS WordTable(word, frequency) ;> GROUP BY word;
Query could be submitted only when there is no such violations. And this will help to resolve issues with corner cases queries.
Syntax Highlighting
Having parser state it is also possible to use this info to highlight each part of the query differently by implementation of org.jline.reader.Highlighter interface [1].
e.g. there could be the next groups: keywords, single-quoted strings, string quoted with sql identifier, comments, hints, numbers, default.
Keywords could be everything from
Here probably it makes sense to have ability to support several color schemas as for dark and light terminal usually different colors suit.
There are some existing color schemas [2], [3], [4].
Completion
- Currently not all available commands in completion candidates list
- Use jline's aggregate completer to be able to apply different completers depending on context e.g.
SET and RESET could suggest key names as completion
SHOW: could be not only SHOW JAR, all options should be in completion candidates list.
ADD/REMOVE jar could suggest path with FileCompleter [5].
Error Message
Currently sql client unwinds exception till end. However error description for some cases does not clarify the reason e.g. missed first letter of SELECT in example in motivation section
Within the trace there is org.apache.calcite.sql.parser.SqlParseException containing line number and line column info of parser's cursor when it was failed.
This info could be used to improve the message.
Other jline features that could be applied
- line numbers + editing query consuming more than screen space
- autopairing[6] for single quote, sql identifier quote, square and round brackets could be turn on/off via property
(requires update at least till 3.13.3) - Enabling/Disabling description for completion candidates via property. An example how it looks like is here [7]
Implementation Details
Parsing
The main idea is to extend existing org.apache.flink.table.client.cli.SqlMultiLineParser . => No need to use Flink SQL parser
It should check the input if all quotes, comments, hints, brackets (square and round) are closed otherwise it throws new EOFError - normal behavior in jline in case input should be continued on a new line.
During need to keep track of cursor states if it is a comment, keyword, quoted string.
public enum State { DEFAULT, KEYWORD, // used only for highlighting SINGLE_QUOTED, SQL_IDENTIFIER_QUOTED, HINT, LINE_COMMENTED, BLOCK_COMMENTED; }
This track could be used to answer a question at what state parser by the end of a query.
Also such track, if there is recorded state for every cursor position, could be used to do highlighting.
Highlighting
To do highlighting it is necessary to implement org.jline.reader.impl.DefaultHighlighter.
From parser it could receive a track of state for each position and do highlighting.
Colors for different states are going to be defined in a separate SyntaxHighlightStyle class.
private final AttributedStyle blockCommentStyle; private final AttributedStyle lineCommentStyle; private final AttributedStyle hintStyle; private final AttributedStyle defaultStyle; private final AttributedStyle keywordStyle; private final AttributedStyle numberStyle; private final AttributedStyle singleQuotedStyle; private final AttributedStyle sqlIdentifierStyle; public SyntaxHighlightStyle( AttributedStyle blockCommentStyle, AttributedStyle lineCommentStyle, AttributedStyle hintStyle, AttributedStyle defaultStyle, AttributedStyle keywordStyle, AttributedStyle numberStyle, AttributedStyle singleQuotedStyle, AttributedStyle sqlIdentifierStyle) { this.blockCommentStyle = blockCommentStyle; this.lineCommentStyle = lineCommentStyle; this.hintStyle = hintStyle; this.defaultStyle = defaultStyle; this.keywordStyle = keywordStyle; this.numberStyle = numberStyle; this.singleQuotedStyle = singleQuotedStyle; this.sqlIdentifierStyle = sqlIdentifierStyle; }
At the same time there could be enum of this styles and that will allow to add more styles for highlighting e.g.
public enum BuiltInStyle { DEFAULT(null, null, null, null, null, null, null, null), DARK(BOLD_BLUE, WHITE, ITALIC_GREEN, ITALIC_GREEN, BOLD_GREEN, YELLOW, RED, MAGENTA), LIGHT(BOLD_RED, WHITE, ITALIC_CYAN, ITALIC_CYAN, BOLD_CYAN, YELLOW, GREEN, MAGENTA); private final SyntaxHighlightStyle style; BuiltInStyle( AttributedStyle keywordStyle, AttributedStyle defaultStyle, AttributedStyle blockCommentStyle, AttributedStyle lineCommentStyle, AttributedStyle hintStyle, AttributedStyle numberStyle, AttributedStyle singleQuotedStyle, AttributedStyle sqlIdentifierStyle) { style = new SyntaxHighlightStyle( blockCommentStyle, lineCommentStyle, hintStyle, defaultStyle, keywordStyle, numberStyle, singleQuotedStyle, sqlIdentifierStyle); }
Completion
There are 2 types of completions: command completion and sql completion.
Currently for sql it is used CalciteAdviser.
Within this improvement the idea is to leave it except first words which are the same as Flink SQL command like "INSERT INTO", "BEGIN STATEMENT SET;" and so on...
For commands it is possible to use jline's org.jline.reader.impl.completer.ArgumentCompleter and org.jline.reader.impl.completer.AggregateCompleter.
The first one allows to build completers for commands consisting of several words e.g. "BEGIN STATEMENT SET;" and the second one allows to unite them all together.
Cross dialect
Since there are at least 2 dialects in Flink (HIVE and DEFAULT). There could be a builtin dialect enum containing required info.
Taking current dialect it's possible to retrieve required info from that enum about quotes, comments and etc for parsing, highlighting and completion.
public enum BuiltInDialect implements Dialect { DEFAULT("FlinkSQLDefaultDialect", () -> SqlAbstractParserImpl.getSql92ReservedWords(), '`', '\'', "\"", Pair.of("/*", "*/"), Pair.of("/*+", "*/"), "--"); private final Supplier<Set<String>> keywords; private final char quote; private final char sqlIdentifierQuote; private final Pair<String, String> blockComments; private final Pair<String, String> hints; private final String lineComments; ... }
So that means that adding a new dialect should be relatively simple.
Summary
After this FLIP finishes, the sql client will have the options.
Supported options in sql client
Option | Default | Type | Description |
---|---|---|---|
sql-client.color-schema |
| Enum | Determine the color schema for syntax highlighting. Will contain names of available color schemas |
sql-client.autopairing | `true` | boolean | Determine whether autopairing is enabled or not. |
sql-client.completion-description | `false` | boolean | Determine whether description for completion candidates is enabled or not. |
sql-client.prompt.show-hint | `false` | boolean | Determine whether prompt shows hints about current input issue. For more detailed about prompt values see Supported prompt hint values in sql client |
sql-client.prompt.show-line-numbers | `false` | boolean | Determine whether prompt shows line numbers for multiline query. |
Supported prompt hint values in sql client
Prompt | Meaning |
---|---|
;> | Waiting for the next line of multi-line query, waiting for completion of query with semicolon (;) |
'> | Waiting for the next line, waiting for a completion of string that began with a single quote (') |
`> | Waiting for the next line, waiting for a completion of string that began with a back tick (`) |
*\> | Waiting for the next line, waiting for completion of a multi-line comment that began with (/*) |
)> | Waiting for the next line, waiting for completion of a string that began with a round bracket ( |
]> | Waiting for the next line, waiting for completion of a string that began with a square bracket [ |
extra )> | There is an extra round bracket ), that is not opened with ( |
extra ]> | There is an extra square bracket ], that is not opened with [ |
extra *\> | There is an extra closing of multi-line comment *\, that is not opened with \* |
Compatibility, Deprecation, and Migration Plan
This FLIP is an improvement for sql client. Compatibility is not affected immediately.
Rejected Alternatives
No rejected alternatives yet.
References
[1] https://github.com/jline/jline3/wiki/Highlighting-and-parsing
[2] https://github.com/Gillisdc/sqldeveloper-syntax-highlighting
[3] https://github.com/ozmoroz/ozbsidian-sqldeveloper
[4] https://github.com/GeSHi/geshi-1.0/blob/master/src/geshi/sql.php
[5] https://github.com/jline/jline3/wiki/Completion#file-completers
[6] https://github.com/jline/jline3/wiki/Auto-Indentation-and-Pairing
[7] https://asciinema.org/a/251412?speed=2.0