Authors
Forward Xu, Jark Wu
Status
Current state: "Under Discussion"
Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-JSON-functions-in-Flink-SQL-td32674.html
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
Released: <Flink Version>
Page properties | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
Please keep the discussion Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
Use in SQL:
SQL SYNTAX | DESCRIPTION | RETURN TYPE |
JSON_EXISTS(jsonValue, path [ { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR ] ) | Whether a jsonValue satisfies a search criterion described using JSON path expression path. | BOOLEAN |
JSON_VALUE(jsonValue, path [ RETURNING type ] [ { ERROR | NULL | DEFAULT expr } ON EMPTY ] [ { ERROR | NULL | DEFAULT expr } ON ERROR ] ) | Extract an SQL scalar from a jsonValue using JSON path expression path. | if no "RETURNING type" clause, STRING is returned. If a "RETURNING type" clause is included, an attempt is made to cast to the specified type. For example: json_value ('{"foo": 100}', 'lax $ .foo' returning integer null on empty) will return an integer type. |
JSON_QUERY(JSON_QUERY(jsonValue, path [ { WITHOUT [ ARRAY ] | WITH [ CONDITIONAL | UNCONDITIONAL ] [ ARRAY ] } WRAPPER ] [ { ERROR | NULL | EMPTY ARRAY | EMPTY OBJECT } ON EMPTY ] [ { ERROR | NULL | EMPTY ARRAY | EMPTY OBJECT } ON ERROR ] ) | Extract a JSON object or JSON array from jsonValue using the path JSON path expression. | STRING |
Note: The ON ERROR and ON EMPTY clauses define the fallback behavior of the function when an error is thrown or a null value is about to be returned.
...
Use in TableApi:
jsonExists
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
STRING.jsonExists((Expression path [,JsonExistsBehavior behavior]) | Whether a jsonValue satisfies a search criterion described using JSON path expression path. |
STRING.jsonExists(path,JsonErrorStrategy.ERROR)
JsonErrorStrategy.ERROR means ERROR ON ERROR.
STRING.jsonExists(path,JsonErrorStrategy.UNKNOWN)
JsonErrorStrategy.UNKNOWN means UNKNOWN ON ERROR.
The JsonExistsBehavior is built as follows: behavior = JsonExistsBehavior.TrueOnError ("abc"). | BOOLEAN |
jsonValue
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
STRING.jsonValue(Expression path[,JsonValueBehavior behavior]) | Extract an SQL scalar from a jsonValue using JSON path expression path. |
STRING.jsonValue(path,JsonReturningStrategy.INTEGER)
JsonReturningStrategy is an enum object that can be integer, boolean, etc.
STRING.jsonValue(path,JsonEmptyStrategy.NULL)
JsonEmptyStrategy.NULL means NULL ON EMPTY.
STRING.jsonValue(path,JsonEmptyStrategy.ERROR)
JsonEmptyStrategy.ERROR means ERROR ON EMPTY.
STRING.jsonValue(path,JsonEmptyStrategy.DEFAULT)
JsonEmptyStrategy.DEFAULT means DEFAULT ON EMPTY.
STRING.jsonValue(path,JsonErrorStrategy.NULL)
JsonEmptyStrategy.NULL means NULL ON ERROR.
STRING.jsonValue(path,JsonErrorStrategy.ERROR)
JsonEmptyStrategy.ERROR means NULL ON ERROR.
STRING.jsonValue(path,JsonErrorStrategy.DEFAULT)
JsonEmptyStrategy.DEFAULT means NULL ON ERROR.
...
The JsonValueBehavior is built as follows: behavior = JsonValueBehavior.defaultOnEmpty ("abc"). NullOnError(). or behavior = JsonValueBehavior.returnInt().defaultOnEmpty ("abc"). NullOnError(). | if no "RETURNING type" clause, STRING is returned. If a "RETURNING type" clause is included, an attempt is made to cast to the specified type. For example: json_value ('{"foo": 100}', 'lax $ .foo' returning integer null on empty) will return an integer type. |
jsonQuery
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
STRING.jsonQuery(Expression path [,JsonQueryBehavior behavior]) | Extract a JSON object or JSON array from jsonValue using the path JSON path expression. The JsonQueryBehavior is built as follows: behavior = JsonQueryBehavior.withoutArray().defaultOnEmpty ("abc"). NullOnError (). | STRING |
...
JAVA/PYTHON/SCALA
DESCRIPTION
STRING.jsonQuery(path)
Extract a JSON object or JSON array from jsonValue using the path JSON path expression.
STRING.jsonQuery(path,JsonEmptyStrategy.NULL)
JsonEmptyStrategy.NULL means NULL ON EMPTY.
STRING.jsonQuery(path,JsonEmptyStrategy.ERROR)
JsonEmptyStrategy.ERROR means ERROR ON EMPTY.
STRING.jsonQuery(path,JsonEmptyStrategy.EMPTY_ARRAY)
JsonEmptyStrategy.EMPTY_ARRAY means EMPTY ARRAY ON EMPTY.
STRING.jsonQuery(path,JsonEmptyStrategy.EMPTY_OBJECT)
JsonEmptyStrategy.EMPTY_OBJECT means EMPTY OBJECT ON EMPTY.
STRING.jsonQuery(path,JsonErrorStrategy.NULL)
JsonEmptyStrategy.NULL means NULL ON ERROR.
STRING.jsonQuery(path,JsonErrorStrategy.ERROR)
JsonEmptyStrategy.ERROR means NULL ON ERROR.
STRING.jsonQuery(path,JsonErrorStrategy.EMPTY_ARRAY)
JsonErrorStrategy.EMPTY_ARRAY means EMPTY ARRAY ON ERROR.
STRING.jsonQuery(path,JsonErrorStrategy.EMPTY_OBJECT)
JsonErrorStrategy.EMPTY_OBJECT means EMPTY OBJECT ON ERROR.
STRING.jsonQuery(path,JsonWrapperStrategy.NONE)
JsonWrapperStrategy.NONE means WITHOUT WRAPPER
STRING.jsonQuery(path,JsonWrapperStrategy.NONE_ARRAY)
JsonWrapperStrategy.NONE_ARRAY means WITHOUT WRAPPER
STRING.jsonQuery(path,JsonWrapperStrategy.DEFAULT)
Example Data:
{"a": "[1,2]", "b": [1,2], "c": "hi"} |
Comparison:
OPERATOR | $.A | $.B | $.C |
JSON_EXISTS | true | true | true |
JSON_VALUE | [1, 2] | error | hi |
JSON QUERY WITHOUT ARRAY WRAPPER | error | [1, 2] | error |
JSON QUERY WITH UNCONDITIONAL ARRAY WRAPPER | [ “[1,2]” ] | [ [1,2] ] | [ “hi” ] |
JSON QUERY WITH CONDITIONAL ARRAY WRAPPER | [ “[1,2]” ] | [1,2] | [ “hi” ] |
Constructor Functions
Use in SQL:
SQL SYNTAX | DESCRIPTION | RETURN TYPE |
JSON_OBJECT( { [ KEY ] name VALUE value [ FORMAT JSON ] | name : value [ FORMAT JSON ] } * [ { NULL | ABSENT } ON NULL ] ) | Construct JSON object using a series of key (name) value (value) pairs. | STRING |
JSON_OBJECTAGG( { [ KEY ] name VALUE value [ FORMAT JSON ] | name : value [ FORMAT JSON ] } [ { NULL | ABSENT } ON NULL ] ) | Aggregate function to construct a JSON object using a key (name) value (value) pair. | STRING |
JSON_ARRAY( { value [ FORMAT JSON ] } * [ { NULL | ABSENT } ON NULL ] ) | Construct a JSON array using a series of values (value). | STRING |
JSON_ARRAYAGG( value [ FORMAT JSON ] [ ORDER BY orderItem [, orderItem ]* ] [ { NULL | ABSENT } ON NULL ] ) | Aggregate function to construct a JSON array using a value (value). | STRING |
Use in TableApi:
jsonObject
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
jsonObject(ANY1, ANY2, …[, |
SqlJsonConstructorNullClause.NULL_ON_NULL]) | Construct JSON object using a series of key (name) value (value) pairs.Table API currently only supports |
jsonObject(ANY1, ANY2, …,JsonNullStrategy.NULL)
JSON string operations. This means NULL ON NULL. | STRING |
jsonObject(ANY1, ANY2, …[, |
SqlJsonConstructorNullClause.ABSENT_ON_NULL]) |
This means ABSENT ON NULL |
. |
STRING |
jsonObjectAgg
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
jsonObjectAgg(ANY1,ANY2[,SqlJsonConstructorNullClause.NULL_ON_NULL]) | Aggregate function to construct a JSON object using a key (name) value (value |
jsonObjectAgg(ANY1, ANY2,JsonNullStrategy.NULL)
) pair.This means NULL ON NULL. | STRING |
jsonObjectAgg(ANY1,ANY2[, |
SqlJsonConstructorNullClause.ABSENT_ON_NULL]) |
This means ABSENT ON NULL |
. |
STRING |
jsonArray
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
jsonArray(ANY1, ANY2, ... |
Construct a JSON array using a series of values (value). Table API currently only supports JSON string operations.
jsonArray(ANY1, ANY2, …,JsonNullStrategy.NULL)
[,SqlJsonConstructorNullClause.NULL_ON_NULL]) | Construct a JSON array using a series of values (value). This means NULL ON NULL. | STRING |
jsonArray(ANY1, ANY2, |
...[,SqlJsonConstructorNullClause.ABSENT_ON_NULL]) |
This means ABSENT ON NULL |
. |
STRING |
jsonArrayAgg
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
jsonArrayAgg(ANY[,SqlJsonConstructorNullClause.NULL_ON_NULL]) | Aggregate function to construct a JSON array using a |
jsonArrayAgg(ANY,JsonNullStrategy.NULL)
value (value). This means NULL ON NULL. | STRING |
jsonArrayAgg(ANY[, |
SqlJsonConstructorNullClause.ABSENT_ON_NULL]) |
This means ABSENT ON NULL |
. |
STRING |
Note: The flag FORMAT JSON indicates the value is formatted as JSON character string. When FORMAT JSON is used, the value should be de-parse from JSON character string to a SQL structured value.
...
If ORDER BY clause is provided, JSON_ARRAYAGG sorts the input rows into the specified order before performing aggregation.
Comparison:
OPERATOR | RESULT |
JSON_OBJECT('id', 87, 'name', 'carrot') | {"id": 87, "name": "carrot"} |
JSON_ARRAY(1, "abc", NULL, TRUE) | [1, "abc", null, true] |
Simple Data Table T1:
o_id | attribute | value |
2 | color | red |
2 | fabric | silk |
3 | color | green |
3 | shape | square |
SELECT o_id, JSON_OBJECTAGG(attribute, value) FROM t1 GROUP BY o_id;
OPERATOR | o_id | AFTER AGG |
JSON_OBJECTAGG(attribute, value) | 2 | {"color": "red", "fabric": "silk"} |
JSON_OBJECTAGG(attribute, value) | 3 | {"color": "green", "shape": "square"} |
SELECT o_id, JSON_ARRAYAGG(attribute) AS attributes FROM t1 GROUP BY o_id;
OPERATOR | o_id | AFTER AGG |
JSON_ARRAYAGG(attribute) | 2 | ["color", "fabric"] |
JSON_ARRAYAGG(attribute) | 3 | ["color", "shape"] |
Comparison Operators
Applications will frequently want to ensure that the data they expect to consume as JSON data is, indeed, JSON data. The IS JSON predicate determines whether the value of a specified string does or does not conform to the structural rules for JSON. The syntax of the IS JSON predicate is:
Use in SQL:
SQL SYNTAX | DESCRIPTION | RETURN TYPE |
jsonValue IS JSON [ VALUE ] | Whether jsonValue is a JSON value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
jsonValue IS NOT JSON [ VALUE ] | Whether jsonValue is not a JSON value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
jsonValue IS JSON SCALAR | Whether jsonValue is a JSON scalar value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
jsonValue IS NOT JSON SCALAR | Whether jsonValue is not a JSON scalar value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
jsonValue IS JSON OBJECT | Whether jsonValue is a JSON object.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
jsonValue IS NOT JSON OBJECT | Whether jsonValue is not a JSON object.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
jsonValue IS JSON ARRAY | Whether jsonValue is a JSON array.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
jsonValue IS NOT JSON ARRAY | Whether jsonValue is not a JSON array.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
Use in TableApi:
JAVA/PYTHON/SCALA | DESCRIPTION | RETURN TYPE |
ANY.isJsonValue() | Whether jsonValue is a JSON value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
ANY.isNotJsonValue() | Whether jsonValue is not a JSON value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
ANY.isJsonScalar() | Whether jsonValue is a JSON scalar value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
ANY.isNotJsonScalar() | Whether jsonValue is not a JSON scalar value.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
ANY.isJsonObject() | Whether jsonValue is a JSON object.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
ANY.isNotJsonObject() | Whether jsonValue is not a JSON object.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
ANY.isJsonArray() | Whether jsonValue is a JSON array.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
ANY.isNotJsonArray() | Whether jsonValue is not a JSON array.Table API functions currently only support JSON strings as input parameters. | BOOLEAN |
The test example:
tester.checkBoolean("'{}' is json value", true); tester.checkBoolean("'{]' is json value", false); tester.checkBoolean("'{}' is json object", true); tester.checkBoolean("'[]' is json object", false); tester.checkBoolean("'{}' is json array", false); tester.checkBoolean("'[]' is json array", true); tester.checkBoolean("'100' is json scalar", true); tester.checkBoolean("'[]' is json scalar", false); tester.checkBoolean("'{}' is not json value", false); tester.checkBoolean("'{]' is not json value", true); tester.checkBoolean("'{}' is not json object", false); tester.checkBoolean("'[]' is not json object", true); tester.checkBoolean("'{}' is not json array", true); tester.checkBoolean("'[]' is not json array", false); tester.checkBoolean("'100' is not json scalar", false); tester.checkBoolean("'[]' is not json scalar", true); |
Implementation Details
Table Api interface proposal
...