...
Even though this is an API-breaking change, we aim for backwards compatibility. The new extraction is designed to support most of the old features while enabling new ones. Some slight adaptation of existing UDFs might be necessary. The new UDF design will only be supported in the newly introduced unified method defined in FLIP-64:
Deprecated with old type inference | New type inference |
registerScalarFunction/ registerAggregateFunction/ registerTableFunction | createTemporaryFunction |
It will enable all kinds of functions in the new `org.apache.flink.table.api.TableEnvironment`.
...
The following table lists the classes that are extracted by default:
Class | Data Type |
String | STRING |
Boolean | BOOLEAN (NOT NULL) |
Byte | TINYINT (NOT NULL) |
Short | SMALLINT (NOT NULL) |
Integer | INT (NOT NULL) |
Long | BIGINT (NOT NULL) |
Float | FLOAT (NOT NULL) |
Double | DOUBLE (NOT NULL) |
java.sql.Date | DATE |
java.sql.Time | TIME(0) |
java.sql.Timestamp | TIMESTAMP(9) |
java.time.OffsetDateTime | TIMESTAMP(9) WITH TIME ZONE |
java.time.Instant | TIMESTAMP(9) WITH LOCAL TIME ZONE |
java.time.Duration | INTERVAL SECOND(9) |
java.time.Period | INTERVAL YEAR(4) TO MONTH |
arrays of the above | ARRAY<E> |
Map<K, V> | MAP<K, V> |
POJOs and Case classes | STRUCTURED TYPE |
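The default mapping in the table above can be pictured as a simple class-based lookup. The following is only a minimal sketch under that assumption, not the actual extraction logic: the class `DefaultExtractionSketch` and its method `extract` are hypothetical names, only a subset of the rows is covered, and data types are represented as plain strings copied from the table.

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.Period;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a default class-to-data-type lookup (not Flink code).
final class DefaultExtractionSketch {

    // Subset of the table above; strings are copied from its "Data Type" column.
    private static final Map<Class<?>, String> DEFAULTS = new HashMap<>();

    static {
        DEFAULTS.put(String.class, "STRING");
        DEFAULTS.put(OffsetDateTime.class, "TIMESTAMP(9) WITH TIME ZONE");
        DEFAULTS.put(Instant.class, "TIMESTAMP(9) WITH LOCAL TIME ZONE");
        DEFAULTS.put(Duration.class, "INTERVAL SECOND(9)");
        DEFAULTS.put(Period.class, "INTERVAL YEAR(4) TO MONTH");
    }

    // Returns the default data type, or empty if the class is not in the lookup.
    static Optional<String> extract(Class<?> clazz) {
        if (clazz.isArray()) {
            // "arrays of the above" map to ARRAY<E>, where E is the element type.
            return extract(clazz.getComponentType()).map(e -> "ARRAY<" + e + ">");
        }
        return Optional.ofNullable(DEFAULTS.get(clazz));
    }

    public static void main(String[] args) {
        System.out.println(extract(Instant.class));    // Optional[TIMESTAMP(9) WITH LOCAL TIME ZONE]
        System.out.println(extract(String[].class));   // Optional[ARRAY<STRING>]
        System.out.println(extract(BigDecimal.class)); // Optional.empty
    }
}
```

An empty result corresponds to a class that has no default mapping and therefore needs a hint or a manual definition.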
The list explicitly excludes the following types:
...
input = @DataTypeHint("ANY"),
isVarArgs = YES,
output = @DataTypeHint("STRING"))
...
- Defining a logical type with default conversion
  e.g. `@DataTypeHint("INT")`
- Defining a data type with different conversion
  e.g. `@DataTypeHint(value = "TIMESTAMP(3)", bridgedTo = java.sql.Timestamp.class)`
- Just parameterizing the extraction
  e.g. `@DataTypeHint(version = 1, allowAnyGlobally = true)`
Within a FunctionHint, an empty DataTypeHint (no logical type) is only allowed as top-level property default.
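To illustrate how an extraction could pick up such hints via reflection, here is a minimal, self-contained sketch. It deliberately does not use the real annotation: `SketchDataTypeHint`, `HintReadingSketch`, `MyFunction`, and `describeParameters` are hypothetical stand-ins introduced only for this example, and the real extraction logic is considerably more involved.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the hint annotation; NOT the real Flink class.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface SketchDataTypeHint {
    // Logical type as a string, e.g. "INT"; empty means "not specified".
    String value() default "";
    // Conversion class; void.class means "use the default conversion".
    Class<?> bridgedTo() default void.class;
}

final class HintReadingSketch {

    // Example function: the second parameter overrides the default conversion.
    static final class MyFunction {
        public String eval(
                String prefix,
                @SketchDataTypeHint(value = "TIMESTAMP(3)", bridgedTo = java.sql.Timestamp.class)
                        Object timestamp) {
            return prefix + timestamp;
        }
    }

    // Describes every parameter of every method named "eval" in the given class.
    static List<String> describeParameters(Class<?> function) {
        List<String> result = new ArrayList<>();
        for (Method method : function.getDeclaredMethods()) {
            if (!method.getName().equals("eval")) {
                continue;
            }
            for (Parameter parameter : method.getParameters()) {
                result.add(describe(parameter));
            }
        }
        return result;
    }

    private static String describe(Parameter parameter) {
        SketchDataTypeHint hint = parameter.getAnnotation(SketchDataTypeHint.class);
        if (hint == null || hint.value().isEmpty()) {
            // No logical type given: fall back to class-based default extraction.
            return "default extraction for " + parameter.getType().getName();
        }
        String conversion = hint.bridgedTo() == void.class
                ? "default conversion"
                : hint.bridgedTo().getName();
        return hint.value() + " with " + conversion;
    }

    public static void main(String[] args) {
        // Prints:
        // default extraction for java.lang.String
        // TIMESTAMP(3) with java.sql.Timestamp
        describeParameters(MyFunction.class).forEach(System.out::println);
    }
}
```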
...
The following options for parameterizing the extraction are exposed through the annotation. We might add more in the future. The list might seem long at first glance, but keep in mind that extraction is not always performed on small, simple POJOs. It is sometimes performed on classes with 100+ fields that may have been generated using Avro or Protobuf:
Parameter | Description |
version | Logic version for future backwards compatibility. Current version by default. |
allowAnyGlobally | General flag that defines whether the ANY data type should be used for classes that cannot be mapped to any SQL-like type, or whether such classes should cause an error. Set to false by default, which means that an exception is thrown for unmapped types. For example, `java.math.BigDecimal` cannot be mapped because the SQL standard defines that decimals have a fixed precision and scale. |
allowAnyPattern | Patterns that enable the usage of an ANY type. A pattern is a prefix or a fully qualified name of `Class#getName()` excluding arrays. The general `allowAnyGlobally` flag does not need to be enabled for patterns. |
forceAnyPattern | Patterns that force the usage of an ANY type. A pattern is a prefix or a fully qualified name of `Class#getName()` excluding arrays. `allowAnyGlobally` does not need to be enabled for forcing ANY types. |
defaultDecimalPrecision | Sets a default precision for all decimals that occur. By default, decimals are not extracted. |
defaultDecimalScale | Sets a default scale for all decimals that occur. By default, decimals are not extracted. |
defaultYearPrecision | Sets a default year precision for year-month intervals. If set to 0, a month interval is assumed. |
defaultSecondPrecision | Sets a default fractional-second precision for timestamps and intervals that occur. This is useful, for example, because some planners don't support nanoseconds yet. |
Determines whether arbitrary input should be allowed. If set to true, this has similar behavior as an always passing input type validator. The bridging class must be Object.
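The interplay of `allowAnyGlobally`, `allowAnyPattern`, and `forceAnyPattern` described in the table can be sketched as follows. This is only an illustration under stated assumptions: `AnyPatternSketch`, `matches`, and `resolve` are hypothetical names, data types are plain strings, and "excluding arrays" is interpreted here as "array classes never match a pattern".

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the ANY-related extraction options (not Flink code).
final class AnyPatternSketch {

    // A pattern matches if the class name starts with it (prefix) or equals it.
    static boolean matches(Class<?> clazz, List<String> patterns) {
        if (clazz.isArray()) {
            return false; // assumption: array classes are excluded from matching
        }
        String name = clazz.getName();
        for (String pattern : patterns) {
            if (name.startsWith(pattern)) {
                return true;
            }
        }
        return false;
    }

    // mappedType is the result of the regular extraction, or null if unmapped.
    static String resolve(
            Class<?> clazz,
            String mappedType,
            boolean allowAnyGlobally,
            List<String> allowAnyPattern,
            List<String> forceAnyPattern) {
        if (matches(clazz, forceAnyPattern)) {
            return "ANY"; // forced, even if a regular mapping exists
        }
        if (mappedType != null) {
            return mappedType;
        }
        if (allowAnyGlobally || matches(clazz, allowAnyPattern)) {
            return "ANY"; // fallback for unmapped classes
        }
        throw new IllegalArgumentException(
                "Cannot extract a data type for " + clazz.getName());
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        // Unmapped class, allowed via a package prefix pattern.
        System.out.println(
                resolve(BigDecimal.class, null, false, Arrays.asList("java.math"), none)); // ANY
        // Mapped class, but forced to ANY by its fully qualified name.
        System.out.println(
                resolve(String.class, "STRING", false, none, Arrays.asList("java.lang.String"))); // ANY
        // Mapped class without any pattern keeps its regular type.
        System.out.println(resolve(String.class, "STRING", false, none, none)); // STRING
    }
}
```

With the defaults (flag disabled, no patterns), an unmapped class such as `java.math.BigDecimal` ends in the exception branch, which matches the default behavior described for `allowAnyGlobally`.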
Some examples:
public class ScalarFunction {
...
public void eval(String prefix, @DataTypeHint("ANY") Object obj) {
//...
...
→ Takes a string and an arbitrary input parameter and returns a STRING.
Note on the last example: The ANY type is a special logical type because it bridges the Java class hierarchy and the SQL type system. An ANY type is always connected to a class. In the example above, the ANY type is interpreted as `ANY<java.lang.Object>`. When an ANY type is translated into an input validation, the validation is class-based. The ANY type is the only type that uses class-based validation, and it validates only against the class given in the input data type. Therefore, `eval(java.lang.Object)` accepts any data type (including primitives) according to the JVM specification.
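The class-based validation described in this note can be pictured with plain Java reflection. The sketch below is only an illustration of the idea, assuming that "class-based" boils down to an instance check against the class the ANY type is connected to; `AnyValidationSketch` and `accepts` are hypothetical names.

```java
// Hypothetical sketch of class-based argument validation (not Flink code).
final class AnyValidationSketch {

    // An argument is accepted if it is an instance of the class
    // that the ANY type is connected to.
    static boolean accepts(Class<?> anyClass, Object argument) {
        return anyClass.isInstance(argument);
    }

    public static void main(String[] args) {
        int primitive = 42; // boxed to java.lang.Integer when passed as Object
        System.out.println(accepts(Object.class, primitive)); // true
        System.out.println(accepts(Object.class, "text"));    // true
        System.out.println(accepts(Number.class, primitive)); // true
        System.out.println(accepts(Number.class, "text"));    // false
    }
}
```

Because every boxed primitive, string, and array is an instance of `java.lang.Object`, a parameter bridged to `Object` accepts all of them, whereas a narrower class such as `Number` rejects unrelated values.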
Manual Definition
If the (possibly annotated) extraction cannot solve a certain use case, for example, because literal values of a function call need to be analyzed or the return type depends on the input type, more advanced users can override the `getTypeInference()` method.
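As a rough illustration of why a manual definition can express more than annotations, consider a return type that depends on the argument types. The sketch below does not use the real type inference classes; `ManualInferenceSketch`, `OutputStrategy`, and `FIRST_ARGUMENT` are hypothetical names, and data types are modeled as plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of an input-dependent return type (not Flink code).
final class ManualInferenceSketch {

    // Maps the argument data types of a call to the result data type.
    // An empty result means that the call is considered invalid.
    interface OutputStrategy {
        Optional<String> infer(List<String> argumentTypes);
    }

    // The result has the same data type as the first argument.
    // A static annotation cannot express this, because the type is only
    // known once the concrete call is analyzed.
    static final OutputStrategy FIRST_ARGUMENT =
            argumentTypes -> argumentTypes.isEmpty()
                    ? Optional.empty()
                    : Optional.of(argumentTypes.get(0));

    public static void main(String[] args) {
        System.out.println(FIRST_ARGUMENT.infer(Arrays.asList("INT", "STRING"))); // Optional[INT]
        System.out.println(FIRST_ARGUMENT.infer(Collections.emptyList()));        // Optional.empty
    }
}
```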
...