Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Status

Current state: under discussion

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Currently some temporal function behaviors are wired to users. 

...

This FLIP aims to consistent the timestamp function behavior and eventually improve the usability.

Public Interfaces 

As we knew some functions' behavior is wrong currently, but after we correct these function, the legacy behavior should still work in old code.

...

And when the option also influence the result of CAST conversion between NUMERIC TYPE and TIMESTAMP, by default,the CAST conversion is not supported anymore,  users can use TO_TIMESTAMP(milliseconds)  function to get a TIMESTAMP in local time zone from a java epoch milliseconds.  Users can set the option to 'true', thus these cast conversion would keep the legacy behavior.

Proposed Changes

1. Correct five time function behavior

I invested all Flink time-related functions current behavior and compared with other DB vendors like Pg,Presto, Hive, Spark, Snowflake,  I made an excel [2] to organize them well.

...

After the proposal, the function NOW(), CURRENT_TIMESTAMP and LOCALTIMESTAMP become  synonyms, the function CURRENT_TIME and LOCALTIME become synonyms, you can also lookup all time function behaviors in reference [2]. 

2. Correct CAST conversion between NUMERIC TYPE  and TIMESTAMP type

Currently, the following CAST conversion behaviors are wrong which does not consider the session time zone. It should use session time zone when cast between NUMERIC type and TIMESTAMP type if there’re strong requirements to support this , the numeric type include TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, they keep same conversion behavior, for example the cast between BIGINT and TIMESTAMP. Although we didn’t expose this feature in the document, some users may use them.

...

function

current behavior

existed problem

proposed changes

CAST(44 AS TIMESTAMP) 


TIMESTAMP(6) NOT NULL

#session timezone: UTC

1970-01-01T00:00:44 

#session timezone: UTC+8

1970-01-01T00:00:44 

The time in BIGINT type usually represents a unixtime semantic, which represents the elapsed time since java epoch(1970-01-01 00:00:00 UTC+0), when convert to a timestamp we should consider local time zone

Support function  TO_TIMESTAMP(seconds)

return type: TIMESTAMP(3)

#session timezone: UTC

TO_TIMESTAMP(44)

1970-01-01T00:00:44 

#session timezone: UTC+8

TO_TIMESTAMP(-28756)

1970-01-01T00:00:44

CAST(TIMESTAMP ‘1970-01-01 00:00:44’ AS BIGINT) 

BIGINT NOT NULL

#session timezone: UTC

44 

#session timezone: UTC+8

44

The inverse conversion of above, this conversion is used rarely.

UNIX_TIMESTAMP(TIMESTAMP ‘1970-01-01 00:00:44’)

#session timezone: UTC

44

#session timezone: UTC+8

-28756




General Implementations

1.Change the codegen implementations for above five functions/cast conversions according to the value of introduced table option: table.exec.fallback-legacy-time-function.enable 

...

both DataStream and SQL/API users feel intuitive.

Compatibility, Deprecation, and Migration Plan

  • Compatibility

This is an incompatible change, we introduce SQL/API option  table.exec.fallback-legacy-time-function for compacting current wrong behavior, and set it to ‘false’. If users want to keep the legacy behavior, they need to set it to ’true’ manually, this would be add to release note. 

...

records earlier than 2020-12-29 will not be output.


Test Plan

Will add plan tests, unit tests, window operator harness tests as well as IT tests. 

Rejected Alternatives

1. Change the return type of function CURRENT_TIMESTAMP/NOW()/PROCETIME() to TIMESTAMP WITH TIME ZONE. This proposal needs to introduce a new type TIMESTAMP WITH TIME ZONE, and we think there are no enough benefits. If we do this, the return type of function CURRENT_TIME must be TIME WITH TIME ZONE for consistent consideration, we need to introduce another type.

2. Change the return type of function  CURRENT_TIMESTAMP/NOW()/PROCETIME() to TIMESTAMP WITH LOCAL TIME ZONE. This proposal will lead to a embarrassed situation for function CURRENT_TIME, because no DB vendor use type TIME WITH LOCAL TIME ZONE yet, and the time semantic of TIMESTAMP WITH LOCAL TIME ZONE is instant which is too complex to understand for normal users.


References:

  1. https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_timezone.html

...