Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

functionFlink current behavior

Flink proposed changes

other SQL vendors' behavior

CURRENT_TIMESTAMP

return type: TIMESTAMP 

#session timezone: UTC
2020-12-28 23:52:52

#session timezone: UTCGMT+808:00
2020-12-28 23:52:52

wall clock:
UTCGMT+808:00:2020-12-29 07:52:52

return type: TIMESTAMP WITH LOCAL TIME ZONE

#session timezone: UTC
2020-12-28 23:52:52

#session timezone: UTCGMT+808:00
2020-12-29 07:52:52

In MySQL, Spark, the function NOW() and CURRENT_TIMESTAMP return current timestamp value in session time zone,the return type is TIMESTAMP

In
Pg, Presto, the function NOW() and CURRENT_TIMESTAMP return current timestamp in session time zone,the return type is TIMESTAMP WITH TIME ZONE

In
Snowflake the function CURRENT_TIMESTAMP/LOCALTIMESTAMP return current timestamp in session time zone,the return type is TIMESTAMP WITH LOCAL TIME ZONE

...

This FLIP proposes option 1 which changes the return typethe proposed changes as following, the current wall-clock is 2020-12-29 07:52:52 in Beijing time(UTCGMT+808:00):

function

existed problem

current behavior

proposed changes

CURRENT_DATE

returns UTC date, but user expects current date in session time zone

return type: DATE

#session timezone: UTC

2020-12-28

#session timezone: UTCGMT+808:00

2020-12-28

 return current date in session time zone, the return type should be DATE

#session timezone: UTC

2020-12-28

#session timezone: UTCGMT+808:00

2020-12-29

CURRENT_TIME

returns UTC time, but user expects current time in session time zone

return type:  TIME 

#session timezone: UTC

23:52:52

#session timezone: UTCGMT+808:00

23:52:52

return current time in session time zone, the return type should be TIME

#session timezone: UTC

23:52:52

#session timezone: UTCGMT+808:00

07:52:52

CURRENT_TIMESTAMP

returns UTC timestamp, but user expects current timestamp in session time zone


return type:  TIMESTAMP

#session timezone: UTC

2020-12-28 23:52:52

#session timezone: UTCGMT+808:00

2020-12-28 23:52:52

return current timestamp in session time zone, the return type should be

TIMESTAMP WITH LOCAL TIME ZONE

#session timezone: UTC

2020-12-28 23:52:52

#session timezone: UTCGMT+808:00

2020-12-29 07:52:52

NOW()

returns UTC timestamp, but user expects current timestamp in session time zone


return type: TIMESTAMP

#session timezone: UTC

2020-12-28 23:52:52

#session timezone: UTCGMT+808:00

2020-12-28 23:52:52

return current timestamp in session time zone, the return type should be

TIMESTAMP WITH LOCAL TIME ZONE

#session timezone: UTC

2020-12-28 23:52:52

#session timezone: UTCGMT+808:00

2020-12-29 07:52:52

PROCTIME()

returns UTC timestamp, but user expects current timestamp in session time zone

return type: TIMESTAMP  *PROCTIME*

#session timezone: UTC

2020-12-28 23:52:52

#session timezone: UTCGMT+808:00

2020-12-28 23:52:52

return current timestamp in session time zone for PROCTIME(), the return type should be TIMESTAMP  WITH LOCAL TIME ZONE *PROCTIME*

#session timezone: UTC

2020-12-28 23:52:52

#session timezone: UTCGMT+808:00

2020-12-29 07:52:52

After the proposal, the function NOW() and CURRENT_TIMESTAMP  become  synonyms, the function CURRENT_TIME and LOCALTIME become synonyms, you can also lookup all time function behaviors in reference [2]. 

...

function

current behavior

existed problem

migration plan

CAST(44 AS TIMESTAMP) 

TIMESTAMP(0) NOT NULL

#session timezone: UTC

1970-01-01 00:00:44 

#session timezone: UTCGMT+808:00

1970-01-01 00:00:44 

The time in BIGINT type usually represents a unixtime semantic, which represents the elapsed time since java epoch(1970-01-01 00:00:00 UTC+0), when convert to a timestamp we should consider local time zone

This is an invalid behavior, disable the invalid CAST behavior, to get same behavior, user can workaround with: 

#session timezone: UTC

  • TO_TIMESTAMP(FROM_UNIXTIME(44 - 0))

1970-01-01 00:00:44 

#session timezone: UTCGMT+808:00

TO_TIMESTAMP(FROM_UNIXTIME(44 - 8 * 60 * 60))

1970-01-01 00:00:44

CAST(TIMESTAMP ‘1970-01-01 00:00:44’ AS BIGINT) 

BIGINT NOT NULL

#session timezone: UTC

44 

#session timezone: UTCGMT+808:00

44

The inverse conversion of above, this conversion is used rarely.

UNIX_TIMESTAMP(TIMESTAMP ‘1970-01-01 00:00:44’)

#session timezone: UTC

44

#session timezone: UTCGMT+808:00

-28756

3. Disable legacy behavior of theses time functions

...

proposed changes

note

Support function  TO_TIMESTAMP_LTZ(numeric_expr, [,scale])

return type: TIMESTAMP(3) WITH LOCAL TIME ZONE

#session timezone: UTC

TO_TIMESTAMP_LTZ(44)

1970-01-01 00:00:44 

#session timezone: UTCGMT+808:00

TO_TIMESTAMP_LTZ(-28756)

1970-01-01 00:00:44

TO_TIMESTAMP_LTZ(numeric_expr [,scale])

TO_TIMESTAMP_LTZ(seconds, 0)

TO_TIMESTAMP_LTZ(milliseconds, 3)



...

After the proposal is finished, the above user cases will work smoothly.  Assume users' local time zone is UTCGMT+808:00, the wall-clock is 2020-12-29 07:52:52.

...

2. Change the return value of function  CURRENT_TIMESTAMP/NOW()/PROCETIME() to the timestamp value in local timezone. The proposal is fine if we only use it in FLINK SQL world, but we need to consider the conversion between Table/DataStream, assume a record produced in UTC+0 timezone with TIMESTAMP '1970-01-01 08:00:44'  and the Flink SQL processes the data with session time zone 'UTCGMT+808:00', if the sql program need to convert the Table to DataStream, then we need to calculate the timestamp in StreamRecord with session time zone (UTCGMT+808:00), then we will get 44 in DataStream program, but it is wrong because the expected value should be (8 * 60 * 60 + 44). The corner case tell us that the ROWTIME/PROCTIME in Flink are based on UTC+0, when correct the PROCTIME() function, the better way is to use TIMESTAMP WITH LOCAL TIME ZONE which keeps same long value with time based on UTC+0 and can be expressed with  local timezone.

...