...
Motivation
Currently, users have to click on every operator and check how much data each sub-task is processing to see whether there is data skew. This is particularly cumbersome and error-prone for jobs with large job graphs. Data skew is an important metric that should be more visible.
...
- dataSkewPercentage: This will be used to show an overall or historical data skew score under the proposed Data Skew tab (see the UI Changes section).
- The existing numRecordsIn metric can be used to build this new metric
- dataSkewPercentagePerSecond: This will be used to show a "live" score on the Job Graph (see the UI Changes section).
- The existing numRecordsInPerSecond metric can be used to build this new metric
See the "rejected alternatives" section for other metrics that were considered.
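To make the relationship between the existing per-subtask metrics and the proposed score concrete, the following is a minimal sketch of how a skew percentage could be derived from the numRecordsIn values of one operator's subtasks. The scoring formula is an assumption made purely for illustration (average absolute deviation from the mean, divided by the mean, capped at 100%); this section does not fix a formula. The class and method names (`DataSkewSketch`, `skewPercentage`) are hypothetical and not part of Flink's API.

```java
import java.util.Arrays;

/** Illustrative only: a hypothetical helper, not part of Flink's API. */
final class DataSkewSketch {

    /**
     * Returns a skew score in [0, 100] for one operator, given one count per
     * subtask (for example, each subtask's numRecordsIn value).
     *
     * ASSUMPTION: the score is the average absolute deviation from the mean,
     * divided by the mean, capped at 100. The proposal text above does not
     * specify the formula; this one is chosen only to make the idea concrete.
     */
    static double skewPercentage(long[] recordsPerSubtask) {
        if (recordsPerSubtask.length == 0) {
            return 0.0; // no subtasks, nothing to compare
        }
        double mean = Arrays.stream(recordsPerSubtask).average().orElse(0.0);
        if (mean == 0.0) {
            return 0.0; // no records seen yet, so no skew to report
        }
        double avgAbsDeviation = Arrays.stream(recordsPerSubtask)
                .mapToDouble(n -> Math.abs(n - mean))
                .average()
                .orElse(0.0);
        return Math.min(100.0, 100.0 * avgAbsDeviation / mean);
    }
}
```

Under this assumed formula, an operator whose subtasks all received the same number of records scores 0, while one whose records all landed on a single subtask out of four scores 100. A "live" variant could apply the same calculation to per-subtask numRecordsInPerSecond rates instead of accumulated counts.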
...
The proposed tab would sit next to the Exceptions tab, since its purpose seems closer to that of the Exceptions tab than to that of the other tabs. It is highlighted in blue in the screenshot below.
Note that the screenshot/mock-up below does not show the Data Skew tab next to the Exceptions tab, but the actual implementation will place it there.
This FLIP does not describe in detail how the UI of the new Data Skew tab should look. The look should be consistent with the rest of the UI. The list/table view of checkpoints under the Checkpoints tab could be used for inspiration.
The content of the proposed tab will roughly look as follows:
This new Data Skew tab will show the overall accumulated data skew score of the operators, as opposed to the current/live view proposed in the Additional "Data Skew" Metric on the Flink Job Graph section. Before the table/list of operators, the tab will also contain a definition of data skew and of the metric used to calculate it (this is not shown in the screenshot and is left as an implementation detail). Like the Checkpoints tab, it will have a Refresh button, as shown in the screenshot.
A Note on Using Feature Flag / Config for Enabling the Proposed Changes
...