...
Or user can specify total length to be read, but it has same limitation with PERCENT sampling. (As of Hive 0.10.0 - https://issues.apache.org/jira/browse/HIVE-3401)
Code Block |
---|
block_sample: TABLESAMPLE (ByteLengthLiteral) ByteLengthLiteral : (Digit)+ ('b' | 'B' | 'k' | 'K' | 'm' | 'M' | 'g' | 'G') |
...
Hive also supports limiting input by row count basis, but it acts differently with above two.
First, it does not need CombineHiveInputFormat which means this can be used with non-native tables. Second, the row count given by user is applied to each split. So total row count can be vary by number of input splits. (As of Hive 0.10.0 - https://issues.apache.org/jira/browse/HIVE-3401)
Code Block |
---|
block_sample: TABLESAMPLE (n ROWS) |
...