Apache Kylin : Analytical Data Warehouse for Big Data
...
Please refer to Global Dictionary on Spark.
- Are all the query results of Cube the same as the query results from the Push down engine (Spark SQL)?
No. There are two cases in which the results will differ, shown below:
1. When the cube contains a 'COUNT_DISTINCT' measure based on HLL, the cube returns an approximate value, while Spark SQL still calculates the accurate measure value from the source data;
...
because Spark can't use 'VectorizedParquetRecordReader' to read Parquet files when the returned schema includes 'ArrayType'. Please use the original design (dimension + sum measure) directly to execute TopN-style SQL.
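To illustrate the first case above, here is a minimal sketch of why an HLL-based count distinct and an exact count distinct can disagree. It is not Kylin's or Spark's implementation: the function `hll_estimate`, the precision `p=12`, and the sample `user_ids` data are all hypothetical and chosen only for illustration.

```python
import hashlib
import math


def hll_estimate(values, p=12):
    """Estimate the number of distinct values with a basic HyperLogLog sketch.

    Illustrative only: p=12 (4096 registers) is an assumed precision.
    """
    m = 1 << p                      # number of registers
    registers = [0] * m
    for v in values:
        # Build a 64-bit hash from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(str(v).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - p)                   # top p bits select a register
        rest = h & ((1 << (64 - p)) - 1)      # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit within the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)          # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction (linear counting).
        return m * math.log(m / zeros)
    return raw


# Hypothetical source column: 150,000 rows with 50,000 distinct ids.
user_ids = [i % 50_000 for i in range(150_000)]

exact = len(set(user_ids))          # what an exact COUNT(DISTINCT ...) returns
approx = hll_estimate(user_ids)     # what an HLL-style measure would return

print("exact :", exact)
print("approx:", round(approx))
```

The estimate is expected to land close to the exact count but not necessarily equal to it, which is the kind of difference described in case 1 when the same query is answered by the cube versus pushed down to Spark SQL.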