Apache Kylin : Analytical Data Warehouse for Big Data
...
2. When a cube contains a 'PERCENTILE' measure, the algorithm that Kylin 4.0 uses to calculate the values is different from the one used by Spark SQL;
- Is it recommended to use the TopN measure in Kylin 4.0?
No. In Kylin 4.0, if a cube contains a TopN measure, the data of the 'TopN' measure is saved in the parquet file as 'ArrayType', which leads to low read performance,
because Spark can't use 'VectorizedParquetRecordReader' to read a parquet file when the returned schema includes 'ArrayType'. Please use the original design (dimension + sum measure) directly to execute 'TopN'-style SQL.
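
As an illustration of what a 'TopN'-style SQL over a dimension and a sum measure can look like, here is a minimal sketch. The table `sales`, the dimension `seller_id`, and the column `price` are hypothetical names, and an in-memory SQLite database stands in for Kylin only so that the query shape can be run locally; it says nothing about Kylin's own performance or syntax details.

```python
import sqlite3

# Hypothetical fact rows: one dimension (seller_id) and one numeric column (price).
ROWS = [
    ("seller_a", 10.0),
    ("seller_b", 25.0),
    ("seller_a", 40.0),
    ("seller_c", 5.0),
    ("seller_b", 30.0),
]

# TopN-style SQL: group by the dimension, order by the SUM measure, keep the top N.
TOP_N_SQL = """
SELECT seller_id, SUM(price) AS total_price
FROM sales
GROUP BY seller_id
ORDER BY total_price DESC
LIMIT ?
"""


def top_sellers(n):
    """Return the n sellers with the largest summed price."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (seller_id TEXT, price REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", ROWS)
        return conn.execute(TOP_N_SQL, (n,)).fetchall()
    finally:
        conn.close()


print(top_sellers(2))
```

In a cube, the equivalent setup would be `seller_id` as a dimension and `SUM(price)` as a plain sum measure, so the query is answered without a TopN measure and without an 'ArrayType' column in the parquet file.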