Apache Kylin : Analytical Data Warehouse for Big Data
Welcome to Kylin Wiki.
kylin.engine.spark.build-class-name
    Required: NO
    Priority: MINOR
    Datatype: String
    Configuration Level: PROCESS
    Default: org.apache.kylin.engine.spark.job.CubeBuildJob
    Description: For developer only. The class name used in spark-submit.
    Version: 4.0.0-ALPHA

kylin.engine.spark.cluster-info-fetcher-class-name
    Required: NO
    Priority: MINOR
    Datatype: String
    Configuration Level: PROCESS
    Default: org.apache.kylin.cluster.YarnInfoFetcher
    Description: For developer only. Fetches YARN information for the Spark job.
    Version: 4.0.0-ALPHA

kylin.engine.spark-conf.XXX
    Required: NO
    Priority: MINOR
    Datatype: String
    Configuration Level: PROCESS
    Default: Null
    Description: Spark configurations to override for the build job, such as "spark.driver.cores". If these Spark properties are not set, Kylin automatically adjusts them before submitting the build job.
    Version: 4.0.0-ALPHA
    Reference: Adaptively-adjust-spark-parameters

kylin.storage.provider
    Required: NO
    Priority: MINOR
    Datatype: String
    Configuration Level: PROCESS
    Default: org.apache.kylin.common.storage.DefaultStorageProvider
    Description: The content summary objects returned by different cloud vendors are not the same, so a vendor-specific implementation needs to be provided.
    Version: 4.0.0-ALPHA
    Reference: To learn more, see org.apache.kylin.common.storage.IStorageProvider.
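
The kylin.engine.spark-conf.XXX naming convention can be illustrated with a small sketch. It assumes the usual key=value layout of kylin.properties and that the Spark property name is whatever follows the kylin.engine.spark-conf. prefix, as in kylin.engine.spark-conf.spark.executor.instances. The function name and the sample fragment are hypothetical and are not part of Kylin.

```python
PREFIX = "kylin.engine.spark-conf."


def spark_overrides(properties_text):
    """Return {spark_property: value} for every kylin.engine.spark-conf.* entry."""
    overrides = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith(PREFIX):
            # Strip the Kylin prefix to obtain the plain Spark property name.
            overrides[key[len(PREFIX):]] = value.strip()
    return overrides


# Hypothetical kylin.properties fragment, for illustration only.
SAMPLE = """\
# build-job overrides
kylin.engine.spark-conf.spark.driver.cores=2
kylin.engine.spark-conf.spark.executor.instances=4
kylin.storage.columnar.shard-size-mb=128
"""

if __name__ == "__main__":
    print(spark_overrides(SAMPLE))
```

Only the two prefixed entries are returned; the shard-size setting is ignored because it is not a Spark override.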
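kylin.engine.spark.merge-class-name
    Required: NO
    Priority: MINOR
    Datatype: String
    Configuration Level: PROCESS
    Default: org.apache.kylin.engine.spark.job.CubeMergeJob
    Description: For developer only. The class name used in spark-submit.
    Version: 4.0.0-ALPHA

kylin.engine.spark.task-impact-instance-enabled
    Required: NO
    Datatype: Boolean
    Configuration Level: PROCESS
    Default: true
    Description: See also kylin.engine.spark.task-core-factor. If kylin.engine.spark.task-impact-instance-enabled is set to true and kylin.engine.spark-conf.spark.executor.instances is not set, Kylin calculates spark.executor.instances for the Build Engine.
    Version: 4.0.0-ALPHA
    Reference: Adaptively-adjust-spark-parameters

kylin.engine.spark.task-core-factor
    Required: NO
    Datatype: Integer
    Configuration Level: PROCESS
    Default: 3

kylin.engine.driver-memory-base
    Required: YES
    Datatype: Integer
    Configuration Level: PROCESS
    Default: 1024
    Description: Automatically adjusts spark.driver.memory for the Build Engine if kylin.engine.spark-conf.spark.driver.memory is not set.
    Version: 4.0.0-ALPHA
    Reference: Adaptively-adjust-spark-parameters

kylin.engine.driver-memory-strategy
    Required: YES
    Datatype: Array
    Configuration Level: PROCESS
    Default: {"2", "20", "100"}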
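
kylin.engine.driver-memory-maximum
    Required: YES
    Datatype: Integer
    Configuration Level: PROCESS
    Default: 4096

kylin.engine.persist-flattable-threshold
    Required: NO
    Datatype: Integer
    Configuration Level: PROCESS
    Default: 1
    Description: If the number of cuboids to be built from the flat table is greater than this threshold, the flat table is persisted to $HDFS_WORKING_DIR/job_tmp/flat_table to save memory.
    Version: 4.0.0-ALPHA

kylin.snapshot.parallel-build-timeout-seconds
    Required: NO
    Priority: MAJOR
    Configuration Level: PROCESS
    Default: 3600
    Description: To improve the speed of snapshot builds.
    Version: 4.0.0-ALPHA

kylin.snapshot.parallel-build-enabled
    Required: NO
    Priority: MAJOR
    Datatype: Boolean
    Configuration Level: PROCESS
    Default: true

kylin.spark-conf.auto.prior
    Datatype: Boolean
    Configuration Level: PROCESS
    Default: true
    Description: Whether to adjust Spark parameters adaptively.
    Reference: Adaptively-adjust-spark-parameters

kylin.engine.submit-hadoop-conf-dir
    Datatype: String
    Configuration Level: PROCESS
    Default: /etc/hadoop/conf
    Description: Sets HADOOP_CONF_DIR for spark-submit.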
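
The two override rules stated in this table (spark.driver.memory is auto-adjusted when kylin.engine.spark-conf.spark.driver.memory is not set, and spark.executor.instances is calculated when kylin.engine.spark.task-impact-instance-enabled is true and kylin.engine.spark-conf.spark.executor.instances is not set) can be sketched as a simple check. This is only an illustration of those two conditions, not Kylin's actual implementation. The function name is hypothetical, the configuration is assumed to be available as a plain dict of strings, and kylin.spark-conf.auto.prior is deliberately not modeled because the table does not say how it interacts with these rules.

```python
SPARK_CONF_PREFIX = "kylin.engine.spark-conf."
INSTANCE_SWITCH = "kylin.engine.spark.task-impact-instance-enabled"


def auto_adjusted(props):
    """List the Spark properties that would be left for Kylin to work out.

    `props` is a dict of Kylin property names to string values.
    """
    adjusted = []

    # Rule 1: driver memory is adjusted only when the user did not override it.
    if SPARK_CONF_PREFIX + "spark.driver.memory" not in props:
        adjusted.append("spark.driver.memory")

    # Rule 2: executor instances are calculated only when the switch is on
    # (the table lists its default as true) and no override is present.
    switch_on = props.get(INSTANCE_SWITCH, "true").strip().lower() == "true"
    if switch_on and SPARK_CONF_PREFIX + "spark.executor.instances" not in props:
        adjusted.append("spark.executor.instances")

    return adjusted


if __name__ == "__main__":
    print(auto_adjusted({}))
    print(auto_adjusted({SPARK_CONF_PREFIX + "spark.driver.memory": "2g"}))
    print(auto_adjusted({INSTANCE_SWITCH: "false"}))
```

With an empty configuration both properties are left to Kylin; setting an explicit override or turning the switch off removes the corresponding entry.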
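
kylin.storage.columnar.shard-size-mb
    Required: YES
    Priority: MAJOR
    Datatype: Integer
    Configuration Level: CUBE
    Default: 128
    Description: The size, in MB, of each Parquet partition file of a cuboid.
    Version: 4.0.0-ALPHA
    Reference: ShardBy

kylin.storage.columnar.shard-rowcount
    Required: YES
    Priority: MAJOR
    Datatype: Long
    Configuration Level: CUBE
    Default: 2500000
    Description: The maximum number of rows in each Parquet partition file of a cuboid.

kylin.storage.columnar.shard-countdistinct-rowcount
    Required: YES
    Priority: MAJOR
    Datatype: Long
    Configuration Level: CUBE
    Default: 1000000
    Description: The number of rows in each Parquet partition file of a cuboid when the shard column is a count-distinct column.

kylin.query.spark-engine.join-memory-fraction
    Required: NO
    Datatype: Double
    Configuration Level: PROCESS
    Default: 0.3
    Description: Limits the memory used by broadcast joins.
    Version: 4.0.0-ALPHA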
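
As a rough illustration of what a per-file row cap implies, the sketch below computes the smallest number of Parquet files a cuboid would need for a given row count. It is plain arithmetic on the listed defaults (2500000 for kylin.storage.columnar.shard-rowcount, 1000000 for kylin.storage.columnar.shard-countdistinct-rowcount) and does not claim to reproduce how Kylin actually chooses the shard count, which may also depend on kylin.storage.columnar.shard-size-mb. The function name is hypothetical.

```python
def min_shard_files(total_rows, max_rows_per_file=2_500_000):
    """Smallest number of files needed so that no file exceeds the row cap."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if max_rows_per_file <= 0:
        raise ValueError("max_rows_per_file must be positive")
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-total_rows // max_rows_per_file)


if __name__ == "__main__":
    rows = 10_000_000
    # Default row cap (shard-rowcount).
    print(min_shard_files(rows))
    # Row cap listed for a count-distinct shard column.
    print(min_shard_files(rows, max_rows_per_file=1_000_000))
```

For a cuboid of 10,000,000 rows this gives 4 files under the default cap and 10 files under the count-distinct cap.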