
Apache Kylin : Analytical Data Warehouse for Big Data




Property | Required | Priority | Datatype | Default | Description | Version | Reference
kylin.engine.spark.build-class-name
Required: NO
Priority: MINOR
Datatype: String
Default: org.apache.kylin.engine.spark.job.CubeBuildJob
Description: For developers only. The class name used in spark-submit.
Version: 4.0.0-ALPHA


kylin.engine.spark.cluster-info-fetcher-class-name
Required: NO
Priority: MINOR
Datatype: String
Default: org.apache.kylin.cluster.YarnInfoFetcher
Description: Fetches YARN information for the Spark job.
Version: 4.0.0-ALPHA


kylin.engine.spark-conf.XXX
Required: NO
Priority: MINOR
Datatype: String
Description: Spark configurations to override for the build job, such as "spark.driver.cores". If these Spark properties are not set, Kylin automatically adjusts them before submitting the build job.
Version: 4.0.0-ALPHA
Reference: Adaptively-adjust-spark-parameters
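
As an illustration of the kylin.engine.spark-conf.XXX pattern, the fragment below sketches how such overrides might look in a Kylin properties file, with XXX replaced by the Spark property name. The values shown are hypothetical and are not recommendations; only the property names come from this page.

```properties
# Hypothetical example values; only the property names appear on this page.
# Pin the number of driver cores for the build job.
kylin.engine.spark-conf.spark.driver.cores=2
# Setting these two explicitly means Kylin does not adjust them automatically.
kylin.engine.spark-conf.spark.executor.instances=4
kylin.engine.spark-conf.spark.driver.memory=2G
```

Any Spark property left out of the file remains subject to Kylin's automatic adjustment before the build job is submitted.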
kylin.storage.provider
Required: NO
Priority: MINOR
Datatype: String
Default: org.apache.kylin.common.storage.DefaultStorageProvider
Description: The content summary objects returned by different cloud vendors are not the same, so a vendor-specific implementation needs to be provided. For details, see org.apache.kylin.common.storage.IStorageProvider.
Version: 4.0.0-ALPHA


kylin.engine.spark.merge-class-name
Required: NO
Priority: MINOR
Datatype: String
Default: org.apache.kylin.engine.spark.job.CubeMergeJob
Description: For developers only. The class name used in spark-submit.
Version: 4.0.0-ALPHA


kylin.engine.spark.task-impact-instance-enabled
Required: NO
Datatype: Boolean
Default: true
Description: See kylin.engine.spark.task-core-factor. If kylin.engine.spark.task-impact-instance-enabled is set to true and kylin.engine.spark-conf.spark.executor.instances is not set, Kylin calculates spark.executor.instances for the Build Engine.
Version: 4.0.0-ALPHA
Reference: Adaptively-adjust-spark-parameters

kylin.engine.spark.task-core-factor
Required: NO
Datatype: Integer
Default: 3
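
A minimal sketch of a configuration that satisfies the condition described above, so that Kylin calculates spark.executor.instances itself. The two values are the defaults listed on this page; leaving the executor-instances override commented out is the point of the example.

```properties
# Defaults from this page, written out explicitly.
kylin.engine.spark.task-impact-instance-enabled=true
kylin.engine.spark.task-core-factor=3
# Left unset on purpose so that the condition described above holds:
# kylin.engine.spark-conf.spark.executor.instances=
```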
kylin.engine.driver-memory-base
Required: YES
Datatype: Integer
Default: 1024
Description: Automatically adjusts spark.driver.memory for the Build Engine if kylin.engine.spark-conf.spark.driver.memory is not set.
Version: 4.0.0-ALPHA
Reference: Adaptively-adjust-spark-parameters
kylin.engine.driver-memory-strategy
Required: YES
Datatype: Array
Default: {"2", "20", "100"}
kylin.engine.driver-memory-maximum
Required: YES
Datatype: Integer
Default: 4096
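
The three driver-memory properties can be written out together as in the sketch below, using the defaults listed above. How the array value is serialized in a properties file is not stated on this page; the comma-separated form is an assumption and should be checked against the Kylin version in use.

```properties
# Defaults from this page, written out explicitly.
kylin.engine.driver-memory-base=1024
kylin.engine.driver-memory-maximum=4096
# Assumption: the array default {"2", "20", "100"} is written as a comma-separated list.
kylin.engine.driver-memory-strategy=2,20,100
# Left unset on purpose so that automatic adjustment of spark.driver.memory applies:
# kylin.engine.spark-conf.spark.driver.memory=
```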
kylin.engine.persist-flattable-threshold
Required: NO
Datatype: Integer
Default: 1
Description: If the number of cuboids to be built from the flat table is greater than this threshold, the flat table is persisted to $HDFS_WORKING_DIR/job_tmp/flat_table to save memory.
Version: 4.0.0-ALPHA


kylin.snapshot.parallel-build-timeout-seconds
Required: NO
Priority: MAJOR
Default: 3600
Description: Improves the speed of the snapshot build.
Version: 4.0.0-ALPHA


kylin.snapshot.parallel-build-enabled
Required: NO
Priority: MAJOR
Datatype: Boolean
Default: true

kylin.spark-conf.auto.prior
Required: NO
Priority: MINOR
Datatype: Boolean
Default: true
Description: Whether to adjust Spark parameters adaptively.
Version: 4.0.0-ALPHA
Reference: Adaptively-adjust-spark-parameters
kylin.engine.submit-hadoop-conf-dir
Required: YES
Priority: MAJOR
Datatype: String
Default: /etc/hadoop/conf
Description: Sets HADOOP_CONF_DIR for spark-submit.
Version: 4.0.0-ALPHA


kylin.storage.columnar.shard-size-mb
Required: YES
Priority: MAJOR
Datatype: Integer
Default: 128
Description: The size of each Parquet partition file of a cuboid.
Version: 4.0.0-ALPHA
Reference: ShardBy
kylin.storage.columnar.shard-rowcount
Required: YES
Priority: MAJOR
Datatype: Long
Default: 2500000
Description: The number of rows in each Parquet partition file of a cuboid.
kylin.storage.columnar.shard-countdistinct-rowcount
Required: YES
Priority: MAJOR
Datatype: Long
Default: 1000000
Description: The number of rows in each Parquet partition file of a cuboid when the shard column is a distinct column.
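
For reference, the three shard-related properties above can be set side by side as in the following sketch. The values are the defaults listed on this page; they are shown only to illustrate the syntax, not as tuning advice.

```properties
# Defaults from this page, written out explicitly.
# Size of each Parquet partition file of a cuboid.
kylin.storage.columnar.shard-size-mb=128
# Rows per Parquet partition file of a cuboid.
kylin.storage.columnar.shard-rowcount=2500000
# Rows per Parquet partition file when the shard column is a distinct column.
kylin.storage.columnar.shard-countdistinct-rowcount=1000000
```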
kylin.query.spark-engine.join-memory-fraction
Required: NO
Datatype: Double
Default: 0.3
Description: Limits the memory used by broadcast joins.
Version: 4.0.0-ALPHA

