Apache Kylin : Analytical Data Warehouse for Big Data

Property | Required | Priority | Datatype | Configuration Level | Default | Description | Version | Reference
kylin.engine.spark.build-class-name

NO

MINOR

String

PROCESS

org.apache.kylin.engine.spark.job.CubeBuildJob
For developers only. The class name used in spark-submit.

4.0.0-ALPHA


kylin.engine.spark.cluster-info-fetcher-class-name

NO

MINOR

String

PROCESS

org.apache.kylin.cluster.YarnInfoFetcher
For developers only. The class that fetches YARN information for the Spark job.

4.0.0-ALPHA


kylin.engine.spark-conf.XXX

NO

MINOR

String

PROCESS

Null
Spark configurations to override for the build job, such as "spark.driver.cores". If these Spark properties are not set, Kylin automatically adjusts them before submitting the build job.

4.0.0-ALPHA

Adaptively-adjust-spark-parameters
kylin.storage.provider

NO

MINOR

String

PROCESS

org.apache.kylin.common.storage.DefaultStorageProvider

The content summary objects returned by different cloud vendors are not the same, so a vendor-specific implementation may need to be provided.

For details, see org.apache.kylin.common.storage.IStorageProvider.

4.0.0-ALPHA


kylin.engine.spark.merge-class-name

NO

MINOR

String

PROCESS

org.apache.kylin.engine.spark.job.CubeMergeJob
For developers only. The class name used in spark-submit.

4.0.0-ALPHA


kylin.engine.spark.task-impact-instance-enabled

NO


Boolean

PROCESS

true
See also kylin.engine.spark.task-core-factor. If kylin.engine.spark.task-impact-instance-enabled is set to true and kylin.engine.spark-conf.spark.executor.instances is not set, Kylin calculates spark.executor.instances for the Build Engine.

4.0.0-ALPHA

Adaptively-adjust-spark-parameters

kylin.engine.spark.task-core-factor

NO


Integer

PROCESS

3
kylin.engine.driver-memory-base

YES


Integer

PROCESS

1024
Automatically adjusts spark.driver.memory for the Build Engine if kylin.engine.spark-conf.spark.driver.memory is not set.



4.0.0-ALPHA

Adaptively-adjust-spark-parameters
kylin.engine.driver-memory-strategy

YES


Array

PROCESS

{"2", "20", "100"}

kylin.engine.driver-memory-maximum

YES


Integer

PROCESS

4096
kylin.engine.persist-flattable-threshold

NO


Integer

PROCESS

1
If the number of cuboids to be built from the flat table is greater than this threshold, the flat table is persisted to $HDFS_WORKING_DIR/job_tmp/flat_table to save memory.

4.0.0-ALPHA


kylin.snapshot.parallel-build-timeout-seconds

NO

MAJOR


PROCESS

3600
Improves the speed of snapshot builds.


4.0.0-ALPHA


kylin.snapshot.parallel-build-enabled

NO

MAJOR

Boolean

PROCESS

true

kylin.spark-conf.auto.prior

NO

MINOR

Boolean

PROCESS

true
Whether to adjust Spark parameters adaptively.

4.0.0-ALPHA

Adaptively-adjust-spark-parameters
kylin.engine.submit-hadoop-conf-dir

YES

MAJOR

String

PROCESS

/etc/hadoop/conf
Set HADOOP_CONF_DIR for spark-submit.

4.0.0-ALPHA


kylin.storage.columnar.shard-size-mb

YES

MAJOR

Integer

CUBE

128
The size, in MB, of each Parquet partition file of a cuboid.



4.0.0-ALPHA

ShardBy
kylin.storage.columnar.shard-rowcount

YES

MAJOR

Long

CUBE

2500000

The maximum number of rows in each Parquet partition file of a cuboid.
kylin.storage.columnar.shard-countdistinct-rowcount

YES

MAJOR

Long

CUBE

1000000
The number of rows in each Parquet partition file of a cuboid when the shard column is a count distinct column.
kylin.query.spark-engine.join-memory-fraction

NO


Double

PROCESS

0.3
Limits the memory used by broadcast joins.

4.0.0-ALPHA
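
A minimal sketch of how some of the properties in the table might be overridden in kylin.properties. The property names and defaults come from the table; the file placement, the two Spark values under kylin.engine.spark-conf, and the choice of which properties to show are assumptions for illustration only, not recommendations.

```properties
# Illustrative overrides only; values marked "assumed" are not from the table.

# Explicit Spark settings for the build job: replace XXX in
# kylin.engine.spark-conf.XXX with the Spark property name.
# Per the table, setting spark.executor.instances here means Kylin
# does not calculate it, even with task-impact-instance-enabled=true.
kylin.engine.spark-conf.spark.driver.cores=2
kylin.engine.spark-conf.spark.executor.instances=4

# Inputs for automatic spark.driver.memory adjustment (table defaults).
# These apply only while kylin.engine.spark-conf.spark.driver.memory is unset.
kylin.engine.driver-memory-base=1024
kylin.engine.driver-memory-maximum=4096

# Parquet shard sizing (table defaults; listed as CUBE-level settings).
kylin.storage.columnar.shard-size-mb=128
kylin.storage.columnar.shard-rowcount=2500000
kylin.storage.columnar.shard-countdistinct-rowcount=1000000

# HADOOP_CONF_DIR passed to spark-submit (table default).
kylin.engine.submit-hadoop-conf-dir=/etc/hadoop/conf
```

Leaving the kylin.engine.spark-conf.* lines out hands those settings back to Kylin, which, according to the descriptions above, adjusts them before submitting the build job.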

