Apache Kylin : Analytical Data Warehouse for Big Data

Background

To help users troubleshoot problems, Kylin collects logs into kylin.log while it runs, including the logs that Spark generates during build jobs and query jobs, so that they are managed by Kylin's log system. In previous Kylin releases, the logs of the build engine and query engine were either collected in kylin.log or collected and stored by the resource manager (for example, retrieved with yarn logs -applicationId xxx) or by an HBase Region Server instance. This could make it difficult to find the root cause of a failed job or a slow query. Because Kylin 4 uses Spark as its build engine, the logs of a Kylin 4 build job are the logs of Spark jobs, namely the Spark driver log and the Spark executor log. The log of the query process is mainly the log of Sparder, the Kylin 4 query engine. A build job produces a large volume of Spark logs, and when these are written to kylin.log together with other logs, the file takes up a lot of space and its content becomes disorganized, which hinders problem analysis.

To solve this problem, Kylin 4.0.0 refactored the logging of build jobs. After the refactoring, the build log in Kylin 4 is separated from kylin.log, and Kylin 4.0.0 aims to collect and store it under Kylin's working directory (HDFS or S3). This log is configured through two log4j configuration files: ${KYLIN_HOME}/conf/spark-driver-log4j.properties and ${KYLIN_HOME}/conf/spark-executor-log4j.properties.
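Before changing either log4j file, it can help to confirm that both are present in the installation. The following is a minimal sketch, not part of Kylin itself: the helper name spark_log4j_configs is hypothetical, and it only assumes that KYLIN_HOME is set in the environment or passed in explicitly. The two file names are the ones listed above.

```python
import os
from pathlib import Path

# File names come from the text above; the helper itself is a hypothetical
# convenience and is not shipped with Kylin.
SPARK_LOG4J_FILES = (
    "spark-driver-log4j.properties",
    "spark-executor-log4j.properties",
)


def spark_log4j_configs(kylin_home=None):
    """Map each expected config path under <KYLIN_HOME>/conf to True if the file exists."""
    # Fall back to the KYLIN_HOME environment variable when no path is given.
    home = Path(kylin_home or os.environ["KYLIN_HOME"])
    conf_dir = home / "conf"
    return {str(conf_dir / name): (conf_dir / name).is_file()
            for name in SPARK_LOG4J_FILES}
```

Calling spark_log4j_configs() with KYLIN_HOME exported returns a dictionary with one entry per file, so a missing configuration file shows up as a False value.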


Driver log

SparkDriverHdfsLogAppender

...

When SparkDriverHdfsLogAppender is enabled, users can download driver logs from Kylin's Web UI, even when spark.submit.deployMode is cluster (meaning the driver is not located on the same node as the Kylin Job Server).

...

After modifying the configuration, restart Kylin. The Spark driver log for a given job step is then written to the local file: ${KYLIN_HOME}/logs/spark/${step_id}.log.
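The local path pattern above can be resolved and inspected with a short script. This is a sketch under stated assumptions: driver_log_path and tail_driver_log are hypothetical helper names, the step ID is whatever identifier Kylin assigned to the job step, and KYLIN_HOME is read from the environment unless passed explicitly.

```python
import os
from collections import deque
from pathlib import Path


def driver_log_path(step_id, kylin_home=None):
    """Build <KYLIN_HOME>/logs/spark/<step_id>.log, following the pattern in the text."""
    home = Path(kylin_home or os.environ["KYLIN_HOME"])
    return home / "logs" / "spark" / f"{step_id}.log"


def tail_driver_log(step_id, lines=50, kylin_home=None):
    """Return the last `lines` lines of the driver log for one job step."""
    path = driver_log_path(step_id, kylin_home)
    # errors="replace" keeps the helper usable if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as fh:
        # A bounded deque keeps only the final `lines` entries while streaming.
        return [line.rstrip("\n") for line in deque(fh, maxlen=lines)]
```

For example, tail_driver_log(step_id, lines=100) returns the final 100 lines of that step's driver log, which is often enough to see the exception that ended a failed build step.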


Executor log

SparkExecutorHdfsAppender

...