
Currently, Kylin uses Log4j 1.2 as its logging library.

Logger Configuration File

File Name                          Affected Scope                    Comment
kylin-server-log4j.properties      Kylin Instance, Sparder Driver
kylin-tools-log4j.properties       Kylin CLI
kylin-driver-log4j.properties      Driver
kylin-executor-log4j.properties    Executor
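
These files live under ${KYLIN_HOME}/conf and use the standard Log4j 1.2 properties syntax. As a quick sanity check (a minimal sketch; the exact file set can vary between Kylin versions), you can list them and look at the root logger of, for example, the server configuration:

ls ${KYLIN_HOME}/conf/*log4j*.properties
grep log4j.rootLogger ${KYLIN_HOME}/conf/kylin-server-log4j.properties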


Logger Directory

Path                                                                                 Type   Log From
$KYLIN_HOME/logs/kylin.log                                                           Local  Output of Kylin's logger (Log4j 1.2); Driver of Query Job
$KYLIN_HOME/logs/kylin.out                                                           Local  stderr of Kylin Instance
$KYLIN_HOME/logs/spark/xxx.log                                                       Local  Driver of Cubing Job (check kylin-driver-log4j.properties)
${project_name}/spark_logs/driver/${step_id}/execute_output.json.timestamp.log      HDFS   Driver of Cubing Job (check kylin-driver-log4j.properties)
${project_name}/spark_logs/executor/yyyy-mm-dd/${job_id}/${step_id}/executor-x.log  HDFS   Executor of Cubing Job (check kylin-executor-log4j.properties)
_sparder_logs/executor/yyyy-mm-dd/${job_id}/${step_id}/executor-x.log               HDFS   Executor of Query Job (check kylin-executor-log4j.properties)
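
For the local files above, the usual way to follow a running Kylin instance is to tail them; a minimal sketch using the paths listed in the table:

tail -f $KYLIN_HOME/logs/kylin.log      # Kylin's own logger and the Sparder (query) driver
tail -n 200 $KYLIN_HOME/logs/kylin.out  # stderr of the Kylin instance
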
1、Background

To make it easier for users to troubleshoot problems, Kylin collects all logs produced at runtime into kylin.log, including the logs generated by Spark during build jobs and query jobs, all managed by Kylin's logging system.

In previous Kylin versions, the logs of both the build engine and the query engine were collected in kylin.log. Since Spark is the build engine in Kylin 4, the logs of a Kylin 4 build job are Spark job logs, consisting of the Spark driver log and the Spark executor logs. The logs produced during queries come mainly from Sparder, the query engine in Kylin 4.

Because a build job produces a large volume of Spark logs, writing them into kylin.log together with everything else makes the file very large and its content hard to read, which hinders problem analysis.

To solve this problem, Kylin 4.0.0-beta refactored the logging of build jobs. After the refactoring, build logs are separated from kylin.log and uploaded to HDFS. The Log4j configuration files for these logs are the following two: ${KYLIN_HOME}/conf/spark-driver-log4j.properties and ${KYLIN_HOME}/conf/spark-executor-log4j.properties.

2、Spark driver log of build job

2.1 Output log to hdfs file

...

The Spark driver log of each step of a build job is uploaded to HDFS, to ${project_name}/spark_logs/driver/${step_id}/execute_output.json.timestamp.log under the Kylin working directory (see the Logger Directory table above).
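
The HDFS paths are relative to ${kylin.env.hdfs-working-dir}/${kylin.metadata.url} (see section 4). As a sketch with hypothetical values for the working directory, metadata url, project and step id, a step's driver log can be listed and downloaded like this:

# hypothetical values; substitute your own working dir, metadata url, project and step id
WORKING_DIR=/kylin
METADATA_URL=kylin_metadata
PROJECT=my_project
STEP_ID=00000000-0000-0000-0000-000000000000_01
hdfs dfs -ls ${WORKING_DIR}/${METADATA_URL}/${PROJECT}/spark_logs/driver/${STEP_ID}/
hdfs dfs -get ${WORKING_DIR}/${METADATA_URL}/${PROJECT}/spark_logs/driver/${STEP_ID}/execute_output.json.*.log /tmp/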

2.2 View logs through kylin WebUI

Users can view logs through the Job Output on the Monitor page of Kylin's WebUI. By default, the Output shows only the first and last 100 lines of all logs for the step. If you need the complete log, click "download the log file" at the top of the Output window and the browser will download the complete Spark driver log file of this step.


2.3 Output log to local file

...


Modify spark-driver-log4j.properties:

vim ${KYLIN_HOME}/conf/spark-driver-log4j.properties
log4j.rootLogger=INFO,logFile

...
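
The logFile appender referenced by the root logger above is an ordinary Log4j 1.2 file appender. A rough sketch of what such an appender definition looks like (the file path and layout pattern here are hypothetical, not Kylin's shipped defaults):

log4j.appender.logFile=org.apache.log4j.RollingFileAppender
log4j.appender.logFile.File=${KYLIN_HOME}/logs/spark/driver.log
log4j.appender.logFile.MaxFileSize=256MB
log4j.appender.logFile.MaxBackupIndex=10
log4j.appender.logFile.layout=org.apache.log4j.PatternLayout
log4j.appender.logFile.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2} : %m%n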


3、Spark executor log of build job

3.1 Log output to hdfs file

The Spark executor logs of a build job are uploaded to HDFS, to ${project_name}/spark_logs/executor/yyyy-mm-dd/${job_id}/${step_id}/executor-x.log under the Kylin working directory (see the Logger Directory table above).

...
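
Executor-side logging of build jobs is configured in ${KYLIN_HOME}/conf/spark-executor-log4j.properties in the same way. As a sketch, if the executor logs are too verbose, the usual Log4j 1.2 approach is to lower the level of Spark's own packages there (the logger names below are illustrative; adjust them to the packages that are actually noisy):

vim ${KYLIN_HOME}/conf/spark-executor-log4j.properties
log4j.logger.org.apache.spark=WARN
log4j.logger.org.apache.hadoop=WARN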

4、Troubleshooting

When the Spark job submitted by Kylin runs on a YARN cluster, the user that uploads the Spark executor log to HDFS may be yarn. If the yarn user does not have write permission on the HDFS directory ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs, uploading the Spark executor log fails. In that case, when viewing the task log with "yarn logs -applicationId <Application ID>", you will see an error reporting that the yarn user cannot write to this directory.


This error can be solved by the following command:

...
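
A sketch of the kind of fix this refers to, assuming the goal is to give the yarn user write access to the spark_logs directory through an HDFS ACL (the path is hypothetical; use your own ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}, and note that HDFS ACLs must be enabled on the cluster):

hdfs dfs -setfacl -R -m user:yarn:rwx /kylin/kylin_metadata/my_project/spark_logs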
