Apache Kylin : Analytical Data Warehouse for Big Data
Currently, Kylin uses Log4j 1.2 as its logging library.
Logger Configuration File
File Name | Affected Scope | Comment |
---|---|---|
kylin-server-log4j.properties | Kylin Instance, Sparder Driver | |
kylin-tools-log4j.properties | Kylin CLI | |
kylin-driver-log4j.properties | Driver | |
kylin-executor-log4j.properties | Executor | |
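As an illustration of how these files are used, the snippet below raises the log level for Kylin's own classes in kylin-server-log4j.properties. This is a hypothetical sketch in standard Log4j 1.2 properties syntax, not the file's shipped content:

```properties
# Hypothetical excerpt of ${KYLIN_HOME}/conf/kylin-server-log4j.properties.
# Raise the log level for Kylin's own classes to DEBUG for troubleshooting.
log4j.logger.org.apache.kylin=DEBUG
# Keep third-party libraries at WARN so kylin.log stays readable.
log4j.logger.org.apache.spark=WARN
```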
Logger Directory
To make troubleshooting easier, Kylin collects all logs produced at runtime into kylin.log, including the logs generated by Spark during build jobs and query jobs, which are managed by Kylin's logging system.

In previous Kylin versions, the logs of both the build engine and the query engine were collected into kylin.log. Since Kylin 4 uses Spark as its build engine, the build job logs in Kylin 4 are Spark job logs, consisting of the Spark driver log and the Spark executor logs. The logs produced during queries come mainly from Sparder, the query engine of Kylin 4.

Because build jobs produce a large volume of Spark logs, writing them into kylin.log alongside the other logs makes the file very large and its content hard to follow, which hinders problem analysis.

To solve this, Kylin 4.0.0-beta refactored the logging of build jobs: build logs are now separated from kylin.log and uploaded to HDFS. This behavior is controlled by two Log4j configuration files: ${KYLIN_HOME}/conf/spark-driver-log4j.properties and ${KYLIN_HOME}/conf/spark-executor-log4j.properties.
Path | Type | Log From |
---|---|---|
$KYLIN_HOME/logs/kylin.log | Local | |
$KYLIN_HOME/logs/kylin.out | Local | stderr of Kylin Instance |
$KYLIN_HOME/logs/spark/xxx.log | Local | Driver of Cubing Job (check kylin-driver-log4j.properties) |
${project_name}/spark_logs/driver/${step_id}/execute_output.json.timestamp.log | HDFS | Driver of Cubing Job (check kylin-driver-log4j.properties) |
${project_name}/spark_logs/executor/yyyy-mm-dd/${job_id}/${step_id}/executor-x.log | HDFS | Executor of Cubing Job (check kylin-executor-log4j.properties) |
_sparder_logs/executor/yyyy-mm-dd/${job_id}/${step_id}/executor-x.log | HDFS | Executor of Query Job (check kylin-executor-log4j.properties) |

2. Spark driver log of build job

2.1 Output log to HDFS file

By default, the Spark driver log of a build job is uploaded to HDFS under ${project_name}/spark_logs/driver/${step_id}/execute_output.json.timestamp.log (see the driver row of the table above).
2.2 View logs through Kylin WebUI

Users can view logs through the Job Output on the Monitor page of Kylin's WebUI. By default, the Output window shows only the first and last 100 lines of the step's logs. To view the complete log, click "download the log file" at the top of the Output window; the browser will then download the complete Spark driver log file for this step.
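The truncation rule described above can be reproduced with standard shell tools. This is only an illustration of the behavior, not Kylin's implementation; the log file path is hypothetical:

```shell
LOG=/tmp/step_driver.log            # hypothetical driver log of one build step
seq 1 300 > "$LOG"                  # stand-in for a long log file
head -n 100 "$LOG"                  # first 100 lines, as shown in the Output window
echo '... (omitted; use "download the log file" for the complete log) ...'
tail -n 100 "$LOG"                  # last 100 lines
```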
2.3 Output log to local file
To output the driver log to a local file instead, edit the driver Log4j configuration and point the root logger at the file appender:

vim ${KYLIN_HOME}/conf/spark-driver-log4j.properties

log4j.rootLogger=INFO,logFile
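The rootLogger line above routes everything to an appender named logFile, which must be defined in the same file. A minimal sketch of such a definition using standard Log4j 1.2 classes follows; the file path, date pattern, and layout are illustrative, not Kylin's shipped defaults:

```properties
# Hypothetical logFile appender definition for spark-driver-log4j.properties.
log4j.appender.logFile=org.apache.log4j.DailyRollingFileAppender
log4j.appender.logFile.File=${KYLIN_HOME}/logs/spark/driver.log
log4j.appender.logFile.DatePattern='.'yyyy-MM-dd
log4j.appender.logFile.layout=org.apache.log4j.PatternLayout
log4j.appender.logFile.layout.ConversionPattern=%d{ISO8601} %p [%t] %c : %m%n
```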
3. Spark executor log of build job

3.1 Output log to HDFS file
By default, the Spark executor logs of a build job are uploaded to HDFS under ${project_name}/spark_logs/executor/yyyy-mm-dd/${job_id}/${step_id}/executor-x.log (check kylin-executor-log4j.properties); the executor logs of query jobs go under _sparder_logs instead (see the table above).
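Since these executor logs live on HDFS rather than on the Kylin node, they can be inspected with the HDFS CLI. The path prefix below stands in for your ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}; all names are illustrative:

```shell
# List one build step's executor logs, then fetch one locally (paths illustrative).
hdfs dfs -ls  /kylin/kylin_metadata/my_project/spark_logs/executor/2021-07-01/${job_id}/${step_id}/
hdfs dfs -get /kylin/kylin_metadata/my_project/spark_logs/executor/2021-07-01/${job_id}/${step_id}/executor-1.log /tmp/
```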
4. Troubleshooting

When the Spark job submitted by Kylin runs on a YARN cluster, the user that uploads the Spark executor log to HDFS may be yarn. The yarn user may not have write permission on the HDFS directory ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs, which causes the upload of the Spark executor log to fail. In this case, when viewing the task log with "yarn logs -applicationId <Application ID>", you will see the corresponding error.
This error can be solved by granting the yarn user write access to that spark_logs directory on HDFS.
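One way to grant that access, assuming HDFS ACLs are enabled on the cluster (dfs.namenode.acls.enabled=true); the path shown is illustrative and should be replaced by your actual ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs directory:

```shell
# Grant the yarn user recursive write access to the spark_logs directory
# (illustrative path; substitute your working directory and project name).
hdfs dfs -setfacl -R -m user:yarn:rwx /kylin/kylin_metadata/my_project/spark_logs
```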