Introduction
This page describes logging in Nutch. Nutch uses the Simple Logging Facade for Java (SLF4J) API, with Apache Log4j 2 as the logging implementation.
Audience
This page is aimed at:
- users who wish to learn about default logging in Nutch
- developers who wish to extend or customize logging
Related Development Work
- (pull request also linked)
Example Logging Syntax
import java.lang.invoke.MethodHandles;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
...
public class Injector extends NutchTool implements Tool {

  // Logger named after the enclosing class
  private static final Logger LOG = LoggerFactory
      .getLogger(MethodHandles.lookup().lookupClass());
  ...
  @Override
  public void setup(Context context) {
    ...
    LOG.info("Injector: overwrite: " + overwrite);
    LOG.info("Injector: update: " + update);
  }
  ...
}
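The MethodHandles.lookup().lookupClass() call in the logger declaration resolves to the class in which it is written, so the same declaration can be copied between classes without editing a class literal. The following is a minimal JDK-only sketch of that behaviour; the class name LoggerNameSketch and the method loggerName() are hypothetical and are not part of Nutch or SLF4J.

```java
import java.lang.invoke.MethodHandles;

// Hypothetical class, used only to illustrate the lookup idiom.
public class LoggerNameSketch {

  // Resolves to LoggerNameSketch.class without naming the class explicitly.
  private static final Class<?> THIS_CLASS = MethodHandles.lookup().lookupClass();

  // The fully qualified class name; LoggerFactory.getLogger(Class) uses
  // this name as the logger name.
  static String loggerName() {
    return THIS_CLASS.getName();
  }

  public static void main(String[] args) {
    System.out.println("Logger would be named: " + loggerName());
  }
}
```

Because the logger name equals the fully qualified class name, it can be used to target individual classes or packages in the logging configuration.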
Default Configuration
Legacy logging
Prior to version 1.19, Nutch logging was configured via conf/log4j.properties. This changed in Nutch 1.19, as described below.
As of version 1.19, Nutch uses conf/log4j2.xml for logging configuration. By default, Nutch logs to $NUTCH_HOME/logs/hadoop.log.
The configuration uses a RollingFileAppender with a cron triggering policy configured to trigger every day at midnight. Archives are stored in a directory based on the current year and month. At rollover time, all files under the base directory that match the */nutch-*.log.gz glob and are 60 days old or older are deleted. Additionally, the configuration includes a ConsoleAppender, so everything is also written to STDOUT. This is useful for the ParserChecker and similar tools.
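The retention rule described above (glob match plus a minimum age of 60 days) can be illustrated with a small JDK-only sketch. This is not how Log4j 2 implements deletion; it only mirrors the two conditions. The class name RolloverRetentionSketch, the method shouldDelete, and the archive file names are assumptions made for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;

// Hypothetical class: mirrors the two deletion conditions, nothing more.
public class RolloverRetentionSketch {

  // Glob taken from the text; paths are matched relative to the base directory.
  private static final PathMatcher ARCHIVE_GLOB =
      FileSystems.getDefault().getPathMatcher("glob:*/nutch-*.log.gz");

  // Age threshold taken from the text.
  private static final Duration MAX_AGE = Duration.ofDays(60);

  // True when the file matches the glob AND is 60 days old or older.
  static boolean shouldDelete(Path relativeToBaseDir, Instant lastModified, Instant now) {
    boolean matchesGlob = ARCHIVE_GLOB.matches(relativeToBaseDir);
    boolean oldEnough = !lastModified.isAfter(now.minus(MAX_AGE));
    return matchesGlob && oldEnough;
  }

  public static void main(String[] args) {
    Instant now = Instant.now();
    // Assumed archive name inside a year-month directory.
    Path archive = Paths.get("2024-05", "nutch-2024-05-01.log.gz");

    System.out.println(shouldDelete(archive, now.minus(Duration.ofDays(61)), now)); // true
    System.out.println(shouldDelete(archive, now.minus(Duration.ofDays(10)), now)); // false
    // The active log sits directly in the base directory, so the glob does not match it.
    System.out.println(shouldDelete(Paths.get("hadoop.log"), now.minus(Duration.ofDays(90)), now)); // false
  }
}
```

Both conditions must hold, so a recent archive and the active log file are left in place.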
Extending Nutch Logging Configuration
Log4j 2 provides many Appenders that can be configured to extend Nutch logging. Some examples of how this could be done are given below.
logzio-log4j2-appender
The Logzio Log4j 2 Appender ships logs to Logzio in bulk over HTTPS. It can be configured as follows:
Add a dependency to ivy/ivy.xml
<dependency org="io.logz.log4j2" name="logzio-log4j2-appender" rev="1.0.13" conf="*->master" />
Augment the log4j2.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
...
<Configuration status="info" name="Nutch" packages="">
  ...
  <Appenders>
    <LogzioAppender name="Logzio">
      <addHostname>true</addHostname>
      <logzioToken>${insert_your_token_here}</logzioToken>
      <logzioType>java</logzioType>
      <logzioUrl>https://listener.logz.io:8071</logzioUrl>
    </LogzioAppender>
    ...
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="Logzio"/>
      ...
    </Root>
  </Loggers>
</Configuration>