
Introduction

This page describes logging in Nutch. Nutch uses the Simple Logging Facade for Java (SLF4J) API, with Apache Log4j 2 as the logging implementation.

Audience

This page is intended for:

  • users who want to learn about the default logging in Nutch
  • developers who want to extend or customize logging

Default Configuration


Legacy logging

Prior to Nutch 1.19, logging was configured via conf/log4j.properties. This changed in Nutch 1.19; see below.

Nutch uses conf/log4j2.xml to define how logging works. By default, Nutch logs to $NUTCH_HOME/logs/hadoop.log.

The configuration uses a RollingFileAppender with a cron triggering policy that fires every day at midnight. Archives are stored in a directory named after the current year and month. At rollover time, all files under the base directory that match the */nutch-*.log.gz glob and are 60 days old or older are deleted. In addition, a ConsoleAppender is configured so that everything is also written to STDOUT. This is useful for tools such as the ParserChecker.
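The behaviour described above can be sketched as a Log4j 2 XML configuration. This is an illustrative fragment, not a copy of the conf/log4j2.xml shipped with Nutch: the logDir property, the appender names, the layout patterns, and the use of the NUTCH_HOME environment variable lookup are all assumptions made for the example. Only the log file location, the midnight cron schedule, the year-and-month archive directory, the */nutch-*.log.gz glob, and the 60-day retention come from the description.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn">
  <Properties>
    <!-- Assumption: resolve the log directory from the NUTCH_HOME environment variable -->
    <Property name="logDir">${env:NUTCH_HOME}/logs</Property>
  </Properties>

  <Appenders>
    <!-- Active log file is hadoop.log; archives go to a yyyy-MM subdirectory -->
    <RollingFile name="RollingFile"
                 fileName="${logDir}/hadoop.log"
                 filePattern="${logDir}/$${date:yyyy-MM}/nutch-%d{yyyy-MM-dd}.log.gz">
      <!-- Assumption: layout pattern chosen for illustration only -->
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
      <!-- Roll over every day at midnight -->
      <CronTriggeringPolicy schedule="0 0 0 * * ?"/>
      <DefaultRolloverStrategy>
        <!-- At rollover, delete matching archives that are 60 days old or older -->
        <Delete basePath="${logDir}" maxDepth="2">
          <IfFileName glob="*/nutch-*.log.gz"/>
          <IfLastModified age="60d"/>
        </Delete>
      </DefaultRolloverStrategy>
    </RollingFile>

    <!-- Mirror all log output to STDOUT -->
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    </Console>
  </Appenders>

  <Loggers>
    <!-- Assumption: root level of info; both appenders receive every event -->
    <Root level="info">
      <AppenderRef ref="RollingFile"/>
      <AppenderRef ref="Console"/>
    </Root>
  </Loggers>
</Configuration>
```

The maxDepth of 2 lets the Delete action reach files one directory below the base directory, which is where the year-and-month archive folders sit.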

Extending Nutch Logging Configuration



