ID: IEP-107
Author
Sponsor
Created

 

Status: DRAFT


Motivation

One of the most important parts of any running application is its logs. The operations team uses them to make sure the application runs smoothly. Developers use them for troubleshooting. The goal of this document is to outline the logging subsystem of Ignite 3 and to provide guidance on how to use it and what information should be logged.

Description

Core Framework

We will rely on the java.lang.System.Logger framework for logging in Apache Ignite 3 (AI3).

For convenience, a logger proxy class is provided to a developer to do all formatting and delegation – org.apache.ignite.internal.logger.IgniteLogger:

public class IgniteLogger {
    
    // Group of convenient methods for every log level
    public void info(String msg, Object... params) { <...> }

    public void info(String msg, Throwable th, Object... params) { <...> }

    public void info(Supplier<String> msgSupplier, Throwable th) { <...> }

    public void info(String msg, Throwable th) { <...> }

    public boolean isInfoEnabled() { <...> }
}
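As an illustration only, the delegation for the supplier-based overload could look like the sketch below. InfoProxySketch is a hypothetical name, not the actual IgniteLogger, and the sketch covers only the INFO level. It relies on the standard System.Logger.log(Level, Supplier<String>, Throwable) method, which is specified to call the supplier only when the level is loggable, so an expensive message is not built when INFO is disabled.

```java
import java.lang.System.Logger;
import java.lang.System.Logger.Level;
import java.util.function.Supplier;

// Hypothetical sketch of a logger proxy; not the actual IgniteLogger code.
final class InfoProxySketch {
    private final Logger delegate;

    InfoProxySketch(Logger delegate) {
        this.delegate = delegate;
    }

    // Mirrors the delegate's view of whether INFO is enabled.
    boolean isInfoEnabled() {
        return delegate.isLoggable(Level.INFO);
    }

    // The supplier is passed through as-is; System.Logger evaluates it
    // only if the INFO level is loggable for this logger.
    void info(Supplier<String> msgSupplier, Throwable th) {
        delegate.log(Level.INFO, msgSupplier, th);
    }
}
```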


To create an instance of IgniteLogger, a utility class with factory methods should be provided:

public final class Loggers {
    public static IgniteLogger forClass(Class<?> cls) {
        return forName(Objects.requireNonNull(cls, "cls").getName());
    }

    public static IgniteLogger forName(String name) {
        var delegate = System.getLogger(name);

        return new IgniteLogger(delegate);
    }
}


For server-side logging, the log instance must be created as a static final field:

public class IgniteImpl implements Ignite {
    private static final IgniteLogger LOG = Loggers.forClass(IgniteImpl.class);
    
    <...>
}

Supported Levels of Severity

IgniteLogger defines well-understood logging severities, matching the java.lang.System.Logger:

  • ERROR – an operation has failed for reasons not related to the user. Normal cluster functioning is in question. The maintenance team's attention is required immediately.
    Rule of thumb: ERROR is when you need a duty engineer to wake up in the middle of the night. Remember that sometimes it’s you. ERROR is always actionable - the user must understand how to react.

    Examples:
    • Unable to recover from persisted state
    • Found B-Tree corruption

  • WARN – the cluster is working fine, but we are close to some kind of error. The attention of the maintenance team is required, but not immediately.
    Rule of thumb: WARN is what the team will need to review at the start of the next business day, at a high priority. WARN is always actionable - the user must understand how to react.

    Examples:
    • Resource consumption is above the threshold
    • Ignoring some configuration properties with reason disclosed
    • Unable to send a message to a node because of network problem (recovery is possible here)

  • INFO – normal level. Should be used to log any change of cluster state. Should be avoided for frequent operations like get or put.
    Rule of thumb: INFO is good for anything that’s infrequent and will help support or development teams when troubleshooting. If an event happens no more often than every few minutes (such as a page memory checkpoint), it should probably be logged at INFO.

    Examples:

    • Remote node joining/leaving the cluster
    • Cluster activation/deactivation
    • Recovery of a component from persisted state

  • DEBUG – additional information that could help to debug a problem. Could be used to log additional context to the INFO level or events that won't be needed in a normal situation.

    Examples:

    • Additional context to the operation logged on INFO level
    • An attempt to retry some operation
    • Errors occurring while doing some processing, before falling back to another approach
  • TRACE – like DEBUG, but more verbose.
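The ordering of these severities can be illustrated with the numeric values exposed by java.lang.System.Logger.Level. The following is a minimal sketch, assuming a backend that applies a single threshold level; SeverityThresholdSketch and isEnabled are hypothetical names used only for illustration. Note that WARN in this document corresponds to the WARNING constant of System.Logger.Level.

```java
import java.lang.System.Logger.Level;

// Hypothetical helper showing how a threshold filters messages by severity.
final class SeverityThresholdSketch {
    // Returns true if a message of messageLevel passes the given threshold.
    static boolean isEnabled(Level threshold, Level messageLevel) {
        // OFF disables logging entirely and is never a valid message level.
        if (threshold == Level.OFF || messageLevel == Level.OFF) {
            return false;
        }
        // Severities grow from TRACE through DEBUG, INFO, WARNING to ERROR.
        return messageLevel.getSeverity() >= threshold.getSeverity();
    }
}
```

With an INFO threshold, ERROR and WARNING messages pass, while DEBUG and TRACE messages are dropped.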

Log Message Layout

The message layout has two parts: the first is specified in the configuration of the logging backend, and the second is the message passed to the backend. We have no control over the former, so it is out of scope of this document. The latter should be formed as the actual message with substituted variables.

Message Formatting

The logger formatting patterns are independent of the underlying implementation. The formatting is similar to log4j and uses the {} formatting anchor:

LOG.info("Table was created [tableName={}]", tableName);
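To make the anchor substitution concrete, here is a minimal sketch of how {} anchors could be replaced with parameters in order. This is not the actual IgniteLogger formatting code; MessageFormatSketch and its handling of surplus anchors or parameters are assumptions made for illustration.

```java
// Hypothetical sketch of "{}" anchor substitution; not the real implementation.
final class MessageFormatSketch {
    // Replaces each "{}" in the pattern with the next parameter, left to right.
    static String format(String pattern, Object... params) {
        StringBuilder sb = new StringBuilder(pattern.length() + 16 * params.length);
        int from = 0;

        for (Object param : params) {
            int anchor = pattern.indexOf("{}", from);

            if (anchor < 0) {
                break; // Assumption: parameters without an anchor are ignored.
            }

            sb.append(pattern, from, anchor).append(param);
            from = anchor + 2;
        }

        // Assumption: anchors without a parameter are left in the text as-is.
        return sb.append(pattern, from, pattern.length()).toString();
    }
}
```

For the call above, format("Table was created [tableName={}]", "PERSON") yields "Table was created [tableName=PERSON]".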

Logging Backend

By default, java.util.logging.Logger is used. To change the backend, add the corresponding bridge for System.Logger to the classpath. The example below is for log4j:

<dependencies>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-core</artifactId>            
        <version>2.17.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>    
        <artifactId>log4j-jpl</artifactId>
        <version>2.17.0</version>
    </dependency>
</dependencies>

Logging In The Embedded Mode

When running Ignite in embedded mode, the only way to configure logging is to configure the logging backend.
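For example, with the default java.util.logging backend, the embedding application could raise the verbosity of Ignite loggers programmatically. The sketch below is illustrative: EmbeddedJulConfigSketch is a hypothetical name, and the "org.apache.ignite" logger prefix is an assumption based on loggers being named after Ignite classes.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of configuring the default JUL backend in embedded mode.
final class EmbeddedJulConfigSketch {
    // A strong reference is kept because JUL holds its loggers weakly.
    // Assumption: Ignite loggers are named under the "org.apache.ignite" prefix.
    private static final Logger IGNITE_ROOT = Logger.getLogger("org.apache.ignite");

    // Sets the level for all Ignite loggers and prints their records to the console.
    static Logger configure(Level level) {
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(level);

        IGNITE_ROOT.setLevel(level);
        // Avoid duplicate output through the handlers of the JUL root logger.
        IGNITE_ROOT.setUseParentHandlers(false);
        IGNITE_ROOT.addHandler(handler);

        return IGNITE_ROOT;
    }
}
```

Calling EmbeddedJulConfigSketch.configure(Level.FINE) would enable messages that System.Logger reports at the DEBUG level, since the JUL bridge maps DEBUG to FINE.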

Logging On The Client

Logging on the client is similar to logging in embedded mode, except that there should be one more option: configuring the backend by specifying it with the client builder. For this, an interface LoggerFactory should be introduced:

@FunctionalInterface
public interface LoggerFactory {

    System.Logger forName(String name);
}


The entire flow will look like:

class CustomSystemOutLogger implements System.Logger {
    <...>    
}

class CustomLoggerFactory implements LoggerFactory {
    @Override
    public System.Logger forName(String name) {
        return new CustomSystemOutLogger(name);
    }
}

IgniteClient.builder()
        .loggerFactory(new CustomLoggerFactory())
        <...>
        .build();
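The body of the custom logger is elided above. As a rough idea of what implementing System.Logger involves, a minimal sketch follows. SystemOutLoggerSketch, its threshold parameter, and the output layout are assumptions for illustration; the four overridden methods are the abstract methods of java.lang.System.Logger.

```java
import java.io.PrintStream;
import java.text.MessageFormat;
import java.util.ResourceBundle;

// Hypothetical minimal System.Logger that prints to a given stream.
final class SystemOutLoggerSketch implements System.Logger {
    private final String name;
    private final Level threshold;
    private final PrintStream out;

    SystemOutLoggerSketch(String name, Level threshold, PrintStream out) {
        this.name = name;
        this.threshold = threshold;
        this.out = out;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public boolean isLoggable(Level level) {
        return level != Level.OFF && level.getSeverity() >= threshold.getSeverity();
    }

    @Override
    public void log(Level level, ResourceBundle bundle, String msg, Throwable thrown) {
        if (!isLoggable(level)) {
            return;
        }
        out.println("[" + level.getName() + "] " + name + ": " + msg);
        if (thrown != null) {
            thrown.printStackTrace(out);
        }
    }

    @Override
    public void log(Level level, ResourceBundle bundle, String format, Object... params) {
        if (!isLoggable(level)) {
            return;
        }
        // System.Logger parameters follow java.text.MessageFormat ({0}, {1}, ...);
        // a message without parameters is printed unchanged.
        String msg = (params == null || params.length == 0) ? format : MessageFormat.format(format, params);
        out.println("[" + level.getName() + "] " + name + ": " + msg);
    }
}
```

The resource bundle is ignored here for brevity; a production logger would likely use it for message localization.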


The one downside of providing a way to configure the backend on a per-client basis is that a logger instance MUST NOT be created as a static field, either in the Ignite client code or anywhere in its dependencies.
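In practice this means that client-side classes would receive the factory and keep the logger in an instance field. A minimal sketch of that pattern is shown below; ClientComponentSketch is a hypothetical class, and falling back to System.getLogger when no factory is given is an assumption, not a documented behavior.

```java
// Copy of the interface proposed above, repeated to keep the sketch self-contained.
@FunctionalInterface
interface LoggerFactory {
    System.Logger forName(String name);
}

// Hypothetical client-side component that obtains its logger per instance.
final class ClientComponentSketch {
    // Instance field, not static: each client may use a different backend.
    private final System.Logger log;

    ClientComponentSketch(LoggerFactory factory) {
        // Assumption: without a custom factory, the default System.Logger backend is used.
        LoggerFactory effective = factory != null ? factory : System::getLogger;

        this.log = effective.forName(ClientComponentSketch.class.getName());
    }

    System.Logger logger() {
        return log;
    }
}
```

Two clients created with different factories then log through different backends, which a shared static logger could not support.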

Tickets
