Exceptions and logging

CloudStack has not had a strong tradition of enforcing exception and logging behaviour. However, do as we say and not as we do. Just because we weren't good at it doesn't mean you shouldn't be. And we are working very hard to get good at it.

Logging

CloudStack uses log4j. Yes, we could have used one of the many logging facades out there. Yes, log4j is somewhat of an oldie, but it is a goodie. Besides, what's really important is not the tool but the content (a recurring theme you'll find in CloudStack). CloudStack should be deployed with logging at INFO level or above, and all logs should be timestamped in GMT. Note that CloudStack DOES NOT require a restart to change logging levels. The following is a list of our logging levels and their suggested usage.

Level	Use When
FATAL	This ship's sunk. Or the JVM has to die due to this.
ERROR	The system has hit a problem that it cannot recover from. This error does not affect the general health of CloudStack but does cause a particular request to CloudStack to fail.
WARN	The system has hit a problem that it thinks it can recover from, but the admin should be aware of it so they can take a look.
INFO	The admin is interested in knowing this information (like the pilot announcing "Grand Canyon is to your right" on the flight).
DEBUG	Information that may be helpful to the admin in debugging a problem. The deciding factor here is often this: if an admin can reliably reproduce a FATAL, ERROR, or WARN condition, turning on DEBUG in logging should provide sufficient information about how they got to the error.
TRACE	Repetitive and annoying logs that really shouldn't be needed in normal debugging but may be useful as a last resort. Generally, the deciding factor on whether TRACE level is used is how fast this log can fill up the disk space if it is turned on.
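
As a rough illustration of how the levels above interact with the deployed threshold, the sketch below models the six levels as a plain enum and checks whether a message would be written. `LogLevelSketch`, `Level`, `isEnabled` and `setThreshold` are hypothetical names made up for this example; they are not log4j classes or methods, and log4j does the equivalent filtering for you.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of level filtering; this is not the log4j API. */
class LogLevelSketch {
    /** Levels from the table above, ordered from most to least severe. */
    enum Level { FATAL, ERROR, WARN, INFO, DEBUG, TRACE }

    private Level threshold;
    private final List<String> written = new ArrayList<>();

    LogLevelSketch(Level threshold) {
        this.threshold = threshold;
    }

    /** The threshold can change while running, mirroring "no restart required". */
    void setThreshold(Level threshold) {
        this.threshold = threshold;
    }

    /** A message is written when it is at least as severe as the threshold. */
    boolean isEnabled(Level level) {
        return level.ordinal() <= threshold.ordinal();
    }

    void log(Level level, String message) {
        if (isEnabled(level)) {
            written.add(level + " " + message);
        }
    }

    List<String> written() {
        return written;
    }

    public static void main(String[] args) {
        LogLevelSketch log = new LogLevelSketch(Level.INFO);
        log.log(Level.WARN, "Host is slow to respond, retrying");
        log.log(Level.DEBUG, "Retry attempt 1 of 3"); // dropped while the threshold is INFO
        log.setThreshold(Level.DEBUG);                // admin turns on DEBUG to reproduce
        log.log(Level.DEBUG, "Retry attempt 2 of 3"); // written now
        System.out.println(log.written());
    }
}
```

The point of the sketch is the DEBUG rule of thumb: the extra detail costs nothing at INFO, but is there as soon as the admin lowers the threshold to reproduce a problem.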

Exception and Exception Handling

There is plenty of wisdom out on the internet regarding exceptions and exception handling. Here are some general anti-patterns and, at the bottom of that page, there are links to other guidelines. There are a few that I like to single out as important.

I

try {
    // code...
} catch (Exception specific to your code) {
    // Specific exception handling and logging.
} catch (Exception e) {
    s_logger.warn("Caught unexpected exception", e);
    // Exception handling code.
}

I

try {
    code...;
} catch (XenAPIException e) {
    // Do either this: s_logger.warn("Caught a xen api exception", e);
    // or throw new CloudRuntimeException("Caught a xen api exception", e);
    throw new CloudRuntimeException("Got a xen api exception"); // Don't ever do JUST this.
}
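
To make the difference concrete, here is a minimal, self-contained sketch of what is lost when the cause is dropped. The nested `XenAPIException` and `CloudRuntimeException` classes are simplified stand-ins defined only for this example (the real classes live in the Xen and CloudStack code bases), and `callXen`, `dropsCause` and `keepsCause` are hypothetical method names.

```java
/** Sketch: wrapping with the cause keeps the original exception reachable. */
class WrapWithCauseSketch {
    /** Simplified stand-in for the hypervisor-specific checked exception. */
    static class XenAPIException extends Exception {
        XenAPIException(String message) {
            super(message);
        }
    }

    /** Simplified stand-in for CloudStack's CloudRuntimeException. */
    static class CloudRuntimeException extends RuntimeException {
        CloudRuntimeException(String message) {
            super(message);
        }

        CloudRuntimeException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static void callXen() throws XenAPIException {
        throw new XenAPIException("session expired");
    }

    /** Bad: the XenAPIException and its stack trace are gone for good. */
    static void dropsCause() {
        try {
            callXen();
        } catch (XenAPIException e) {
            throw new CloudRuntimeException("Got a xen api exception");
        }
    }

    /** Good: the original exception travels along as the cause. */
    static void keepsCause() {
        try {
            callXen();
        } catch (XenAPIException e) {
            throw new CloudRuntimeException("Got a xen api exception", e);
        }
    }

    public static void main(String[] args) {
        try {
            keepsCause();
        } catch (CloudRuntimeException e) {
            System.out.println("keepsCause, cause: " + e.getCause());
        }
        try {
            dropsCause();
        } catch (CloudRuntimeException e) {
            System.out.println("dropsCause, cause: " + e.getCause());
        }
    }
}
```

Whoever finally logs the exception from `keepsCause` gets the full "Caused by" chain; with `dropsCause` there is nothing left to tell them what Xen actually complained about.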

D

public void irresponsibleMethod() throws Exception;
public void responsibleMethod() throws XenAPIException;
public void runtimeExceptionMethod(); // throws CloudRuntimeException that's not supposed to be logged until the entry point.
public void innocentCaller() {
    try {
        irresponsibleMethod();
        responsibleMethod();
        runtimeExceptionMethod();
    } catch (Exception e) {
        s_logger.warn("Unable to execute", e);
        throw new CloudRuntimeException("Unable to execute", e);
        // What's wrong here?
        // 1. If the error was thrown from responsibleMethod, the caller has now lost the chance to do special handling for XenAPIException.
        // 2. If the error was thrown from runtimeExceptionMethod, the caller logs it once here, and it will be logged again at the entry point.
    }
}
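
For contrast, here is one way the caller could look once nothing in the chain throws a bare Exception. This is a sketch under assumptions, not CloudStack code: the nested exception classes are simplified stand-ins, `LOG` is a plain list standing in for `s_logger` so the number of log entries can be counted, and `carefulCaller`, `entryPoint` and `sessionReset` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a caller that avoids both problems listed above. */
class CarefulCallerSketch {
    /** Simplified stand-in for the hypervisor-specific checked exception. */
    static class XenAPIException extends Exception {
        XenAPIException(String message) {
            super(message);
        }
    }

    /** Simplified stand-in for CloudStack's CloudRuntimeException. */
    static class CloudRuntimeException extends RuntimeException {
        CloudRuntimeException(String message) {
            super(message);
        }

        CloudRuntimeException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for s_logger so the number of log entries can be observed. */
    static final List<String> LOG = new ArrayList<>();
    static boolean sessionReset = false;

    static void responsibleMethod(boolean fail) throws XenAPIException {
        if (fail) {
            throw new XenAPIException("xen call failed");
        }
    }

    static void runtimeExceptionMethod(boolean fail) {
        if (fail) {
            throw new CloudRuntimeException("unexpected state");
        }
    }

    static void carefulCaller(boolean xenFails, boolean runtimeFails) {
        try {
            responsibleMethod(xenFails);
            runtimeExceptionMethod(runtimeFails);
        } catch (XenAPIException e) {
            // Only the specific type is caught, so special handling is possible.
            sessionReset = true;
            // Wrap with the cause instead of logging here; the entry point logs.
            throw new CloudRuntimeException("Unable to execute", e);
        }
        // A CloudRuntimeException from runtimeExceptionMethod is not caught
        // here, so it reaches the entry point without being logged twice.
    }

    /** Hypothetical entry point that logs each failure exactly once. */
    static void entryPoint(boolean xenFails, boolean runtimeFails) {
        try {
            carefulCaller(xenFails, runtimeFails);
        } catch (CloudRuntimeException e) {
            LOG.add("WARN Unable to execute: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        entryPoint(true, false);
        entryPoint(false, true);
        System.out.println(LOG.size() + " log entries, sessionReset=" + sessionReset);
    }
}
```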

Don't ever throw Exception itself. If you need a checked exception, either find one that fits your needs or create one yourself. If what you ran into shouldn't be possible or is due to a programmer mistake, throw CloudRuntimeException. To decide whether you need a CloudRuntimeException, ask yourself: is this similar to hitting a null pointer? NullPointerException is a runtime exception because, if the caller had wanted to handle the null case, they would have checked for it before calling. Checked exceptions should be thrown if and only if the caller has a reasonable chance of handling the exception in some way other than logging and reporting an error. Prefer CloudRuntimeException unless you have a good reason to throw a checked exception. Note the words "programmer mistake" here: user errors should be handled properly.
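
The rule of thumb above might be sketched as follows. `HostFullException`, `allocate` and `allocateWithFallback` are hypothetical names invented for this illustration, and the nested `CloudRuntimeException` is a simplified stand-in for the real class. The checked exception is used only where the caller has something better to do than log and give up.

```java
/** Sketch: checked when the caller can react, unchecked for programmer mistakes. */
class CheckedVsRuntimeSketch {
    /** Hypothetical checked exception: the caller can react, e.g. try another host. */
    static class HostFullException extends Exception {
        HostFullException(String message) {
            super(message);
        }
    }

    /** Simplified stand-in for CloudStack's CloudRuntimeException. */
    static class CloudRuntimeException extends RuntimeException {
        CloudRuntimeException(String message) {
            super(message);
        }
    }

    static String allocate(String host, int freeSlots) throws HostFullException {
        if (host == null) {
            // Programmer mistake, similar to hitting a null pointer: unchecked.
            throw new CloudRuntimeException("host must not be null");
        }
        if (freeSlots <= 0) {
            // A condition the caller has a reasonable chance of handling: checked.
            throw new HostFullException(host + " has no free slots");
        }
        return "vm-on-" + host;
    }

    /** The caller handles the checked exception by falling back to another host. */
    static String allocateWithFallback(String first, int firstFree,
                                       String second, int secondFree) throws HostFullException {
        try {
            return allocate(first, firstFree);
        } catch (HostFullException e) {
            return allocate(second, secondFree);
        }
    }

    public static void main(String[] args) throws HostFullException {
        System.out.println(allocateWithFallback("host-1", 0, "host-2", 4));
    }
}
```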

I

try {
    // some code
} catch (XenAPIException e) { // catch generic error here.
    s_logger.debug("There's an exception. Rolling back code: " + e.getMessage());
    // ...rollback some code
    throw e; // note there's no "new" here.
}

I

for (Task task : taskList) {
    try {
        // process task
    } catch (Exception e) {
        // ...handle exception and continue
    }
}
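
A runnable version of the loop above might look like this. It is only a sketch: `Runnable` stands in for the `Task` type, and `runAll` and the failure list are hypothetical choices made for this example. What matters is that the try/catch sits inside the loop, so one bad task cannot stop the rest.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: one failing task must not stop the remaining tasks. */
class TaskLoopSketch {
    /** Runs every task and returns the messages of the ones that failed. */
    static List<String> runAll(List<Runnable> taskList) {
        List<String> failures = new ArrayList<>();
        for (Runnable task : taskList) {
            try {
                task.run();
            } catch (Exception e) {
                // Handle the failure for this task only, then continue the loop.
                failures.add(e.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<Runnable> tasks = new ArrayList<>();
        tasks.add(() -> System.out.println("task 1 done"));
        tasks.add(() -> {
            throw new IllegalStateException("task 2 failed");
        });
        tasks.add(() -> System.out.println("task 3 done"));
        System.out.println("failures: " + runAll(tasks));
    }
}
```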

CloudStack Exceptions

CloudStack does have a list of well-known exceptions, and some of them are important to describe here.

Exception	Thrown By	Purpose	Usage
CloudRuntimeException	Everyone	An error has been hit that cannot be handled.	When using this exception, it is best to pack as much debugging information into the message as possible.
ResourceUnavailableException	Components that deal with resource allocation.	To serve as a parent class for when a physical resource is unusable when CloudStack wants to use it.	This exception must be thrown with the scope set in the exception. The scope tells the caller above whether this exception affects a host, storage pool, cluster, pod, or zone. The caller can then decide if it can retry.
InsufficientCapacityException	Components that deal with resource allocation.	To serve as a parent class for when a physical resource is out of capacity when CloudStack wants to use it.	This exception must be thrown with the scope set in the exception. The scope tells the caller above whether this exception affects a host, storage pool, cluster, pod, or zone. The caller can then decide if it can retry.
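
To show why the scope matters to the caller, here is a small sketch. Everything in it is an assumption made for illustration: the `Scope` enum, the two-argument constructor, `getScope` and the retry policy in `canRetryElsewhere` are hypothetical, and the real CloudStack exception classes may represent scope quite differently.

```java
/** Sketch: the scope on the exception lets the caller decide whether to retry. */
class ScopedExceptionSketch {
    /** Hypothetical scope values, taken from the list in the table above. */
    enum Scope { HOST, STORAGE_POOL, CLUSTER, POD, ZONE }

    /** Simplified stand-in; not the real CloudStack class. */
    static class ResourceUnavailableException extends Exception {
        private final Scope scope;

        ResourceUnavailableException(String message, Scope scope) {
            super(message);
            this.scope = scope;
        }

        Scope getScope() {
            return scope;
        }
    }

    /**
     * Caller-side policy assumed for this sketch: anything narrower than a
     * zone leaves other resources to try, so a retry elsewhere makes sense.
     */
    static boolean canRetryElsewhere(ResourceUnavailableException e) {
        return e.getScope() != Scope.ZONE;
    }

    public static void main(String[] args) {
        ResourceUnavailableException hostDown =
                new ResourceUnavailableException("Host 12 is not responding", Scope.HOST);
        ResourceUnavailableException zoneDown =
                new ResourceUnavailableException("Zone 1 is disabled", Scope.ZONE);
        System.out.println("retry after host failure: " + canRetryElsewhere(hostDown));
        System.out.println("retry after zone failure: " + canRetryElsewhere(zoneDown));
    }
}
```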

There is also a good reference to CloudStack exceptions and error codes here.
