
CloudStack has not had a strong tradition of enforcing exception and logging behaviour. However, do as we say and not as we do. Just because we weren't good at it doesn't mean you shouldn't be. And we are working very hard to be good at it.

Logging

CloudStack uses log4j. Yes, we could have used any of a number of logging facades out there. Yes, log4j is somewhat of an oldie, but it is a goodie. Besides, what's really important is not the tool but the content (a recurring theme you'll find in CloudStack). CloudStack should be deployed with logging at INFO level or above, and all logs should be timestamped in GMT. However, CloudStack DOES NOT require a restart to change logging levels. The following is a list of our logging levels and their suggested usage.

Level | Use When
FATAL | This ship's sunk. Or the JVM has to die due to this.
ERROR | The system has hit a problem that it cannot recover from. This error does not affect the general health of CloudStack but does error out a particular request to CloudStack.
WARN | The system has hit a problem that it thinks it can recover from, but the admin should be aware of it so they can take a look.
INFO | The admin is interested in knowing this information (like the pilot announcing "Grand Canyon is to your right" on the flight).
DEBUG | Information that may be helpful to the admin in debugging a problem. The deciding factor here is often this: if an admin can reliably reproduce a FATAL, ERROR, or WARN condition, turning on DEBUG logging should provide sufficient information about how they got to the error.
TRACE | Repetitive and annoying logs that really shouldn't be needed in normal debugging but may be useful as a last resort. Generally, the deciding factor on whether TRACE level is used is how fast this log can fill up the disk if it is turned on.
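
To make the idea of deploying "at INFO level or above" concrete, here is a minimal, self-contained sketch of how a configured threshold filters messages. It is not log4j itself: `LogLevel`, `isEnabledAt`, and `emittedAt` are hypothetical names used only for illustration, and the ordering simply mirrors the table above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a logging level, ordered from most to least severe. */
enum LogLevel {
    FATAL, ERROR, WARN, INFO, DEBUG, TRACE;

    /** A message is emitted when it is at least as severe as the configured threshold. */
    boolean isEnabledAt(LogLevel threshold) {
        return this.ordinal() <= threshold.ordinal();
    }
}

class LevelThresholdSketch {
    /** Returns the levels that would reach the log for a given threshold. */
    static List<LogLevel> emittedAt(LogLevel threshold) {
        List<LogLevel> emitted = new ArrayList<>();
        for (LogLevel level : LogLevel.values()) {
            if (level.isEnabledAt(threshold)) {
                emitted.add(level);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println("INFO deployment logs:  " + emittedAt(LogLevel.INFO));
        System.out.println("DEBUG deployment logs: " + emittedAt(LogLevel.DEBUG));
    }
}
```

With the threshold at INFO, DEBUG and TRACE messages are dropped, which is why those two levels are the right home for anything verbose.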

Exception and Exception Handling

There is plenty of wisdom out on the internet regarding exceptions and exception handling. Here are some general anti-patterns and, at the bottom of that page, there are links to other guidelines. There are a few that I would like to single out as important.

  1. If you are writing entry point code, you are responsible for catching all exceptions, both checked and unchecked, and properly logging the error message and exception stack trace. What is an entry point? It is the point where a thread enters our code base. For example, all API commands are entry points. If you are spinning off threads to do processing, the run() method of that thread is an entry point. If you are scheduling tasks to be run in a thread pool, each such task is an entry point. All such code should be wrapped as follows.
    Code Block
    
    try {
        // code...
    } catch (SpecificException e) { // placeholder for an exception specific to your code
        // specific exception handling and logging
    } catch (Exception e) {
        s_logger.warn("Caught unexpected exception", e);
        // exception handling code...
    }

  2. If you are not writing entry point code, then it's fine to expect that code above yours will catch and log the exception. However, it is your responsibility to make sure that the stack trace of the exception is captured in the log. Don't ever catch and throw a new exception without either logging the exception or including it as the cause of the new exception.
    Code Block
    
    try {
        // code...
    } catch (XenAPIException e) {
        // Do either this: s_logger.warn("Caught a xen api exception", e);
        // or throw new CloudRuntimeException("Caught a xen api exception", e);
        // Don't ever do JUST this.
        throw new CloudRuntimeException("Got a xen api exception");
    }

  3. Don't ever declare a method to throw Exception. This may seem like a nice, quick, and easy way to handle exceptions, but it forces the calling methods to catch Exception, which then hides all other checked exceptions in other parts of the code. Take for instance:
    Code Block
    
    public void irresponsibleMethod() throws Exception;
    public void responsibleMethod() throws XenAPIException;
    public void runtimeExceptionMethod(); // throws CloudRuntimeException that's not supposed to be logged until the entry point.
    public void innocentCaller() {
        try {
            irresponsibleMethod();
            responsibleMethod();
            runtimeExceptionMethod();
        } catch (Exception e) {
            s_logger.warn("Unable to execute", e);
            throw new CloudRuntimeException("Unable to execute", e);
            // What's wrong here?
            // 1. If the error was thrown from responsibleMethod, the caller has now forgotten to do special handling for XenAPIException.
            // 2. If the error was thrown from runtimeExceptionMethod, the caller logs it once here, and it will be logged again at the entry point.
        }
    }

  4. Don't ever throw Exception itself. If you need a checked exception, either find one that fits your needs or create one yourself. If what you run into shouldn't be possible or is due to programmer mistake, then throw CloudRuntimeException. To decide if you need a CloudRuntimeException, ask yourself this: is this similar to hitting a null pointer? NullPointerException is a runtime exception because if the caller wanted to handle the pointer being null, they would have handled it before calling. Checked exceptions should be thrown if and only if the caller has a reasonable chance of handling the exception in some way other than logging and reporting the error. Prefer CloudRuntimeException unless you have a good reason to throw a checked exception. Note the words "programmer mistake" here. User errors should be handled properly.
  5. If you have to do some error handling for an exception, don't throw a new exception; rethrow the original one. Rethrowing the original one allows the correct stack trace to be logged.
    Code Block
    
    try {
        // some code
    } catch (XenAPIException e) {
        // catch generic error here.
        s_logger.debug("There's an exception.  Rolling back code: " + e.getMessage());
        // ...roll back some code
        throw e; // note there's no "new" here.
    }

  6. If you have a background thread processing a list of equal items, it is important that the processing of each item is wrapped in its own try-catch block inside the loop. If you don't do this and there is any exception in processing one of the items, the items that have not been processed yet will not be processed. This can have disastrous consequences, as the background thread can keep coming back to the same list of items but every item after the item with the exception will never get processed.
    Code Block
    
    for (Task task : taskList) {
        try {
            process(task);
        } catch (Exception e) {
            // ...handle exception and continue
        }
    }
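
The rules above can be pulled together in one small, runnable sketch. Everything in it is an assumption made for illustration: `HostUnreachableException`, `ping`, `pingOrFail`, and `pingAll` are hypothetical names, the `CloudRuntimeException` class is a local stand-in for the real CloudStack class so the file compiles on its own, and `java.util.logging` is swapped in for log4j only to avoid an external dependency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Local stand-in for CloudStack's unchecked CloudRuntimeException (assumption for this sketch). */
class CloudRuntimeException extends RuntimeException {
    CloudRuntimeException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical checked exception: the caller has a real chance to react to it. */
class HostUnreachableException extends Exception {
    HostUnreachableException(String message) {
        super(message);
    }
}

class ExceptionRulesSketch {
    // java.util.logging is used here only so the sketch has no log4j dependency.
    private static final Logger LOG = Logger.getLogger(ExceptionRulesSketch.class.getName());

    /** Declares the specific checked exception instead of "throws Exception". */
    static void ping(String host) throws HostUnreachableException {
        if (host.startsWith("bad")) {
            throw new HostUnreachableException("No route to " + host);
        }
    }

    /** Not an entry point: wraps with the original as the cause and leaves logging to the caller. */
    static void pingOrFail(String host) {
        try {
            ping(host);
        } catch (HostUnreachableException e) {
            throw new CloudRuntimeException("Unable to ping " + host, e);
        }
    }

    /** Entry-point style loop: each item has its own try-catch so one failure cannot stall the rest. */
    static List<String> pingAll(List<String> hosts) {
        List<String> reachable = new ArrayList<>();
        for (String host : hosts) {
            try {
                pingOrFail(host);
                reachable.add(host);
            } catch (Exception e) {
                // Logged once, here, with the full stack trace including the cause.
                LOG.log(Level.WARNING, "Caught unexpected exception for " + host, e);
            }
        }
        return reachable;
    }

    public static void main(String[] args) {
        System.out.println(pingAll(Arrays.asList("host-1", "bad-host", "host-3")));
    }
}
```

The design point is that the exception is logged exactly once, at the entry point, and the stack trace logged there still contains the original `HostUnreachableException` as the cause.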


CloudStack Exceptions

CloudStack does have a set of well-known exceptions, and a few of them are important to describe here.

Exception | Thrown By | Purpose | Usage
CloudRuntimeException | Everyone. | An error has been hit that cannot be handled. | When using this exception, it is best to pack as much debugging information into the message as possible.
ResourceUnavailableException | Components that deal with resource allocation. | To serve as a parent class for when a physical resource is unusable when CloudStack wants to use it. | This exception must be thrown with the scope set in the exception. The scope tells the caller above whether this exception affects a host, storage pool, cluster, pod, or zone. The caller can then decide if it can retry.
InsufficientCapacityException | Components that deal with resource allocation. | To serve as a parent class for when a physical resource is out of capacity when CloudStack wants to use it. | This exception must be thrown with the scope set in the exception. The scope tells the caller above whether this exception affects a host, storage pool, cluster, pod, or zone. The caller can then decide if it can retry.
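
As an illustration of how a caller might use the scope to decide whether to retry, here is a minimal sketch. It does not use the real CloudStack classes: `Scope`, `ScopedUnavailableException`, `HostAction`, and `deployInCluster` are hypothetical names, and the retry policy (retry only when the failure is scoped to a single host) is an assumption chosen for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical scope values, taken from the list in the table above. */
enum Scope { HOST, STORAGE_POOL, CLUSTER, POD, ZONE }

/** Hypothetical stand-in for a scoped "resource unavailable" exception. */
class ScopedUnavailableException extends Exception {
    private final Scope scope;

    ScopedUnavailableException(String message, Scope scope) {
        super(message);
        this.scope = scope;
    }

    Scope getScope() {
        return scope;
    }
}

/** Hypothetical operation that may fail on a given host. */
interface HostAction {
    void runOn(String host) throws ScopedUnavailableException;
}

class ScopedRetrySketch {
    /**
     * Tries each host in turn. A HOST-scoped failure only rules out that host,
     * so the next one is tried. Any other scope is not tied to a single host,
     * so this sketch assumes another host will not help and rethrows.
     */
    static String deployInCluster(List<String> hosts, HostAction action)
            throws ScopedUnavailableException {
        ScopedUnavailableException last = null;
        for (String host : hosts) {
            try {
                action.runOn(host);
                return host;
            } catch (ScopedUnavailableException e) {
                if (e.getScope() != Scope.HOST) {
                    throw e; // rethrow the original exception, not a new one
                }
                last = e;
            }
        }
        if (last == null) {
            throw new ScopedUnavailableException("No hosts to try", Scope.CLUSTER);
        }
        throw last;
    }

    public static void main(String[] args) throws ScopedUnavailableException {
        HostAction action = host -> {
            if (host.equals("host-1")) {
                throw new ScopedUnavailableException("host-1 is down", Scope.HOST);
            }
        };
        System.out.println(deployInCluster(Arrays.asList("host-1", "host-2"), action));
    }
}
```

Without the scope, the caller could only log and give up; with it, a failure confined to one host becomes something the caller has a reasonable chance of handling, which is the case for a checked exception made earlier on this page.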


There is also a good reference to CloudStack exceptions and error codes here.