...

List of failures that should be covered by this engine:

  • Critical Errors
  • Persistence errors
  • IOOM errors (part of persistence errors?)
  • Critical system worker crashes
  • Assertion errors (should also be covered by the Throwable catch in every system worker)
  • Segmentation
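
A minimal sketch of how a caught Throwable might be mapped onto the failure categories listed above. The class `FailureClassifier`, the `Category` enum and the rule that an `IOException` counts as a persistence error are assumptions made for illustration only; they are not part of the proposed engine.

```java
import java.io.IOException;

/** Hypothetical helper, not part of the proposed engine. */
final class FailureClassifier {
    /** Illustrative categories loosely following the failure list above. */
    enum Category { ASSERTION_ERROR, OUT_OF_MEMORY, PERSISTENCE_ERROR, CRITICAL_ERROR, OTHER }

    /** Upper bound on cause-chain hops, so a malformed chain cannot loop forever. */
    private static final int MAX_DEPTH = 32;

    static Category classify(Throwable t) {
        // Walk the cause chain so that wrapped failures are still recognised.
        Throwable cur = t;
        for (int depth = 0; cur != null && depth < MAX_DEPTH; depth++) {
            if (cur instanceof AssertionError)
                return Category.ASSERTION_ERROR;
            if (cur instanceof OutOfMemoryError)
                return Category.OUT_OF_MEMORY;
            // Assumption: I/O failures are treated as persistence errors.
            if (cur instanceof IOException)
                return Category.PERSISTENCE_ERROR;
            cur = cur.getCause();
        }
        // Any other Error is treated as critical; everything else is left unclassified.
        return t instanceof Error ? Category.CRITICAL_ERROR : Category.OTHER;
    }
}
```

For example, `FailureClassifier.classify(new RuntimeException(new IOException("disk")))` yields `PERSISTENCE_ERROR` because the cause chain is inspected, not only the outermost exception.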

List of system workers that should be covered by this engine:

  • disco-event-worker
  • tcp-disco-sock-reader
  • tcp-disco-srvr
  • tcp-disco-msg-worker
  • tcp-comm-worker
  • grid-nio-worker-tcp-comm
  • exchange-worker
  • sys-stripe
  • grid-timeout-worker
  • db-checkpoint-thread
  • wal-file-archiver
  • ttl-cleanup-worker
  • nio-acceptor
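
The "Throwable catch for every system worker" idea could look roughly like the sketch below: each worker body is wrapped so that anything it throws, including an AssertionError, is reported to a single callback. `GuardedWorkers`, `FailureKind` and `FailureListener` are hypothetical names introduced here, not the engine's API.

```java
/** Hypothetical sketch, not the engine's actual API. */
final class GuardedWorkers {
    /** Illustrative failure kinds. */
    enum FailureKind { CRITICAL_ERROR, SYSTEM_WORKER_TERMINATION }

    /** Hypothetical callback invoked when a system worker fails. */
    interface FailureListener {
        void onFailure(String workerName, FailureKind kind, Throwable cause);
    }

    /** Wraps a worker body so that any Throwable it throws is reported to the listener. */
    static Runnable guard(String workerName, Runnable body, FailureListener lsnr) {
        return () -> {
            try {
                body.run();
            }
            catch (Throwable t) {
                // Errors (AssertionError, OutOfMemoryError, ...) are reported as critical;
                // other exceptions are reported as an unexpected worker termination.
                FailureKind kind = t instanceof Error
                    ? FailureKind.CRITICAL_ERROR
                    : FailureKind.SYSTEM_WORKER_TERMINATION;

                lsnr.onFailure(workerName, kind, t);
            }
        };
    }
}
```

A worker from the list above would then be started as, say, `new Thread(GuardedWorkers.guard("exchange-worker", body, listener), "exchange-worker").start()`, where `body` and `listener` are supplied by the caller.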

List of errors to be handled:

  • Persistence errors
  • IOOM errors (part of persistence errors?)
  • IO errors (list to be provided)
  • OOM (we should reserve some memory at node startup to increase the chances of handling an OOM)
  • Assertion errors (we should handle assertions as failures when the -ea flag is set) (should also be covered by the Throwable catch in every system worker)
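
The memory-reserve idea for OOM can be sketched as follows: a buffer is allocated at startup and dropped when an OutOfMemoryError is caught, so that the handler has some heap to work with. The class `OomReserve` and its methods are hypothetical, and the reserve size is an arbitrary value chosen by the caller. Releasing the buffer only makes the memory eligible for garbage collection, so this raises the chances of handling the error rather than guaranteeing it.

```java
import java.util.function.Consumer;

/** Hypothetical sketch of a heap reserve that is released when an OOM is caught. */
final class OomReserve {
    /** Buffer held from startup; null once released. */
    private byte[] reserve;

    OomReserve(int bytes) {
        reserve = new byte[bytes];
    }

    /** Returns true while the reserve has not been released yet. */
    synchronized boolean isHeld() {
        return reserve != null;
    }

    /** Drops the reference so the garbage collector may reclaim the buffer. */
    synchronized void release() {
        reserve = null;
    }

    /** Runs the body; on OutOfMemoryError releases the reserve first, then calls the handler. */
    void runGuarded(Runnable body, Consumer<OutOfMemoryError> handler) {
        try {
            body.run();
        }
        catch (OutOfMemoryError e) {
            release();
            handler.accept(e);
        }
    }
}
```

Whether assertions are enabled (the -ea case above) can be detected at runtime with the common idiom `boolean ea = false; assert ea = true;`, which leaves `ea` set to true only when the JVM runs with assertions on.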

Risks and Assumptions

// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.

...