...
List of failures that should be covered by this engine:
- Critical Errors
- Persistence errors
- IOOM errors (part of persistence errors?)
- Critical system worker crashes
- Assertion errors (should also be covered by the Throwable catch in every system worker)
- etc.
- Segmentation
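The categories above could be told apart roughly as in the sketch below. This is only an illustration: `FailureKind` and `FailureClassifier` are hypothetical names, not existing engine classes, the enum covers only a subset of the list, and treating every `IOException` as a persistence error is an assumption made for the example.

```java
import java.io.IOException;

/** Hypothetical failure categories (a subset of the list above); not an existing API. */
enum FailureKind {
    CRITICAL_ERROR,
    PERSISTENCE_ERROR,
    SYSTEM_WORKER_CRASH,
    ASSERTION_ERROR
}

/** Hypothetical classifier, assumed to be called from a system worker's Throwable catch. */
final class FailureClassifier {
    private FailureClassifier() {
        // Static helper only.
    }

    static FailureKind classify(Throwable t) {
        // AssertionError is itself an Error, so it must be checked before the generic Error branch.
        if (t instanceof AssertionError)
            return FailureKind.ASSERTION_ERROR;

        // Assumption for this sketch: any IOException is a persistence problem.
        if (t instanceof IOException)
            return FailureKind.PERSISTENCE_ERROR;

        // Remaining Errors (for example OutOfMemoryError) are treated as critical.
        if (t instanceof Error)
            return FailureKind.CRITICAL_ERROR;

        // Anything else that escaped a system worker counts as a worker crash.
        return FailureKind.SYSTEM_WORKER_CRASH;
    }
}
```

The order of the checks matters: a more specific type has to be tested before the broader type it extends.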
List of system workers that should be covered by this engine:
- disco-event-worker
- tcp-disco-sock-reader
- tcp-disco-srvr
- tcp-disco-msg-worker
- tcp-comm-worker
- grid-nio-worker-tcp-comm
- exchange-worker
- sys-stripe
- grid-timeout-worker
- db-checkpoint-thread
- wal-file-archiver
- ttl-cleanup-worker
- nio-acceptor
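One possible way to cover every worker in the list is to wrap each worker body in a single Throwable catch that reports to the engine. The sketch below assumes a hypothetical `WorkerFailureListener` callback and a hypothetical `GuardedWorker` wrapper; neither name comes from the existing code base.

```java
/** Hypothetical callback through which a worker reports its failure. */
interface WorkerFailureListener {
    void onFailure(String workerName, Throwable error);
}

/** Hypothetical wrapper that runs a worker body and reports anything it throws. */
final class GuardedWorker implements Runnable {
    private final String name;
    private final Runnable body;
    private final WorkerFailureListener listener;

    GuardedWorker(String name, Runnable body, WorkerFailureListener listener) {
        this.name = name;
        this.body = body;
        this.listener = listener;
    }

    @Override public void run() {
        try {
            body.run();
        }
        catch (Throwable t) {
            // Throwable covers Errors (AssertionError, OutOfMemoryError) as well as Exceptions.
            listener.onFailure(name, t);
        }
    }
}
```

A worker such as exchange-worker could then be started with `new Thread(new GuardedWorker("exchange-worker", body, listener), "exchange-worker").start()`, where `body` and `listener` are supplied by the caller.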
List of errors to be handled:
- Persistence errors
- IOOM errors (part of persistence errors?)
- IO errors (list to be provided)
- OOM (some memory should be reserved at node startup for this case, to increase the chances that OOM can be handled)
- Assertion errors (assertions should be handled as failures when the -ea flag is set; should also be covered by the Throwable catch in every system worker)
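The OOM and assertion items could be approached as in the following minimal sketch. `OomReserve` and `AssertionsCheck` are hypothetical helpers introduced only for illustration, and the size of the reserve is left to the caller. Releasing the reserve raises the chance that the handler can run, but does not guarantee it.

```java
/** Hypothetical holder of heap memory allocated at node startup and dropped when OOM is caught. */
final class OomReserve {
    private byte[] reserve;

    OomReserve(int bytes) {
        reserve = new byte[bytes];
    }

    /**
     * Drops the reference so the garbage collector can reclaim the reserved block.
     *
     * @return true if the reserve was still held, false if it had already been released.
     */
    synchronized boolean release() {
        boolean held = reserve != null;

        reserve = null;

        return held;
    }
}

/** Hypothetical helper that detects whether assertions are enabled (the -ea flag). */
final class AssertionsCheck {
    private AssertionsCheck() {
        // Static helper only.
    }

    static boolean enabled() {
        boolean on = false;

        // The assignment runs only when assertions are enabled for this class.
        assert on = true;

        return on;
    }
}
```

A handler that catches `OutOfMemoryError` would call `release()` first and only then try to log or stop the node. `AssertionsCheck.enabled()` could decide whether an `AssertionError` is treated as a failure.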
Risks and Assumptions
// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.
...