...

  • This system does not intend to solve the problem of dynamic workload balancing, i.e. Cruise Control. It may act as a building block for such a system later.
  • Solving the canary problem for the YARN-based deployment model is out of scope for this solution; however, the system built should be easily extensible to support canary.
  • This system will not have built-in intelligence to find a better host match for a container; it will make simplistic decisions based on the parameters passed by the user.

SLA / SCALE / LIMITS (Assumptions)

  • At any given time, the AM for a single job serves only one request per container; parallel requests across different containers are still supported. If a control request is underway, any other requests issued on the same container are queued. The same assumption holds for in-flight requests on an active container and its standby, i.e. if a container placement request is in progress for an active container or its standby replica, all subsequent placement actions on either are queued.
  • Queued actions are sorted by the timestamps populated by the client and are de-queued and executed in that order.
  • The system should be capable of scaling to serve different jobs at the same time.
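
The queueing assumptions above can be illustrated with a minimal sketch. This is not the design's actual implementation: the names `PlacementRequest`, `PlacementQueue`, `standby_of`, and the action strings are hypothetical and chosen only for illustration. It assumes each standby is mapped to its active container so that both share one queue, and that at most one request per such group is in flight at a time.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PlacementRequest:
    """Hypothetical request shape; field names are assumptions."""
    container_id: str
    timestamp_ms: int  # populated by the client
    action: str        # e.g. "MOVE" or "RESTART" (illustrative values)


class PlacementQueue:
    """One in-flight request per container group, the rest queued by timestamp."""

    def __init__(self, standby_of: Optional[Dict[str, str]] = None):
        # Maps a standby container id to its active container id, so that an
        # active container and its standby are serialized together.
        self._standby_of = dict(standby_of or {})
        self._pending: Dict[str, List[Tuple[int, int, PlacementRequest]]] = {}
        self._in_flight: Dict[str, PlacementRequest] = {}
        self._seq = 0  # tie-breaker for requests with equal timestamps

    def _group(self, container_id: str) -> str:
        return self._standby_of.get(container_id, container_id)

    def submit(self, request: PlacementRequest) -> None:
        """Queue a request under its container group, ordered by timestamp."""
        self._seq += 1
        heap = self._pending.setdefault(self._group(request.container_id), [])
        heapq.heappush(heap, (request.timestamp_ms, self._seq, request))

    def next_runnable(self) -> List[PlacementRequest]:
        """Start the oldest pending request of every group that is idle."""
        started = []
        for group, heap in self._pending.items():
            if heap and group not in self._in_flight:
                _, _, request = heapq.heappop(heap)
                self._in_flight[group] = request
                started.append(request)
        return started

    def complete(self, request: PlacementRequest) -> None:
        """Mark a request as finished so its group can serve the next one."""
        group = self._group(request.container_id)
        if self._in_flight.get(group) == request:
            del self._in_flight[group]
```

In this sketch, requests for different containers start in parallel, while a request for a standby blocks later requests on its active container (and vice versa) until `complete` is called. Ordering relies entirely on the client-supplied timestamp, with arrival order used only to break ties.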

...