This proposal changes the Invoker so that it no longer relies on the Invoker user-memory config. Specifically:
- ContainerPool can examine some cluster stats (NodeStats) to determine estimated ability to launch a container at any time
- ContainerFactory can emit NodeStats based on the cluster manager impl (mesos, k8s, yarn, etc)
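As a rough illustration of the two points above, the following Python sketch shows one possible shape for NodeStats and a launchability check. The proposal does not define the fields of NodeStats, so `mem_mb`, `cpu`, and the `can_launch` helper are assumptions made only for this example. The Invoker itself is written in Scala; Python is used here only for brevity.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class NodeStats:
    """Hypothetical per-node snapshot; the real field set is not defined here."""
    mem_mb: float  # free memory reported for the node, in MB (assumed field)
    cpu: float     # free CPU share reported for the node (assumed field)


def can_launch(stats: Dict[str, NodeStats], required_mb: float) -> bool:
    """Estimate whether any single node could host a container of this size."""
    return any(node.mem_mb >= required_mb for node in stats.values())


cluster = {
    "node-a": NodeStats(mem_mb=128.0, cpu=0.5),
    "node-b": NodeStats(mem_mb=512.0, cpu=1.0),
}
fits_small = can_launch(cluster, 256.0)   # node-b has enough free memory
fits_large = can_launch(cluster, 1024.0)  # no single node has enough
```

The check is only an estimate: the cluster manager still makes the final placement decision, and the stats may be stale by the time a launch is attempted.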
Note: to keep the changes isolated to the Invoker, we assume that whisk.container-pool.user-memory is set to a value large enough that the Controller always sends activations to the "home invoker" until that invoker becomes Unresponsive. This can be enhanced in the future to potentially:
- track additional invoker states (low memory, too many activations, too large activation requests/responses)
- shard based on other details (number of activations, activation request size, action code size, etc)
Main changes are to ContainerPool:
- receive and track NodeStats for the purpose of estimating launchability of containers
- track pending container starts/stops (Reservations) to avoid oversubscribing cluster resources (oversubscription could otherwise cause a potentially significant wait while the cluster manager determines that resources are not available)
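The bookkeeping described in these two bullets could look roughly like the Python sketch below. It is a simplification under stated assumptions: the class name `PoolEstimate` and its methods are hypothetical, the cluster is treated as a single pool of free memory rather than per-node, and a pending stop is counted as memory that will become available.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PoolEstimate:
    """Hypothetical tracker combining the latest stats with Reservations."""
    free_mb: float = 0.0
    # container id -> reserved MB; positive for pending starts,
    # negative for pending stops (memory expected to be freed)
    reservations: Dict[str, float] = field(default_factory=dict)

    def update_stats(self, free_mb: float) -> None:
        """Replace the free-memory figure with the latest reported value."""
        self.free_mb = free_mb

    def available_mb(self) -> float:
        """Free memory minus what pending starts/stops are expected to change."""
        return self.free_mb - sum(self.reservations.values())

    def reserve_start(self, container_id: str, mb: float) -> bool:
        """Reserve memory for a start; refuse if it would oversubscribe."""
        if mb > self.available_mb():
            return False
        self.reservations[container_id] = mb
        return True

    def reserve_stop(self, container_id: str, mb: float) -> None:
        """Record a pending stop that should free `mb` once it completes."""
        self.reservations[container_id] = -mb

    def release(self, container_id: str) -> None:
        """Drop a reservation once the start/stop has completed or failed."""
        self.reservations.pop(container_id, None)


pool = PoolEstimate()
pool.update_stats(512.0)
first = pool.reserve_start("a", 256.0)   # fits
second = pool.reserve_start("b", 256.0)  # fits, pool is now fully reserved
third = pool.reserve_start("c", 256.0)   # refused locally, no cluster round trip
```

Refusing the third start locally is the point of the Reservations: the pool avoids asking the cluster manager for resources it already expects to be taken. Counting pending stops as available is optimistic; a more conservative variant would ignore them until the next stats update confirms the memory is free.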
ContainerFactory:
Once the ContainerPool changes are agreed, ContainerFactory impls can be modified to make NodeStats data available, by whatever means is convenient for each impl.
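One convenient means might be a simple publish/subscribe hook, sketched below in Python. This is not part of the proposal: `StatsPublisher`, `subscribe`, and `publish` are hypothetical names, and the stats payload is reduced to a map of node name to free memory in MB.

```python
from typing import Callable, Dict, List

# Assumed payload: node name -> free memory in MB.
StatsListener = Callable[[Dict[str, float]], None]


class StatsPublisher:
    """Hypothetical hook a ContainerFactory impl could use to emit stats."""

    def __init__(self) -> None:
        self._listeners: List[StatsListener] = []

    def subscribe(self, listener: StatsListener) -> None:
        """Register a consumer, for example the ContainerPool."""
        self._listeners.append(listener)

    def publish(self, free_mb_by_node: Dict[str, float]) -> None:
        """Send each listener its own copy so none can mutate another's view."""
        for listener in self._listeners:
            listener(dict(free_mb_by_node))


received: List[Dict[str, float]] = []
publisher = StatsPublisher()
publisher.subscribe(received.append)
publisher.publish({"agent-1": 1024.0})
```

Each cluster manager impl would decide for itself how it gathers the numbers it publishes (polling, event stream, or otherwise); the ContainerPool only depends on receiving them.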
Diagram of changes: