
...

I believe we can (and should) converge on an architecture that abstracts the VM/bare-metal case away and gives the Controller direct access to the containers for each action. Lots of proposals out there already go in that direction (see: Problem Analysis - Autonomous Container Scheduling v1, Clustered Singleton Invoker for HA on Mesos), and it is a natural fit for the orchestrator-based deployments which the whole community seems to be moving to.

...

The overall architecture revolves around a clear distinction between user-facing abstractions (implemented by the Controller as of today) and the execution system, which is responsible for loadbalancing and scaling. On the hot path, only HTTP is spoken directly into the containers (similar to Problem Analysis - Autonomous Container Scheduling v1 in a sense). The Controller orchestrates the full workflow of an activation (calling `/run`, getting the result, composing an activation record, writing that activation record). This eliminates both Kafka and the Invoker as we know it today from the critical path. Picking a container boils down to choosing a free (warm) container from a locally known list of containers for a specific action in the ContainerRouter.
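
To make that hot path concrete, here is a minimal, hypothetical Scala sketch; `ContainerRef`, `runInContainer` and `writeActivationRecord` are illustrative names and assumptions, not existing OpenWhisk APIs:

```scala
import scala.concurrent.{ExecutionContext, Future}

// Minimal, hypothetical sketch of the hot path: no Kafka, no Invoker,
// just HTTP into a warm container and an activation record at the end.
case class ContainerRef(host: String, port: Int)
case class Activation(actionName: String, result: String)

class ContainerRouter(
    warmContainers: Map[String, List[ContainerRef]],          // warm containers known locally, per action
    runInContainer: (ContainerRef, String) => Future[String], // POST /run against the container
    writeActivationRecord: Activation => Future[Unit]         // persist the activation record
)(implicit ec: ExecutionContext) {

  def invoke(actionName: String, payload: String): Future[Activation] =
    warmContainers.getOrElse(actionName, Nil) match {
      case container :: _ => // pick a free (warm) container for this action
        for {
          result    <- runInContainer(container, payload) // the only hop on the critical path
          activation = Activation(actionName, result)
          _         <- writeActivationRecord(activation)
        } yield activation
      case Nil =>
        Future.failed(new IllegalStateException(s"no warm container for $actionName"))
    }
}
```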

Container creation happens in a component called the ContainerManager (which can reuse a lot of today's Invoker code, like the container factories). The ContainerManager is a cluster singleton (see Clustered Singleton Invoker for HA on Mesos). It talks to the API of the underlying container orchestrator to create new containers. If the ContainerRouter has no more capacity for a certain action, that is, it has exhausted the containers it knows for that action, it requests more containers from the ContainerManager.
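
A hypothetical sketch of that request flow (the names `OrchestratorClient` and `requestContainers` are assumptions, not the actual API):

```scala
import scala.concurrent.{ExecutionContext, Future}

case class ContainerRef(host: String, port: Int)

// Assumed interface to the underlying container orchestrator (Kubernetes, Mesos, ...).
trait OrchestratorClient {
  def createContainer(image: String): Future[ContainerRef]
}

class ContainerManager(orchestrator: OrchestratorClient)(implicit ec: ExecutionContext) {
  // Called by a ContainerRouter once all containers it knows for `action` are busy.
  def requestContainers(action: String, image: String, count: Int): Future[Seq[ContainerRef]] =
    Future.sequence(Seq.fill(count)(orchestrator.createContainer(image)))
}
```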

...

[draw.io diagram: flow]

Invocation path in steady state


2. Loadbalancing:

Loadbalancing is no longer a concern of the Controller; this responsibility moves into the ContainerRouter. The ContainerRouters no longer need to do any guesstimates to optimise container reuse. They are merely routers to containers they already know are warm; in fact, the ContainerRouters only know warm containers.
Loadbalancing instead becomes a concern at container-creation time, where it is important to choose the correct machine to place the container on. The ContainerManager can therefore exploit the container orchestrator's existing placement algorithms without reinventing the wheel, or it can do the placement itself when there is no orchestrator running underneath.
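
For the case without an orchestrator underneath, the placement step could be as simple as the following sketch; the "most free memory" heuristic is only an assumed example, not a decided algorithm:

```scala
// Hypothetical placement for the bare-metal/VM case: the ContainerManager
// itself picks the node with the most free memory that still fits the container.
case class Node(name: String, freeMemoryMB: Long)

object Placement {
  def chooseNode(nodes: Seq[Node], requiredMemoryMB: Long): Option[Node] =
    nodes
      .filter(_.freeMemoryMB >= requiredMemoryMB) // only nodes with enough headroom
      .sortBy(-_.freeMemoryMB)                    // prefer the emptiest node
      .headOption
}
```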

...

3. Scalability:

The ContainerRouters in this proposal need to be massively scalable, since they handle a lot more than the Controllers do today. They carry no state other than which containers exist for which action, so they can scale freely.
As each of those containers can only ever handle C concurrent requests (where C in the default case is 1), it is of utmost importance that the system can **guarantee** that there will never be more than C requests to any given container. Therefore, shared state between the ContainerRouters is not feasible due to its eventually consistent nature.
Instead, the ContainerManager divides the list of containers for each action disjointly across all existing ContainerRouters. If we have 2 ContainerRouters and 2 containers for action A, each ContainerRouter will get one of those containers to handle itself.
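
A sketch of how such a disjoint split could look (simple round-robin assignment; all names are illustrative):

```scala
// Hypothetical disjoint assignment of an action's containers to ContainerRouters.
object ContainerDistribution {
  def assign[C](containers: Seq[C], routers: Seq[String]): Map[String, Seq[C]] = {
    require(routers.nonEmpty, "at least one ContainerRouter is needed")
    containers.zipWithIndex
      .groupBy { case (_, i) => routers(i % routers.size) } // container i goes to router i mod n
      .map { case (router, cs) => router -> cs.map(_._1) }
  }
}

// With 2 routers and 2 containers for action A, each router gets exactly one:
// ContainerDistribution.assign(Seq("c1", "c2"), Seq("router0", "router1"))
//   == Map("router0" -> Seq("c1"), "router1" -> Seq("c2"))
```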

Edge case: If an action only has a very small number of containers (fewer than there are ContainerRouters in the system), the method described above runs into a problem. Since the front-door schedules round-robin or least-connected, it is impossible to decide which ContainerRouter a request needs to go to in order to hit one that has a container available.
In this case, the other ContainerRouters (those that didn't get a container) act as a proxy and send the request to a ContainerRouter that actually has a container (maybe even via an HTTP redirect). The ContainerManager decides which ContainerRouters will act as a proxy in this case, since it is the instance that distributes the containers.
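
A minimal sketch of that routing decision (types and names are assumptions for illustration only):

```scala
sealed trait RoutingDecision
case class RunLocally(container: String)   extends RoutingDecision // this router owns a warm container
case class ProxyTo(routerEndpoint: String) extends RoutingDecision // forward/redirect to the owning router

// ownContainers: action -> container this router owns
// proxyTargets:  action -> router endpoint designated by the ContainerManager
class EdgeCaseRouter(ownContainers: Map[String, String], proxyTargets: Map[String, String]) {
  def route(action: String): Option[RoutingDecision] =
    ownContainers.get(action).map(c => RunLocally(c))
      .orElse(proxyTargets.get(action).map(r => ProxyTo(r)))
}
```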

...

[draw.io diagram: edgecase]

4. Throttling/Limits:

Limiting in OpenWhisk should always revolve around resource consumption. The per-minute throttles we have today can be part of the front-door (like rate-limiting in nginx) but should not be part of the OpenWhisk core system.
Concurrency throttles and limits can be applied at container-creation time. When a new container is requested, the ContainerManager checks how many containers that specific user already consumes. If the user has exhausted their concurrency throttle, one of their own containers (of a different action) can be evicted to make space for the new container. If all of the user's containers already belong to the same action, the call will result in a `429` response.
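
A sketch of that decision at container-creation time (the names and the eviction policy shown here are assumptions):

```scala
// Hypothetical concurrency-limit check the ContainerManager could run
// before creating a new container for a user.
case class OwnedContainer(action: String)

sealed trait CreateDecision
case object Create                                 extends CreateDecision
case class EvictThenCreate(victim: OwnedContainer) extends CreateDecision
case object Reject429                              extends CreateDecision // maps to an HTTP 429

object ConcurrencyLimit {
  def decide(existing: Seq[OwnedContainer], requestedAction: String, limit: Int): CreateDecision =
    if (existing.size < limit) Create
    else existing.find(_.action != requestedAction) match {
      case Some(victim) => EvictThenCreate(victim) // evict a container of a different action
      case None         => Reject429               // all containers already serve this action
    }
}
```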

...

The rest of the container lifecycle is the responsibility of the ContainerRouters. They instruct the ContainerManagerAgent to pause/unpause containers. Removal (after a certain idle time or after a failure) is requested by the ContainerRouters from the ContainerManager again. The ContainerManager can also decide to remove containers when it needs to make space for other containers. If it comes to that conclusion, it will first ask the ContainerRouter owning this container to give it back (effectively removing it from its list). Only then can the container be removed.
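
The give-back handshake could look roughly like this sketch (interfaces are assumed, not existing code):

```scala
import scala.concurrent.{ExecutionContext, Future}

// Assumed interfaces: the router side releases the container from its warm
// list, the backend side performs the actual removal.
trait RouterClient {
  def releaseContainer(containerId: String): Future[Unit]
}
trait ContainerBackend {
  def remove(containerId: String): Future[Unit]
}

class LifecycleManager(router: RouterClient, backend: ContainerBackend)(implicit ec: ExecutionContext) {
  def evict(containerId: String): Future[Unit] =
    for {
      _ <- router.releaseContainer(containerId) // 1. owning ContainerRouter gives the container back
      _ <- backend.remove(containerId)          // 2. only then is the container removed
    } yield ()
}
```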

Not very fleshed out either; mostly a suggestion. We need to make sure the network traffic generated from the ContainerRouters to all the containers for managing their lifecycle is viable. Under bursts though, that should not matter because of the pause grace period.

...

The path to optimize gets a lot thinner, which could result in more easily trackable performance bottlenecks and less code to optimize for performance. In this architecture, all deployment cases look at the very same path when it comes to optimizing burst performance, for instance.

Possible execution backend implementations

  1. Raw VM/Bare-metal based: The ContainerManager and ContainerManagerAgent work in conjunction to create containers. The ContainerManager will need to implement a scheduling algorithm to place the container intelligently.
  2. Kubernetes based: The ContainerManager talks to the Kubernetes API to create new Pods.
  3. Mesos based: The ContainerManager talks to the Mesos API to create new containers.
  4. Knative based: The whole execution layer above vanishes and is replaced by Knative serving, which implements all the routing, scaling etc. itself.
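
One way to keep these backends pluggable is to let the ContainerManager program against a small common interface; the following is only a hypothetical sketch, not a decided design:

```scala
import scala.concurrent.Future

case class ContainerSpec(image: String, memoryMB: Int)
case class ContainerRef(host: String, port: Int)

// Hypothetical common interface with one implementation per backend above.
trait ContainerBackend {
  def create(spec: ContainerSpec): Future[ContainerRef]
  def remove(ref: ContainerRef): Future[Unit]
}

// e.g. a KubernetesBackend talking to the Kubernetes API to create Pods,
// a MesosBackend talking to the Mesos API, or a raw VM backend delegating
// to ContainerManagerAgents. In the Knative case this layer disappears.
```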