...

  • Invoke an authN/authZ service, reusing the existing authN/authZ implementation in OpenWhisk
  • It may be possible to implement this inside the proxy itself; e.g., NGINX supports extensibility through Lua thanks to OpenResty.
  • Routing 
    • Including support for sequences
  • Throttling 
    • Respect namespace limits
    • Respect Action level concurrency 
  • Caching the response, based on what the action returns. E.g., an action that validates an OAuth token could instruct the system to cache the response for that token until the token expires.
  • Support API Management
    • Otherwise the existing OpenWhisk Gateway can be reused
  • Support for observability: metrics, activation info, tracing, etc.
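
The response-caching requirement above could look roughly like the following. This is a minimal sketch under stated assumptions, not existing OpenWhisk code: the `ResponseCache` class and the `cache_ttl_seconds` field that an action would return to opt in to caching are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ResponseCache:
    """Caches an action response only for as long as the action asks for."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached response)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            # Entry is stale: evict it and report a miss.
            del self._entries[key]
            return None
        return response

    def put(self, key: str, response: Dict[str, Any]) -> None:
        # Hypothetical convention: the action opts in via "cache_ttl_seconds".
        ttl = response.get("cache_ttl_seconds")
        if ttl is not None and ttl > 0:
            self._entries[key] = (self._clock() + ttl, response)
```

For the OAuth example, the validating action would return a TTL equal to the remaining lifetime of the token, so the cached verdict never outlives the token itself.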

...

  1. The request arrives from a client
  2. Authentication and Authorization
    1. The Container Router validates the Authorization header with OpenWhisk Auth Service
    2. The response of the Auth Service is cached 
  3. Routing
    1. Check namespace limits
    2. Forward the request to a container selected from a list of warmed actions that the Action Router keeps. 
      1. (new) Streaming the request to the action would be nice; OpenWhisk does not support this, and such a feature could remove the max payload limits.
      2. (new) Websockets could also be supported, another missing feature in OpenWhisk.
  4. Container Proxy sidecar
    1. Check action concurrency limit
    2. Buffer a few more requests by queueing them in an overflow buffer. This may be useful when a cold start would take longer than queueing a few more requests. Blackbox actions that need to download the Docker image may benefit the most. This idea is inspired by Knative Serving.
  5. Invoke the action and return the response
    1. (new) Caching the action response could be another nice-to-have feature that is not implemented in OpenWhisk. Caching should be controlled by the action response.
  6. Collect activation info.
  7. Sequence support. If the action is part of a sequence, then the Router should have logic to invoke the next action in the sequence.
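
Step 4 of the flow above (concurrency check plus overflow buffer in the Container Proxy sidecar) might be sketched as follows. The class name, its parameters, and the three outcome strings are illustrative assumptions; only the "reject" outcome corresponds to the 429 response the proposal mentions.

```python
from collections import deque
from typing import Deque, Optional


class ContainerProxy:
    """Admission control for one action container (illustrative only)."""

    def __init__(self, max_concurrency: int, overflow_size: int) -> None:
        self.max_concurrency = max_concurrency
        self.overflow_size = overflow_size
        self.in_flight = 0
        self.overflow: Deque[str] = deque()

    def admit(self, request_id: str) -> str:
        """Return "run", "queued", or "reject" (the caller maps reject to 429)."""
        if self.in_flight < self.max_concurrency:
            self.in_flight += 1
            return "run"
        if len(self.overflow) < self.overflow_size:
            self.overflow.append(request_id)
            return "queued"
        return "reject"

    def complete(self) -> Optional[str]:
        """Finish one activation and start the oldest buffered request, if any."""
        if self.in_flight == 0:
            raise RuntimeError("complete() called with no activation in flight")
        self.in_flight -= 1
        if self.overflow:
            self.in_flight += 1
            return self.overflow.popleft()
        return None
```

A small overflow buffer trades a short queueing delay for avoiding a cold start; how large it should be relative to typical cold-start time is left open here, as in the proposal.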

...

When the Action Proxy is at capacity, it should return a 429 response to the Container Router. A Retry-After header could specify <delay-seconds> or <http-date> for a CircuitBreaker in the ContainerRouter, so that the router avoids routing to that action until the delay has elapsed. The time window for retry should ideally be computed from the response times observed by the Container Proxy. TBD
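
One possible shape for the router-side handling of a 429 with Retry-After is sketched below. The `CircuitBreaker` class and its method names are assumptions made for illustration; the parsing uses Python's standard `email.utils.parsedate_to_datetime` for the `<http-date>` form. How the proxy computes the delay from observed response times remains TBD and is not modeled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def retry_after_seconds(value: str, now: datetime) -> Optional[float]:
    """Parse a Retry-After value given as <delay-seconds> or <http-date>."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # Unparseable header: the caller decides what to do.
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())


class CircuitBreaker:
    """Tracks, per container, until when the router should avoid it."""

    def __init__(self) -> None:
        self._open_until: Dict[str, float] = {}

    def record_429(self, container: str, retry_after: str, now: datetime) -> None:
        delay = retry_after_seconds(retry_after, now)
        if delay is not None:
            self._open_until[container] = now.timestamp() + delay

    def allows(self, container: str, now: datetime) -> bool:
        return now.timestamp() >= self._open_until.get(container, 0.0)
```

In this sketch `now` must be a timezone-aware `datetime`; the router would call `allows` before selecting a container and skip any container whose breaker is still open.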

[Gliffy diagram: OpenWhisk-ColdStart-ControlPlane]


The green steps are additional steps required for cold-start:

4. Container Proxy returns a 429  indicating the action has reached its max concurrency and can't take more activations. If there's no container running for that action, skip to step 5.

5. Container Router goes to the DistributedContainerPool  to request a new container to be created

6. After the container is created, all Container Router instances are informed, and the activation proceeds as in the Flow for the warm container described above.

Control Plane

Candidates:

  • OpenWhisk Controller and Invoker, refactored into a single service that meets the requirements

Control Plane concerns:

  1. Cold-start actions - allocate resources
  2. Garbage Collect idle actions - de-allocate resources

The Control Plane should be used by the Data Plane only when cold-starting new actions.

DistributedContainerPool

This component is at the core of the Control Plane.

...

 It should be concerned with the following:

  • globalPool
    • Cluster-wide view of all running actions
    • Distributed map with the minimum data about actions needed for the ResourceAllocator and GC
    • It should sync with Kubernetes from time to time to update the state, in case a container dies or a Kubernetes operation kills that container
  • resourceAllocator - SingletonActor
    • It is in charge of starting containers on a node that has resources
    • When allocating resources, Placement Strategies should consider CPU, MEM, GPU, Network, and other resources an action might consume.
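
As an illustration of a placement strategy that considers several resource types, here is a minimal best-fit sketch. Best-fit is only one possible strategy and is an assumption of this example, not something the proposal prescribes; `Node`, `fits`, `place`, and the resource keys (`cpu`, `mem_mb`, `gpu`) are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    free: Dict[str, float]  # e.g. {"cpu": 2.0, "mem_mb": 4096.0}


def fits(node: Node, demand: Dict[str, float]) -> bool:
    """True if the node has enough of every requested resource."""
    return all(node.free.get(res, 0.0) >= amount for res, amount in demand.items())


def place(nodes: List[Node], demand: Dict[str, float]) -> Optional[Node]:
    """Best-fit: pick the fitting node with the least relative slack left over."""
    candidates = [n for n in nodes if fits(n, demand)]
    if not candidates:
        return None  # No capacity anywhere: the caller may scale up or reject.

    def slack(node: Node) -> float:
        # Fraction of each requested resource that would remain free.
        return sum(
            (node.free[res] - amount) / node.free[res]
            for res, amount in demand.items()
            if amount > 0
        )

    best = min(candidates, key=slack)
    for res, amount in demand.items():
        best.free[res] = best.free.get(res, 0.0) - amount
    return best
```

Because the allocator is a singleton, a function like `place` can mutate the free-resource view without coordination; a distributed allocator would need a different design.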

...

GC

...

  • garbageCollector - SingletonActor
    • It removes idle actions
    • It needs to be a singleton so that, when deciding which resources to free, it can avoid fragmentation. In other words, it should free resources so that the free space is as compact as possible.
      • This is particularly important when scaling down the nodes running actions
    • Its implementation should be configurable and swappable 
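
One swappable garbage-collection policy could prefer idle containers on the emptiest nodes, so that whole nodes drain and can be scaled down. The sketch below is an assumed heuristic for illustration only; `Container`, `select_for_removal`, and the idle threshold parameter are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Container:
    id: str
    node: str
    idle_seconds: float


def select_for_removal(
    containers: List[Container], idle_threshold: float, count: int
) -> List[Container]:
    """Pick up to `count` idle containers, draining the emptiest nodes first."""
    per_node = Counter(c.node for c in containers)
    idle = [c for c in containers if c.idle_seconds >= idle_threshold]
    # Fewest containers on the node first, then longest idle; id breaks ties.
    idle.sort(key=lambda c: (per_node[c.node], -c.idle_seconds, c.id))
    return idle[:count]
```

With this ordering, a node holding a single idle container is emptied before idle containers are taken from a busy node, which keeps the remaining load packed on fewer nodes.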

Management Plane

This can reuse the OpenWhisk implementation.

Candidates:

  • OpenWhisk Controller, slimmed for Management APIs

Previous discussions

Provide support for integration with Kubernetes. One approach could be to deploy and run the components on a Kubernetes provider as we do for Vagrant, Docker, Docker-Compose, and OpenStack.

...