Goals

[Figure: OpenWhisk-w-ClusterManager-highlevel]

  • Try to reuse Kubernetes components in OpenWhisk (OW)
  • Keep the existing developer experience and CLI

Current deployment

[Figure: OpenWhisk-with-ClusterManager-simplified]

Originally, OpenWhisk was built on the assumption that each Invoker is responsible for a single VM in the cluster. With a Cluster Manager this premise changes: a single Invoker could be in charge of the entire cluster, while the Cluster Manager is responsible for each VM. From the Invoker's perspective, the entire cluster looks like a single pool of resources.

The current OpenWhisk components, Controller and Invoker, have problems managing the same pool of resources. For example:

  • when 2 or more Invokers manage the same resources, conflicts may arise because the Invokers don't share any model of those resources
  • the load balancing logic in the Controller becomes less important, given that it doesn't matter which Invoker executes a given action: it will still execute on the same pool of resources
  • the max memory limit set per Invoker is also no longer useful

CNCF based Architecture

Given these new premises, and the experience the OpenWhisk community has in building a FaaS solution, can the OW system benefit from existing CNCF projects to simplify its implementation while keeping the same developer experience?

This document looks at some possible options to achieve this with Kubernetes and other solutions from the CNCF landscape.

Management, Control, and Data Plane

The OpenWhisk system can be decomposed into 3 distinct areas of concern, inspired by the design of network devices and systems.


[Figure: OpenWhisk-ManagementControlData-plane]

Management Plane

The management layer exposes an API that primarily serves developers who manage actions and APIs. The wsk CLI interacts with this layer.

OpenWhisk operators may also interact with this layer to manage namespaces.

Control Plane

The control plane layer has 2 main responsibilities: cold-starting new actions, and removing idle actions (garbage collection, GC).

This layer could be modeled on the JVM's memory management. In a JVM, memory is allocated and later freed with the help of a GC component. Similarly, a FaaS system needs to allocate resources such as CPU, memory, and disk space to each action; then, when an action becomes idle, the FaaS system needs to scale it down to zero, freeing those resources.

The control plane should:

  • use the Cluster Manager's API to start and stop actions;
  • inform the Data Plane when starting or removing an action;
  • provide a configurable GC that avoids fragmentation where possible; the less fragmentation there is, the more compact the pool of resources;
  • use mark-and-sweep GC logic to remove containers, allowing enough time for the Data Plane to stop sending traffic to the actions marked for removal.
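
The mark-and-sweep behavior in the last item could be sketched roughly as follows. This is only an illustration under stated assumptions: the names ActionContainer, MarkAndSweepGC, idle_timeout and grace_period are hypothetical and not part of OpenWhisk, and the actual call to the Cluster Manager is left as a comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionContainer:
    # Hypothetical record the Control Plane might keep per running action.
    name: str
    last_used: float                   # time of the last activation, in seconds
    marked_at: Optional[float] = None  # set once the container is marked for removal


class MarkAndSweepGC:
    """Two-phase collector: mark idle containers, sweep them after a grace period."""

    def __init__(self, idle_timeout: float, grace_period: float):
        self.idle_timeout = idle_timeout  # idle time before a container is marked
        self.grace_period = grace_period  # time the Data Plane gets to drain traffic
        self.containers = {}              # container name -> ActionContainer

    def add(self, container: ActionContainer) -> None:
        self.containers[container.name] = container

    def mark(self, now: float) -> list:
        """Mark containers idle for at least idle_timeout; return the newly marked names."""
        marked = []
        for c in self.containers.values():
            if c.marked_at is None and now - c.last_used >= self.idle_timeout:
                c.marked_at = now
                marked.append(c.name)
        return marked

    def sweep(self, now: float) -> list:
        """Remove marked containers whose grace period has elapsed; return their names."""
        removed = [
            c.name
            for c in self.containers.values()
            if c.marked_at is not None and now - c.marked_at >= self.grace_period
        ]
        for name in removed:
            # A real implementation would ask the Cluster Manager to stop the container here.
            del self.containers[name]
        return removed
```

Separating mark from sweep is what gives the Data Plane a window to stop routing traffic before a container actually disappears.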

The Control Plane should provide an API that the Data Plane uses to cold-start actions. It should also emit an event each time the resource allocation changes: whenever the GC removes idle containers, or whenever a new action is created, the Control Plane should notify all Data Plane instances of the change.
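
A minimal sketch of such an API, assuming a simple in-process publish/subscribe model, might look like this. The class name, method names, and event types ("action_started", "action_removed") are assumptions made for illustration; the proposal does not define a concrete interface or transport.

```python
class ControlPlane:
    """Hypothetical Control Plane that notifies every subscribed Data Plane instance."""

    def __init__(self):
        self._subscribers = []  # callbacks registered by Data Plane instances
        self.running = set()    # names of actions that currently have a container

    def subscribe(self, callback):
        """Register a Data Plane callback that receives every allocation event."""
        self._subscribers.append(callback)

    def _emit(self, event_type, action):
        event = {"type": event_type, "action": action}
        for callback in self._subscribers:
            callback(event)

    def cold_start(self, action):
        """Start a container for the action, then announce it."""
        # A real implementation would call the Cluster Manager's API here.
        self.running.add(action)
        self._emit("action_started", action)

    def remove(self, action):
        """Remove an idle container (e.g. from the GC), then announce it."""
        self.running.discard(action)
        self._emit("action_removed", action)


# Usage: two Data Plane instances both observe the same sequence of changes.
seen_by_a, seen_by_b = [], []
cp = ControlPlane()
cp.subscribe(seen_by_a.append)
cp.subscribe(seen_by_b.append)
cp.cold_start("hello")
cp.remove("hello")
```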

Data Plane

The data plane layer invokes actions as fast as possible. When an action needs to be cold-started, the Data Plane delegates this to the Control Plane and waits for the action to become ready before invoking it. Once an action is warmed up, the Data Plane is notified; if it was waiting for that event in order to run an activation, it resumes the execution.

The Data Plane invokes warm actions without going through the Control Plane. The only time the Control Plane is involved in an activation flow is when a cold start is required.
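
The activation flow described above can be sketched with asyncio. This is a sketch under assumptions rather than a proposed implementation: request_cold_start stands in for whatever call the Control Plane API ends up exposing, on_action_ready stands in for the "action is warm" notification, and endpoints are plain strings.

```python
import asyncio


class DataPlane:
    """Hypothetical Data Plane: warm actions are invoked directly, cold ones wait."""

    def __init__(self, request_cold_start):
        self._request_cold_start = request_cold_start  # assumed hook into the Control Plane
        self._warm = {}     # action name -> endpoint of a ready container
        self._pending = {}  # action name -> Future resolved once the action is warm

    def on_action_ready(self, action, endpoint):
        """Called when the Control Plane announces that an action is warm."""
        self._warm[action] = endpoint
        future = self._pending.pop(action, None)
        if future is not None and not future.done():
            future.set_result(endpoint)  # resume activations waiting on this action

    async def invoke(self, action, payload):
        endpoint = self._warm.get(action)
        if endpoint is None:
            # Cold path: ask the Control Plane once, then wait for the notification.
            future = self._pending.get(action)
            if future is None:
                future = asyncio.get_running_loop().create_future()
                self._pending[action] = future
                self._request_cold_start(action)
            endpoint = await future
        # Warm path: no Control Plane involvement. The string is a stand-in for a real call.
        return f"{endpoint} handled {payload}"


async def demo():
    cold_start_requests = []
    dp = DataPlane(cold_start_requests.append)
    first = asyncio.ensure_future(dp.invoke("hello", "p1"))
    await asyncio.sleep(0)                # let the first activation reach the cold path
    dp.on_action_ready("hello", "ep-1")   # simulated Control Plane notification
    r1 = await first
    r2 = await dp.invoke("hello", "p2")   # warm: no new cold-start request
    return cold_start_requests, r1, r2
```

In this sketch the second activation never touches the cold-start hook, which mirrors the rule that the Control Plane is only consulted on a cold start.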

The Data Plane should stop sending traffic to actions that are marked for removal by the Control Plane. The only exception is when an action marked for removal receives an activation in the meantime. In that case the Data Plane informs the Control Plane, which may choose either to clear the "mark for removal" and keep the action running, or to recycle the action by replacing it with a new one.

This layer should support sequences, and should enforce the default FaaS execution model, which sends only 1 request at a time to an action.
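
As an illustration of these two requirements, the sketch below serializes requests to one action container with a lock and runs a sequence by feeding each action's result to the next. It assumes an asyncio-based Data Plane; ActionContainer and run_sequence are hypothetical names, and a plain Python function stands in for the real action code.

```python
import asyncio


class ActionContainer:
    """Hypothetical wrapper that admits only one request at a time to an action."""

    def __init__(self, fn):
        self._fn = fn                # stand-in for the action's code
        self._lock = asyncio.Lock()  # serializes requests to this container
        self.in_flight = 0
        self.max_in_flight = 0       # recorded only to make the guarantee observable

    async def invoke(self, payload):
        async with self._lock:
            self.in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self.in_flight)
            try:
                await asyncio.sleep(0)  # stand-in for the time the action takes to run
                return self._fn(payload)
            finally:
                self.in_flight -= 1


async def run_sequence(containers, payload):
    """Run actions in order, passing each result to the next action."""
    for container in containers:
        payload = await container.invoke(payload)
    return payload


async def demo():
    increment = ActionContainer(lambda x: x + 1)
    double = ActionContainer(lambda x: x * 2)
    # Five concurrent activations are queued behind the lock, not run in parallel.
    results = await asyncio.gather(*(increment.invoke(i) for i in range(5)))
    sequence_result = await run_sequence([increment, double], 3)
    return results, increment.max_in_flight, sequence_result
```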


CNCF Projects to integrate with

TBD


Previous discussions

Provide support for integration with Kubernetes. One approach could be to deploy and run the components on a Kubernetes provider, as we already do for Vagrant, Docker, Docker-Compose, and OpenStack.

Proposal to be discussed on the dev list: https://lists.apache.org/thread.html/66b2111f8edb4a44466728d697c735f549971909600a02f1b585a9e7@%3Cdev.openwhisk.apache.org%3E


