This document describes the next version of Autonomous Container Scheduling.

Basics

Discussion and implementation of the future architecture of OW are currently in progress.

...


The following sections describe each part of ACS in more detail.

1. Segregation of container creation and activation processing

Activations live in a world of single- or double-digit milliseconds.

...

If there is no container, or the existing containers cannot handle all incoming activations in time, it sends ContainerCreation requests to an invoker via the invokerN topic.
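The decision described above can be sketched roughly as follows. This is only an illustration, not the ACS implementation: the function names, the per-container capacity parameter, and the message shape are all hypothetical. Only the ContainerCreation request and the invokerN topic naming come from this document.

```python
import math


def containers_to_create(queued, existing, per_container_capacity):
    """Return how many extra containers are needed (0 if capacity suffices).

    per_container_capacity is a hypothetical figure: how many queued
    activations one container is assumed to handle in time.
    """
    if queued <= 0:
        return 0
    needed = math.ceil(queued / per_container_capacity)
    return max(0, needed - existing)


def creation_requests(action, queued, existing, per_container_capacity, invoker_id):
    """Build (topic, message) pairs for the missing containers.

    The topic follows the invokerN naming from the text; the message
    dictionary is an assumed shape used only for illustration.
    """
    topic = f"invoker{invoker_id}"
    count = containers_to_create(queued, existing, per_container_capacity)
    return [(topic, {"type": "ContainerCreation", "action": action}) for _ in range(count)]
```

With no container at all, every queued activation counts toward the shortfall. Once enough containers exist, no request is produced and activations never wait on container creation.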

2. Location-independent scheduling


Reusing containers is critical for OW to achieve better performance.

...

In turn, traffic is more evenly distributed among invokers with location-independent scheduling.

3. MessageDistributor


In Kafka, the unit of parallelism is the number of partitions.

...

This guarantees that, even in the worst case, the number of running MessageDistributors equals the number of invoker slots.

4. ContainerProxy Lifecycle changes


Even pausing and unpausing a container can be an overhead, because it is relatively slow compared to action invocation.

...

ContainerProxy waits for up to 10s (configurable); if no subsequent request arrives, it is simply terminated.
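The idle timeout above can be sketched as a small state holder. This is a minimal sketch under assumptions: the class name IdleTimer and its methods are hypothetical, and time is passed in explicitly so the behaviour is easy to follow. Only the 10s default and the "terminate when no request arrives" rule come from this document.

```python
class IdleTimer:
    """Tracks the last request time and reports when the idle timeout elapses."""

    def __init__(self, timeout_s=10.0):
        self.timeout_s = timeout_s      # configurable, 10s by default
        self.last_request_at = 0.0
        self.terminated = False

    def on_request(self, now):
        """A new request arrived: restart the idle window."""
        if self.terminated:
            raise RuntimeError("container already terminated")
        self.last_request_at = now

    def tick(self, now):
        """Terminate if no request arrived within timeout_s; return the state."""
        if not self.terminated and now - self.last_request_at >= self.timeout_s:
            self.terminated = True
        return self.terminated
```

Each request restarts the window, so a container that keeps receiving traffic stays alive without any pause/unpause step.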


5. ETCD


In OpenWhisk, many cluster-wide resources and states are shared asynchronously.

...

Since controllers always try to send requests to the invoker with the least load, that indicates there are not enough resources in the cluster and it is time to add more invokers.
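As a rough illustration of least-load selection, the sketch below picks the invoker with the smallest load from a shared view of the cluster. Everything here is an assumption made for the example: the dictionary of loads stands in for state that would be shared through ETCD, and the saturation check (even the least-loaded invoker is at capacity) is one hypothetical way to read the scale-out signal, not necessarily the condition ACS uses.

```python
def pick_invoker(loads):
    """Return the name of the invoker with the least load, or None if empty.

    loads is an assumed mapping of invoker name -> current load.
    """
    return min(loads, key=loads.get) if loads else None


def cluster_is_saturated(loads, capacity):
    """Hypothetical scale-out signal: the least-loaded invoker is already full."""
    chosen = pick_invoker(loads)
    return chosen is None or loads[chosen] >= capacity
```

Because the choice always lands on the least-loaded invoker, a full result here means every other invoker is at least as busy.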

6. ActionMonitor


In ACS, once a container is running, it handles the requests in a best-effort manner.

...

Since a controller handles only a limited number of action kinds at the same time, we can safely assume that each controller holds a limited number of ActionMonitors at any given point.

7. Changes in Throttler


In the current code, a throttler checks the number of invocations per minute and the number of concurrent invocations.

...

And it only happens when the maximum number of containers is already running.


8. ETC

8.1 Handling of Action updates

An action can be updated at any time, even while it is being invoked.

...

Info

This kind of issue makes me feel the need for routing components. Since OW relies on Kafka for routing, its ability to route activations by our own rules is limited. This is one of the reasons why I have become a fan of the future architecture of OW.


8.2 Drawbacks of ACS

ACS is not an all-round scheduler. It has a few drawbacks.

1. It is more effective for short-running actions.

For long-running actions, such as those that run for one minute, it would be best to create a container for each invocation request.

...

Still, there is clearly room to optimize it.


2. Since a container is asynchronously created, the first invocation can take a little bit more time.

When the first ActionMonitor is created, there will be no container. The ActionMonitor then just sends ContainerCreation requests to invokers.

...

But subsequent invocations are faster than with the current scheduler because no other logic is involved in the activation path.


3. In a big cluster (e.g., more than 500~1000 controllers/invokers), it might not be effective because it relies on ETCD transactions and Kafka partitions.

As more and more partitions are used, it will take more time to rebalance the consumers in a group.

...