
Info

STATUS: Draft, WIP


This document captures requirements for making AI actions a first-class citizen in OpenWhisk. The main ideas captured below are inspired by the work done running deep learning inferences on images and videos.

...

  • A new action kind ( python3-ai ) that bundles the frameworks most commonly used for ML/DL ( e.g. SciPy, TensorFlow, Caffe, PyTorch, and others )
  • GPU support
  • tmpfs for sequences and compositions / "shared memory FaaS" ( Carlos Santana ) / "serverless datastore" ( Rodric Rabbah )
  • Higher memory limits ( 4 GB+ )
  • Longer execution time
  • Disk space allocation

New action kind - "python3-ai"

...
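One possible shape for such a runtime, sketched here as an assumption rather than a concrete proposal: extend the stock OpenWhisk Python 3 runtime image with the common frameworks. The framework list and versions are illustrative and unpinned.

```dockerfile
# Illustrative sketch only: a "python3-ai" kind built by extending the
# stock OpenWhisk Python 3 runtime image with common ML/DL frameworks.
FROM openwhisk/python3action
RUN pip install --no-cache-dir scipy tensorflow torch
```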

Send Asset by Reference

Developers can use an intermediate 3rd-party blob store such as AWS S3: upload the asset, then pass its URL, as "the reference" to it, to the next action. The problem with this workaround is that the size of the asset influences performance: the bigger the asset, the bigger the impact. For a 2 GB asset, uploading to and downloading from blob storage on each activation may add up to a minute to the execution time.
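A minimal sketch of the pass-by-reference pattern between two actions in a sequence. A local temp file stands in for the S3 object here; in practice the upload would go through an S3 client and the reference would be a (presigned) object URL. Function and field names are illustrative.

```python
# Sketch of pass-by-reference between sequence actions: the first action
# "uploads" a large asset and returns only a reference to it; the second
# action resolves the reference instead of receiving the bytes inline.
import json
import tempfile
from pathlib import Path

def produce(params):
    asset = b"\x00" * 1024                    # stand-in for a large image/model
    path = Path(tempfile.mkstemp()[1])
    path.write_bytes(asset)                   # "upload" to blob storage
    return {"assetRef": str(path)}            # pass only the reference onward

def consume(params):
    data = Path(params["assetRef"]).read_bytes()  # "download" by reference
    return {"size": len(data)}

# Simulate the sequence produce -> consume, as OpenWhisk would chain them.
result = consume(produce({}))
print(json.dumps(result))  # {"size": 1024}
```

The payload flowing through the controller stays tiny regardless of asset size; the cost moves to the upload/download on each activation, which is exactly the overhead the section above describes.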

...

The problem with this approach is the network bottleneck: even if an action happens to land on the same host as other actions in the sequence, it still consumes network bandwidth to write or read an asset from the cluster cache, so it remains dependent on network speed.

Streaming responses between actions

In this option there is no persistence involved; actions communicate directly, or through a proxy capable of streaming the response.

  • Direct action-to-action communication. The most performant option would be to allow actions to communicate directly, ideally scheduling them on the same host. The problem introduced by this approach is network isolation: OpenWhisk deployments usually create a network perimeter around each action, preventing it from poking at other actions in the cluster. To open up action-to-action communication, each namespace should probably get its own network perimeter.
  • Action-to-proxy-to-action. This is similar to how Kafka is used today, but to work around Kafka's payload limits another proxy could be used. The communication should be secured in this case as well, so that only the allowed actions can communicate.
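The direct, same-host case can be sketched with two stand-in "actions": one serving a large asset in chunks over HTTP and one consuming it chunk by chunk without buffering the whole payload. The port, path, and payload are illustrative only; a real deployment would also need the per-namespace network perimeter discussed above.

```python
# Sketch of direct action-to-action streaming on the same host: a producer
# "action" streams an asset over HTTP and a consumer reads it in chunks.
import http.server
import threading
import urllib.request

CHUNK = 1024 * 64

class AssetHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        payload = b"x" * (CHUNK * 4)          # stand-in for a model/tensor
        self.send_response(200)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        for i in range(0, len(payload), CHUNK):
            self.wfile.write(payload[i:i + CHUNK])  # stream chunk by chunk

    def log_message(self, *args):             # silence request logging
        pass

def consume(url):
    total = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)               # process each chunk in place
    return total

server = http.server.HTTPServer(("127.0.0.1", 0), AssetHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
received = consume(f"http://127.0.0.1:{server.server_address[1]}/asset")
server.shutdown()
print(received)  # 262144
```

Because the consumer never holds more than one chunk in memory, the pattern avoids both the blob-storage round trip and the large-payload limits of the message bus.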

Higher memory limits

Problem

Some DL/ML algorithms need more memory to process a data input. Existing limits are too low.

Workarounds

Configure OpenWhisk deployments with higher memory limits.
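For example, assuming the deployment-wide memory ceiling has already been raised to allow it, a per-action limit can be requested with the standard wsk CLI (the action name is illustrative):

```shell
# Request 4 GB for a hypothetical "inference" action (value is in MB);
# this only takes effect if the deployment's maximum memory limit allows it.
wsk action update inference --memory 4096
```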

Proposal

TBD: allocation implications in the Scheduler / Load Balancer.

Longer execution time 

Problem

During a cold start, an AI action may have to perform a one-time initialization, such as downloading a model from the model registry.

Some actions may also simply need more time to complete an operation.

Workarounds

Configure the OpenWhisk deployment with a higher execution time limit ( CONFIG_whisk_timeLimit_max )
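As with memory, the per-action timeout can then be raised via the wsk CLI, up to whatever maximum the deployment allows (the action name is illustrative):

```shell
# Allow a hypothetical "inference" action up to 10 minutes (value is in ms);
# bounded above by the deployment's CONFIG_whisk_timeLimit_max setting.
wsk action update inference --timeout 600000
```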

Proposal

TBD: side effects for increasing the timeout.

...