You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 12 Next »

Status

Authors: Jian He, Weiqing Yang

Current state: UNDER DISCUSSION

Discussion thread: <link to mailing list DISCUSS thread>

JIRA Unable to render Jira issues macro, execution error.

Released: 

Problem

Currently Samza applications can be run on YARN or in standalone mode. This SEP enables Samza on Kubernetes. 

Background

This section describes a bit of background of Kubernetes. If you are familiar with Kubernetes, you can skip this section.

Kubernetes is a container orchestration system for automating deployment, scaling, and management of containerized applications. Kubernetes consists of Master components and Node components.


Figure 1. Kubernetes architecture

Master Components

  • API-server: Component on the master that exposes the Kubernetes API. It is the front-end for the Kubernetes control plane.

  • Scheduler: Component on the master that watches newly created Pods that have no node assigned, and selects a node for them to run on.

API-server + Scheduler is similar to the ResourceManager in YARN. The difference is that API-server and scheduler runs in separate containers/processes, whereas ResourceManager is a single process.

  • Controller/Operator: Kubernetes ships built-in controllers such as StatefulSet that ensures the number of replicas for an application is up and running.  Users can also develop their own operators (The idiom in Kubernetes community to usually call user-developed controllers as operator). In this case, the operator is similar to the AM in YARN.

Node Components

  • Kubelet: An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod. This is similar to NodeManager in YARN.

  • Pod: A group of one or more containers, with shared storage/network, and a specification for how to run the containers (refer to here). Note that Pod is similar to the container concept in YARN.

  • Container-runtime: The container runtime is the software that is responsible for running containers, most notably, docker.

Proposed Changes

The Samza Operator, similar to the Samza AM in YARN, is the control hub for Samza applications running on Kubernetes. It is responsible for requesting Pods from Kubernetes and coordinating work assignment across Pods.  

Below graph describes the lifecycle of a Samza application running on Kubernetes.


Figure 2. Lifecycle of Samza applications running on Kubernetes


  • The run-app.sh script is started providing the location of your application’s binaries and its config file. The script instantiates an ApplicationRunner, which is the main entry-point responsible for running your application.

  • The ApplicationRunner parses your configs and writes them to a special Kafka topic named - the Coordinator Stream for distributing them. It proceeds to submit a request to Kubernetes API-server to launch the Samza-Operator Pod.

  • The Samza-Operator Pod (The AM, in YARN’s parlance) is started, It is then responsible for managing the overall application. It reads configs from the Coordinator Stream and computes work-assignments for individual Pods.

  • It also determines the hosts each Pod should run on taking data-locality into account. It proceeds to send Pod creation requests to API-server.

  • The Kubelet will watch the requests and start the task Pods. If the application’s dependencies are hosted in remote artifact repositories like HDFS. They need to be downloaded to the nodes first. How to download?

    • M1: the task Pod can leverage the Kubernetes Init-container functionality to download the dependencies.

    • M2: the regular container can download the dependencies first before executing the core logic.

      • M1 vs M2: The Init-containers is ensured to be run before regular containers. In M1, if the regular container fails, the Init-container will not be re-run.  In M2, if the regular container fails, it needs to handle the case to not re-run the logic to download the resources.

    • M3: the other way is to pre-bake all the dependencies into the container image itself, but that is less flexible as it requires all the code, configs to be available in the image. Regardless of M1 or M2, this method can always be used.

  • When the task Pod is started, each Pod first queries the Samza Operator to determine its work-assignments and configs. It then proceeds to execute its assigned tasks.

  • The Samza Operator does the typical control-loop pattern, ensures the current state matching the desired state. e.g. It monitors how many task Pods are alive and creates new Pods to match the desired replicas.

Host Affinity & Kubernetes

This document describes a mechanism to allow Samza to request containers from YARN on a  specific machine. This locality-aware container assignment is particularly useful for containers to access their local state on the machine. This mechanism leverages a feature in YARN to be able to request container by hostname.

Similar primitive is provided in Kubernetes to allow users to request pods by hostname. This document describes the feature. Particularly, “preferredDuringSchedulingIgnoredDuringExecution” policy can be used to “run my pod on host X, if not satisfied, run it elsewhere.”

Alternatively, If a remote storage, instead of local, can be used for persisting Samza task state, the goal of container-state-rebinding can be achieved by dynamically attach the remote storage to the container even if the container is restarted on a different host, by leveraging the Kubernetes PersistentVolume primitive. This is usually useful in a cloud environment where remote storage is typically accessible.

Implementation

  1. Prepare a base container image for Samza application including all the Samaza framework jars etc.

  2. The run-app.sh and ApplicationRunner needs to be modified to support submitting apps to Kubernetes api-server.

  3. Develop a Samza Operator, similar to YARN AM, that creates task Pods in Kubernetes.

  4. Develop a module that can download the dependencies from remote artifact repositories. The module can be in the init-container or embedded in the main/regular container.

Interfaces

WIP .. Coming soon

Reference


  • No labels