...

  • Application dependencies should not be able to impact the Samza cluster-based job coordinator
  • The solution should be reusable for the Samza logic running on processing containers

Design

New configs

Config key | Description
samza.cluster.based.job.coordinator.dependency.isolation.enabled | Set to "true" to enable cluster-based job coordinator dependency isolation
yarn.resources.__samzaFrameworkApi.path | Path to the Samza framework API resource
yarn.resources.__samzaFrameworkApi.* | Any other YARN resource configurations for the Samza framework API resource
yarn.resources.__samzaFrameworkInfrastructure.path | Path to the Samza framework infrastructure resource
yarn.resources.__samzaFrameworkInfrastructure.* | Any other YARN resource configurations for the Samza framework infrastructure resource

Existing JAR management

Currently, Samza infrastructure code and dependencies are included in the tarball with the Samza application. This means that conflicting dependencies between the application and Samza are resolved at build time, before the tarball is created, so only one version of a conflicting dependency ends up in the tarball and the other versions are excluded. All JARs in the tarball are installed into a single directory for classpath generation and execution.

...

Generating the Samza API whitelist

In order to load the Samza API classes from the API classloader, we need to tell cytodynamics what those classes are. We can do this by providing a whitelist of packages/classes when building the cytodynamics classloader. All public interfaces/classes inside samza-api should be considered API classes. One way to generate this whitelist is to use a Gradle task that finds all the classes in samza-api and writes that list to a file. Samza can then read that file when constructing the cytodynamics classloader. The Gradle task should also include classes from samza-kv.
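
As a rough illustration of the whitelist generation described above, the following Java sketch lists the class names found in one or more JARs and writes them to a file, one per line. It is not the actual Gradle task: the class name ApiWhitelistGenerator, the command-line argument layout, and the one-class-per-line output format are all assumptions made for this example. It also lists every class entry in the JAR and does not filter by visibility, so a real implementation would still need to restrict the list to public interfaces/classes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Samza): collects class names from JARs
// such as the samza-api and samza-kv artifacts into a whitelist file.
public class ApiWhitelistGenerator {

  // Converts a JAR entry name such as "org/apache/samza/task/StreamTask.class"
  // into the class name "org.apache.samza.task.StreamTask".
  public static String toClassName(String entryName) {
    String withoutSuffix =
        entryName.substring(0, entryName.length() - ".class".length());
    return withoutSuffix.replace('/', '.');
  }

  // True for compiled class entries, skipping package and module descriptors.
  public static boolean isClassEntry(String entryName) {
    return entryName.endsWith(".class")
        && !entryName.endsWith("package-info.class")
        && !entryName.endsWith("module-info.class");
  }

  // Returns the sorted class names contained in the given JAR.
  public static List<String> listClassNames(Path jarPath) throws IOException {
    try (JarFile jar = new JarFile(jarPath.toFile())) {
      return jar.stream()
          .map(entry -> entry.getName())
          .filter(ApiWhitelistGenerator::isClassEntry)
          .map(ApiWhitelistGenerator::toClassName)
          .sorted()
          .collect(Collectors.toList());
    }
  }

  // Assumed usage: <jar> [<jar> ...] <output-file>
  public static void main(String[] args) throws IOException {
    if (args.length < 2) {
      System.err.println("usage: ApiWhitelistGenerator <jar> [<jar> ...] <output-file>");
      System.exit(1);
    }
    List<String> whitelist = new ArrayList<>();
    for (int i = 0; i < args.length - 1; i++) {
      whitelist.addAll(listClassNames(Paths.get(args[i])));
    }
    // One class name per line, to be read back when building the classloader.
    Files.write(Paths.get(args[args.length - 1]), whitelist);
  }
}
```

A Gradle task could invoke a helper like this with the samza-api and samza-kv JARs as inputs and package the resulting file so that it is available when the cytodynamics classloader is constructed.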

...

  1. Continue to use "yarn.package.path" for the application package.
  2. Set "yarn.resources.__samzaFrameworkApi.path" to the path for the API package.
  3. Set "yarn.resources.__samzaFrameworkInfrastructure.path" to the path for the infrastructure package.
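
Put together, a job configuration following the steps above might look like the fragment below. The config keys come from the "New configs" table; the HDFS locations and package file names are placeholders invented for this example, not real paths.

```properties
# Enable dependency isolation for the cluster-based job coordinator
samza.cluster.based.job.coordinator.dependency.isolation.enabled=true

# Application package (placeholder path)
yarn.package.path=hdfs://namenode/path/to/my-application.tgz

# Samza framework API package (placeholder path)
yarn.resources.__samzaFrameworkApi.path=hdfs://namenode/path/to/samza-framework-api.tgz

# Samza framework infrastructure package (placeholder path)
yarn.resources.__samzaFrameworkInfrastructure.path=hdfs://namenode/path/to/samza-framework-infrastructure.tgz
```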

...