...
For in-memory streams, the API initializes the stream by spinning up a Samza producer that uses an InMemorySystemProducer to write to it; this is how a collection of data or events gets initialized as a stream. It also configures any output streams the user has added.
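The idea of turning a collection into a partitioned in-memory stream can be sketched in plain Java. This is only an illustration: `InMemoryStreamSketch` and its methods are hypothetical names, not Samza classes, and the round-robin placement of messages is an assumption made for the example rather than the framework's documented behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an in-memory stream; not part of Samza.
final class InMemoryStreamSketch<T> {
    // Partition id -> ordered messages written to that partition.
    private final Map<Integer, List<T>> partitions = new LinkedHashMap<>();

    private InMemoryStreamSketch(int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("partitionCount must be >= 1");
        }
        for (int p = 0; p < partitionCount; p++) {
            partitions.put(p, new ArrayList<>());
        }
    }

    // Plays the role of the producer: writes each element of the
    // collection to a partition (round-robin is an assumption).
    static <T> InMemoryStreamSketch<T> fromCollection(List<T> data, int partitionCount) {
        InMemoryStreamSketch<T> stream = new InMemoryStreamSketch<>(partitionCount);
        int next = 0;
        for (T message : data) {
            stream.partitions.get(next).add(message);
            next = (next + 1) % partitionCount;
        }
        return stream;
    }

    // Returns a read-only snapshot of one partition's messages.
    List<T> read(int partition) {
        return Collections.unmodifiableList(new ArrayList<>(partitions.get(partition)));
    }

    int partitionCount() {
        return partitions.size();
    }
}
```

A test would build the stream once from its input collection and then hand the partitions to the job under test.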
Data Transformation:
This is the Samza job you write, using either the Low Level API or the fluent High Level API. The test framework provides APIs to set up tests for both cases, and supports both in single-container and multi-container mode. Users implement StreamTask or AsyncStreamTask the same way they do for their Samza job and pass it to the framework. For the High Level API, users don't need a class implementing StreamApplication; they configure message streams of any type and apply operators on them directly (see the sample example below).
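The difference between the two styles can be illustrated with a small plain-Java sketch. Nothing here is Samza's API: `SketchTask` is a hypothetical analogue of a per-message task callback, and `java.util.stream` merely stands in for a fluent operator chain. Both paths apply the same filter-and-map logic to a bounded input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;

final class TransformationSketch {
    // Hypothetical per-message callback; not Samza's StreamTask.
    interface SketchTask<I, O> {
        void process(I message, Consumer<O> collector);
    }

    // Low-level style: the harness feeds messages to the task one by one
    // and gathers whatever the task emits through the collector.
    static <I, O> List<O> runTask(List<I> input, SketchTask<I, O> task) {
        List<O> output = new ArrayList<>();
        for (I message : input) {
            task.process(message, output::add);
        }
        return output;
    }

    // High-level style: operators are applied directly to the stream,
    // with no task class. Keeps even numbers and multiplies them by 10.
    static List<Integer> runOperators(List<Integer> input) {
        return input.stream()
                .filter(x -> x % 2 == 0)
                .map(x -> x * 10)
                .collect(Collectors.toList());
    }
}
```

In the low-level style the user supplies the task, for example `(m, c) -> { if (m % 2 == 0) c.accept(m * 10); }`, while in the high-level style the logic lives in the operator chain itself.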
Data Validation:
Data Types & Partitions:
Samza provides complete flexibility in the data types used for input streams, and the test framework likewise supports both primitive and derived data types. Serdes are required for local Kafka streams and file streams, but in-memory streams do not require any Serde configuration. The test framework will provide APIs for initializing input streams, both single-partition and multi-partition (data injection), and for validating single-partition and multi-partition bounded streams by verifying actual results against expected results (data validation).
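Per-partition validation of a bounded stream amounts to comparing the expected and actual messages of each partition. A minimal sketch, assuming the output has already been read into a map from partition id to message list; `PartitionValidationSketch` and `findMismatches` are hypothetical names, not framework APIs:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class PartitionValidationSketch {
    // Compares expected and actual messages partition by partition and
    // returns one description per mismatching partition (empty if all match).
    static <T> List<String> findMismatches(Map<Integer, List<T>> expected,
                                           Map<Integer, List<T>> actual) {
        // Check every partition that appears on either side, in sorted order.
        Set<Integer> partitionIds = new TreeSet<>(expected.keySet());
        partitionIds.addAll(actual.keySet());

        List<String> mismatches = new ArrayList<>();
        for (Integer p : partitionIds) {
            List<T> want = expected.getOrDefault(p, Collections.emptyList());
            List<T> got = actual.getOrDefault(p, Collections.emptyList());
            // List equality is order-sensitive, matching in-partition ordering.
            if (!want.equals(got)) {
                mismatches.add("partition " + p + ": expected " + want + " but was " + got);
            }
        }
        return mismatches;
    }
}
```

A single-partition stream is simply the case where both maps hold one entry, so the same check covers both scenarios.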
Running Config
Traditionally, users have to set up the config for any Samza job. For tests, the framework sets up the basic config boilerplate and still lets users add custom config, which is rarely needed. The API exposes functions to configure single-container or multi-container mode (using ZooKeeper), and functions to configure concurrency semantics for the Samza job.
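The layering of boilerplate defaults, mode selection, and optional user overrides might look roughly like the sketch below. `TestConfigSketch` and `build` are hypothetical, the ZooKeeper address is a placeholder, and the config keys and coordinator class names are written from memory of the Samza configuration reference, so they should be checked against it before use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TestConfigSketch {
    // Builds a config map: defaults first, then mode and concurrency,
    // then user overrides, which win on key collisions.
    static Map<String, String> build(String jobName,
                                     boolean multiContainer,
                                     int maxConcurrency,
                                     Map<String, String> customConfig) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("job.name", jobName);

        // Assumed key and class names; verify against the Samza config docs.
        if (multiContainer) {
            config.put("job.coordinator.factory",
                    "org.apache.samza.zk.ZkJobCoordinatorFactory");
            // Placeholder address for a locally running ZooKeeper.
            config.put("job.coordinator.zk.connect", "localhost:2181");
        } else {
            config.put("job.coordinator.factory",
                    "org.apache.samza.standalone.PassthroughJobCoordinatorFactory");
        }

        // Concurrency setting for the job (assumed key name).
        config.put("task.max.concurrency", String.valueOf(maxConcurrency));

        // Custom config is applied last so it can override any default.
        config.putAll(customConfig);
        return config;
    }
}
```

Applying the custom entries last keeps the common case free of configuration while still letting a test override any default.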
Future Changes with Stream Descriptors:
The test framework is designed so that users need little or no Samza configuration. In the future, we intend to use StreamDescriptors in the test framework to handle Samza configs. This would slightly change the way users pass their custom configs (if any) to the test framework.
Public Interfaces
There are two APIs for writing tests: the Low Level Test API (TestTask) and the High Level Test API (TestApplication).
...
Implementation and Test Plan
Implementatio
Compatibility, Deprecation, and Migration Plan
...