Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Gliffy Diagram
nameFlow New
pagePin6

Data Injection:

In the context of this Test framework an input stream means a stream that Samza job may consume from and an output stream means a stream that Samza job can produce to. The initial source of data stream is always supposed to be bounded since this is a test, and that may originate as one of the following sources listed below.

...

Shown below is a targeted test matrix plan to test various components of Samza in p2.

nnumber of threads

x = number of partitions 

m = number of containers

n
Samza APIConcurrencyPartitionsContainerExpected Result
StreamTask
task.max.concurrency = 1
job.container.thread.pool.size = n
1 <= p <= nx1 / nmin order processing
   
task.max.concurrency > 1
job.container.thread.pool.size = n
1 <= p <= nx1 / nmout of order processing
   
AsyncStreamTask
task.max.concurrency = 11 <= p <= nx1 / nmin order processing
task.max.concurrency = n1 <= p <= nx1 / nmout of order processing
Windowable TaskN/A1 <= p <= nx1 / nmexpecting processing n messages for messages window of time t
Initiable TaskN/A1 <= p <= nx1 / nmstateful testing (assertions on kv store)
Closable TaskN/A1 <= p <= nx1 / nmverify closing the client?

Map / Flatmap / Filter /

Partition By / Merge /

SendTo / Sink / Join /

Window

 

task.max.concurrency = 1

job.container.thread.pool.size = n

1 <= p <= nx 1 / min order processing

task.max.concurrency > 1

job.container.thread.pool.size = n

out of order processing

...