...

Gliffy Diagram: newtimeout


Public Interfaces

This only adds a new producer configuration: 

...

batch.expiry.ms.
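
For illustration, a producer might enable the new bound as follows. This is a minimal sketch that assumes the proposed batch.expiry.ms key is passed as an ordinary string property; the bootstrap server, serializers, and the 30-second value are placeholder choices.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringSerializer");
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringSerializer");
// Proposed config: upper bound on how long a batch may sit in the accumulator.
props.put("batch.expiry.ms", "30000");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);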

Compatibility, Deprecation, and Migration Plan

Path to support current behavior via batch.expiry.ms

...

Public Interfaces

...

Default values
  1. When max.in.flight.requests.per.connection == 1: batch.expiry.ms=MAX_LONG. Batches will stay in the accumulator as long as progress is being made. This also ensures expiration happens in order.
  2. When max.in.flight.requests.per.connection > 1: delivery and notification ordering is not needed/provided, so batch.expiry.ms=request.timeout.ms. (The default resolution is sketched after this list.)
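
The sketch below spells out the default resolution described in the two cases above; the helper name resolveBatchExpiryMs is hypothetical and this is not the actual producer code.

// Hypothetical helper illustrating the dynamic default described above.
static long resolveBatchExpiryMs(Long configuredBatchExpiryMs,
                                 int maxInFlightRequestsPerConnection,
                                 int requestTimeoutMs) {
    if (configuredBatchExpiryMs != null)
        return configuredBatchExpiryMs;           // explicit user setting always wins
    if (maxInFlightRequestsPerConnection == 1)
        return Long.MAX_VALUE;                    // strict ordering: never expire in the accumulator
    return requestTimeoutMs;                      // no ordering guarantee: default to request.timeout.ms
}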

...

Compatibility, Deprecation, and Migration Plan

...

  1. ms. Batches may expire out of order, since batch.expiry.ms will be set dynamically (if not provided explicitly) based on the settings above.
As a consequence, the batch expiration delay is coupled with whether or not strict ordering is desired. This correlation is not obvious, but it makes sense on reflection: when ordering is guaranteed, head-of-the-line blocking may arbitrarily delay the expiration of batches further back in the queue.
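
For context, an application observes an expired batch through the existing send() callback path, where the batch is completed with a TimeoutException; the topic name and error handling below are purely illustrative.

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.TimeoutException;

producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
    if (exception instanceof TimeoutException) {
        // The batch expired (e.g. after batch.expiry.ms in the accumulator) before it was sent.
        System.err.println("Record expired before send: " + exception.getMessage());
    } else if (exception != null) {
        System.err.println("Send failed: " + exception.getMessage());
    }
});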

Validation

This configuration is backwards compatible. TODO: Consider various combinations of timeouts that don't make sense. (E.g., batch.expiry.ms < linger.ms).
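
A sketch of the kind of check that TODO implies is below; the validateTimeouts helper and the specific rule are hypothetical and not part of this proposal.

import org.apache.kafka.common.config.ConfigException;

// Hypothetical sanity check: a batch should not be allowed to linger longer than it is allowed to live.
static void validateTimeouts(long batchExpiryMs, long lingerMs) {
    if (batchExpiryMs < lingerMs)
        throw new ConfigException("batch.expiry.ms (" + batchExpiryMs
                + ") must not be smaller than linger.ms (" + lingerMs + ")");
}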

Test Plan

TBD

Rejected Alternatives

  • Bumping up the request timeout does not work well: it is an artificial way of dealing with the lack of an accumulator timeout, and setting it high increases the time to detect broker failures.
  • In KAFKA-4089 we also considered looking at whether metadata is stale to determine whether to expire. This may address the problem raised in KAFKA-4089, but it is still hard for users to reason about without knowing the producer internals, and it makes it difficult to put an upper bound on the overall timeout.
  • We cannot repurpose max.block.ms since there are use-cases for non-blocking calls to send.
  • We also discussed the idea of providing precise per-record timeouts, or at least per-batch timeouts. These are very difficult to implement correctly, and we believe it is sufficient to give users the ability to determine an upper bound on delivery time (rather than specify it per record). Supporting per-record timeouts precisely is problematic because we would then need the ability to extract records from compressed batches, which is horribly inefficient. The difficulty is even more pronounced when requests are in flight, since batches would need to be extracted out of in-flight requests. If we are to honor the retry backoff settings, this would mean splitting an in-flight request containing an expiring record or batch into smaller requests, which is again horribly inefficient. Given the enormous complexity of implementing such semantics correctly and efficiently, and the limited value to users, we have decided against pursuing this path. The addition of an explicit timeout as summarized in this proposal will at least give users the ability to compute a tight bound on the maximum delay before a record is actually sent out.
  • Allow batch.expiry.ms to span the inflight phase as well. This won't work because a request would contain batches from multiple partitions. One expiring batch should not cause the other batches to expire, and it is too inefficient to surgically remove the expired batch for the subsequent retry.