Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

In order to sustain a higher throughput, you would typically use an asynchronous embedded producer and it should be configured to be in blocking mode (i.e., queue.enqueueTimeout.ms=-1). This recommendation is to ensure that messages will not be lost. Otherwise, the default enqueue timeout of the asynchronous producer is zero which means if the producer's internal queue is full, then messages will be dropped due to QueueFullExceptions. A blocking producer however, will wait if the queue is full, and effectively throttle back the embedded consumer's consumption rate. You can enable trace logging in the producer to observe the remaining queue size over time. If the producer's queue is consistently full, it indicates that the migration tool is bottle-necked on re-publishing messages to the local (mirror) target cluster and/or flushing messages to disk.

...

You can use the --num.producers option to use a producer pool in the mirror maker migration tool to increase throughput. This helps because each producer's requests are effectively handled by a single thread on the receiving Kafka broker. i.e., even if you have multiple consumption streams (see next section), the throughput can be bottle-necked at the handling stage of the migration tool's producer requests.

Number of consumption streams

Use the --num.streams option to specify the number of consumer threads to create. Note that if you start multiple migration tool processes, you may want to look at the distribution of partitions on the source cluster. If the number of consumption streams is too high per migration tool process, then some of the consumer threads will be idle by virtue of the consumer rebalancing algorithm (if they do not end up owning any partitions for consumption).

...