
...

Note that the E2E mechanics, notably the reliance on the master for ACK delivery, are being broken out into a separate discussion and are out of scope for this document. That said, it may be worth spending time to at least discuss whether or not we want to deal with E2E mode in the context of a new master. It may be more work to rearchitect the master while preserving the current ACK mechanism than to simply address ACK delivery concurrently. This should be discussed prior to beginning development so that we have a clear path through implementation.

The primary JIRA tracking this work is https://issues.cloudera.org/browse/FLUME-617

High Level Options

There are a few possible approaches with varying degrees of effort and functionality.

  1. Allow all masters to remain active, pushing all state into ZK so it's shared between them. Clients retain the list of all possible masters and pick one at random to connect to. Deal with E2E ACKs by pushing them into ZK.

...

  1. Have masters go through an election process, push all state into ZK so it's shared between them in the case of failure. Clients no longer contain the list of masters and instead contain the ZK quorum node list. The current master is fetched from ZK. Deal with E2E ACKs by pushing them into ZK so in the case of master failover, no ACKs are lost.

...

  1. Have masters go through an election process, push all state into ZK so it's shared between them in the case of failure. Clients no longer contain the list of masters and instead contain the ZK quorum node list. The current master is fetched from ZK. Deal with E2E ACKs by simply letting them expire and be retransmitted in the case of master failover.
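To make the election-based options above more concrete, here is a minimal sketch of the lookup a client might perform. It is not Flume code and does not use the ZooKeeper client API: the class MasterElectionSketch, its methods, and the host names are all hypothetical, and an in-memory sorted map stands in for a ZK election path. The assumption is the common "lowest sequence number wins" election recipe, where each master registers an entry that disappears when its session is lost.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/**
 * Hypothetical in-memory stand-in for a ZK election path.
 * Illustration only: this is NOT the ZooKeeper client API.
 */
public class MasterElectionSketch {
    // Candidates keyed by sequence number, mimicking ephemeral sequential nodes.
    private final TreeMap<Long, String> candidates = new TreeMap<>();
    private long nextSequence = 0;

    /** A master joins the election and is handed its sequence number. */
    public synchronized long join(String masterAddress) {
        long sequence = nextSequence++;
        candidates.put(sequence, masterAddress);
        return sequence;
    }

    /** Simulates a master failing: its entry is removed from the election. */
    public synchronized void leave(long sequence) {
        candidates.remove(sequence);
    }

    /** What a client would fetch: the surviving candidate with the lowest sequence. */
    public synchronized Optional<String> currentMaster() {
        Map.Entry<Long, String> first = candidates.firstEntry();
        return first == null ? Optional.empty() : Optional.of(first.getValue());
    }

    public static void main(String[] args) {
        MasterElectionSketch election = new MasterElectionSketch();
        // Host names below are made up for the example.
        long a = election.join("master-a.example.com");
        election.join("master-b.example.com");
        System.out.println("Before failover: " + election.currentMaster());
        election.leave(a); // master-a fails; the next candidate takes over
        System.out.println("After failover: " + election.currentMaster());
    }
}
```

In this sketch clients never hold a list of masters; they only ask the shared registry who the current master is, which is why a failover needs no client reconfiguration. How outstanding E2E ACKs are handled across that failover (stored in ZK, or left to expire and be retransmitted) is the difference between the two election-based options and is not modeled here.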