...

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

Zookeeper has advanced low-level primitives for coordinating distributed systems: ephemeral nodes, key-value storage, watchers. Such primitives may not be available in other consensus frameworks. At the same time, such low-level primitives (especially ephemeral nodes) are error-prone and are a frequent cause of subtle bugs in Kafka coordination code.

...

Each module is presented below by its interface.

(NOTE: The initial version of the interfaces is in Scala to keep it cleaner and shorter. The final version (the actual Kafka interfaces) is planned to be written in Java.)

 

Group Membership Protocol (Scala)
/**
 * A connector for group membership protocol. Supports two modes:
 * 1) "joining" (becoming the member, leaving the group, subscribing to change notifications)
 * 2) "observing" (fetching group state, subscribing to change notifications)
 *
 */
trait GroupMembershipClient {
  /**
   * Each instance of this class is tightly coupled with exactly one group,
   * once set (during initialization) cannot be changed
   * @return unique group identifier among all application groups
   */
  def group: String

  /**
   * Become a member of this group. Throw an exception in case of ID conflict
   * @param id unique member identifier among members of this group
   * @param data supplemental data to be stored along with member ID
   */
  def join(id: String, data: String): Unit

  /**
   * Stop membership in this group
   * @param id unique member identifier among members of this group
   */
  def leave(id: String): Unit

  /**
   * Fetch membership of this group
   * @return IDs of the members of this group
   */
  def membershipList(): Set[String]

  /**
   * Fetch detailed membership of this group
   * @return IDs and corresponding supplemental data of the members of this group
   */
  def membership(): Map[String, String]

  /**
   * Register a permanent group change listener.
   * There is no guarantee the listener will be fired on ALL events (due to session reconnects etc)
   * @param listener see [[GroupChangeListener]]
   */
  def addListener(listener: GroupChangeListener): Unit

  /**
   * Deregister a group change listener
   * @param listener see [[GroupChangeListener]]
   */
  def removeListener(listener: GroupChangeListener): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code -
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}

/**
 * A callback fired on group change event
 */
trait GroupChangeListener {
    /**
     * Event fired when the group membership has changed (member(s) joined and/or left)
     * @param membership new membership of the group
     */
    def onGroupChange(membership: Set[String]): Unit
}
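
To make the join/leave/notify contract above concrete, here is a minimal single-process sketch. The class name `InMemoryGroupMembership` is hypothetical and not part of the proposal; it mirrors the method names of `GroupMembershipClient` without extending it, omits `init`/`close`, and repeats `GroupChangeListener` only so the block compiles on its own. It assumes listeners are notified synchronously with the full member ID set after every change, which a real consensus-backed implementation would not guarantee.

```scala
import scala.collection.mutable

// Copy of the callback interface above, repeated so the sketch is self-contained.
trait GroupChangeListener {
  def onGroupChange(membership: Set[String]): Unit
}

// Hypothetical in-memory stand-in for a GroupMembershipClient implementation.
class InMemoryGroupMembership(val group: String) {
  private val members = mutable.Map.empty[String, String]
  private val listeners = mutable.LinkedHashSet.empty[GroupChangeListener]

  // Throws on ID conflict, as the join() contract requires.
  def join(id: String, data: String): Unit = synchronized {
    if (members.contains(id))
      throw new IllegalStateException(s"Member '$id' already exists in group '$group'")
    members(id) = data
    notifyListeners()
  }

  // Leaving with an unknown ID is treated as a no-op (an assumption of this sketch).
  def leave(id: String): Unit = synchronized {
    if (members.remove(id).isDefined) notifyListeners()
  }

  def membershipList(): Set[String] = synchronized { members.keySet.toSet }

  def membership(): Map[String, String] = synchronized { members.toMap }

  def addListener(listener: GroupChangeListener): Unit = synchronized { listeners += listener }

  def removeListener(listener: GroupChangeListener): Unit = synchronized { listeners -= listener }

  // Called only while the lock is held; passes an immutable snapshot of member IDs.
  private def notifyListeners(): Unit = {
    val snapshot = members.keySet.toSet
    listeners.foreach(_.onGroupChange(snapshot))
  }
}
```

A caller would construct one instance per group, register a listener, and then call `join` and `leave`; a second `join` with the same ID fails with an exception instead of overwriting the stored data.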
Leader Election (Scala)
/**
 * A connector for leadership election protocol. Supports two modes:
 * 1) "running for election" (joining the candidates for leadership, resigning as a leader, subscribing to change notifications)
 * 2) "observing" (getting current leader, subscribing to change notifications)
 *
 */
trait LeaderElectionClient {
  /**
   * Each instance of this class is tightly coupled with leadership over exactly one service (resource),
   * once set (during initialization) cannot be changed
   *
   * @return unique group identifier among all application services (resources)
   */
  def service: String

  /**
   * Get current leader of the resource (if any)
   * @return the leader id if it exists
   */
  def getLeader: Option[String]

  /**
   * Make this candidate eligible for leader election and try to obtain leadership for this service if it's vacant
   *
   * @param candidate ID of the candidate which is eligible for election
   * @return true if given candidate is now a leader
   */
  def nominate(candidate: String): Boolean

  /**
   * Voluntarily resign as a leader and initiate new leader election.
   * It's a client responsibility to stop all leader duties before calling this method to avoid more-than-one-leader cases
   *
   * @param leader current leader ID (will be ignored if not a leader)
   */
  def resign(leader: String): Unit

  /**
   * Register a permanent leader change listener
   * There is no guarantee the listener will be fired on ALL events (due to session reconnects etc)
   * @param listener see [[LeaderChangeListener]]
   */
  def addListener(listener: LeaderChangeListener): Unit

  /**
   * Deregister a leader change listener
   * @param listener see [[LeaderChangeListener]]
   */
  def removeListener(listener: LeaderChangeListener): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code -
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}

/**
 * A callback fired on leader change event
 */
trait LeaderChangeListener {
    /**
     * Event fired when the leader has changed (resigned or acquired a leadership)
     * @param leader new leader for the given service if one has been elected, otherwise None
     */
    def onLeaderChange(leader: Option[String]): Unit
}
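
The nominate/resign semantics described above can be illustrated with a small single-process sketch. `InMemoryLeaderElection` is a hypothetical name, not part of the proposal, and the class mirrors the `LeaderElectionClient` method names without `init`/`close`. Two behaviours are assumptions of this sketch because the interface does not pin them down: a resigning leader is removed from the candidate set, and the next leader is the earliest-nominated remaining candidate.

```scala
import scala.collection.mutable

// Copy of the callback interface above, repeated so the sketch is self-contained.
trait LeaderChangeListener {
  def onLeaderChange(leader: Option[String]): Unit
}

// Hypothetical in-memory stand-in for a LeaderElectionClient implementation.
class InMemoryLeaderElection(val service: String) {
  // LinkedHashSet keeps nomination order, used here as the election order (assumption).
  private val candidates = mutable.LinkedHashSet.empty[String]
  private var leader: Option[String] = None
  private val listeners = mutable.LinkedHashSet.empty[LeaderChangeListener]

  def getLeader: Option[String] = synchronized { leader }

  // Registers the candidate and takes leadership only if it is currently vacant.
  def nominate(candidate: String): Boolean = synchronized {
    candidates += candidate
    if (leader.isEmpty) setLeader(Some(candidate))
    leader.contains(candidate)
  }

  // Ignored when the caller is not the current leader, as the contract states.
  def resign(leaderId: String): Unit = synchronized {
    if (leader.contains(leaderId)) {
      candidates -= leaderId
      setLeader(candidates.headOption)
    }
  }

  def addListener(listener: LeaderChangeListener): Unit = synchronized { listeners += listener }

  def removeListener(listener: LeaderChangeListener): Unit = synchronized { listeners -= listener }

  // Called only while the lock is held.
  private def setLeader(next: Option[String]): Unit = {
    leader = next
    listeners.foreach(_.onLeaderChange(next))
  }
}
```

With two nominated candidates, the first one becomes leader; when it resigns, the second takes over and listeners receive the new leader ID, or `None` if no candidates remain.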
Storage (Scala)
import scala.concurrent.Future

/**
 * Interface to a (persistent) key value storage
 */
trait Storage {
  /**
   * Get data by its key
   * @param key data ID in this storage
   * @return future result of the value (if exists) associated with the key
   */
  def fetch(key: String): Future[Option[String]]

  /**
   * Persist value with its associated key. The contract is to throw an exception
   * if such key already exists
   *
   * @param key data ID in this storage
   * @param data value associated with the key
   */
  def put(key: String, data: String): Unit

  /**
   * Update value by its associated key. The contract is to throw an exception
   * if such key doesn't exist
   *
   * @param key data ID in this storage
   * @param data value associated with the key
   */
  def update(key: String, data: String): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code -
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}
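
The difference between the `put` and `update` contracts (fail if the key exists versus fail if it does not) is easy to get wrong, so here is a minimal sketch of both. `InMemoryStorage` is a hypothetical name, not part of the proposal; it mirrors the `Storage` method names without `init`/`close` and is backed by the standard library's `TrieMap`. The exception type is an assumption, since the interface only says that an exception is thrown.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.Future

// Hypothetical in-memory stand-in for a Storage implementation.
class InMemoryStorage {
  private val data = TrieMap.empty[String, String]

  // The value is already in memory, so the future is completed immediately.
  def fetch(key: String): Future[Option[String]] =
    Future.successful(data.get(key))

  // putIfAbsent returns the previous value, so Some(_) means the key already existed.
  def put(key: String, value: String): Unit =
    if (data.putIfAbsent(key, value).isDefined)
      throw new IllegalStateException(s"Key '$key' already exists")

  // replace only succeeds for an existing key and returns None otherwise.
  def update(key: String, value: String): Unit =
    if (data.replace(key, value).isEmpty)
      throw new IllegalStateException(s"Key '$key' does not exist")
}
```

Both checks are single atomic operations on the map, so two concurrent `put` calls for the same key cannot both succeed. A Zookeeper-backed implementation would need an equivalent atomic create or conditional set to honor the same contract.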
Listener Registry (Scala)
/**
 * A registry for async data change notifications
 */
trait ListenerRegistry {
  /**
   * Register permanent callback for data change event
   * @param key the listenable data identifier
   * @param eventListener see [[ValueChangeListener]]
   */
  def addValueChangeListener(key: String, eventListener: ValueChangeListener): Unit

  /**
   * Deregister permanent callback for data change event
   * @param key the listenable data identifier
   * @param eventListener see [[ValueChangeListener]]
   */
  def removeValueChangeListener(key: String, eventListener: ValueChangeListener): Unit
 
  /**
   * Register permanent callback for key-set change event
   * @param namespace the listenable key-set identifier (e.g. parent path in Zookeeper, table name in Database etc)
   * @param eventListener see [[KeySetChangeListener]]
   */
  def addKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit

  /**
   * Deregister permanent callback for key-set change event
   * @param namespace the listenable key-set identifier (e.g. parent path in Zookeeper, table name in Database etc)
   * @param eventListener see [[KeySetChangeListener]]
   */
  def removeKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code,
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}

/**
 * Callback on value change event
 */
trait ValueChangeListener {
  def valueChanged(newValue: Option[String]): Unit
}

/**
 * Callback on key-set change event
 */
trait KeySetChangeListener {
  def keySetChanged(newKeySet: Set[String]): Unit
}
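
As an illustration of how the two listener kinds differ, the sketch below keeps value listeners per key and key-set listeners per namespace. `InMemoryListenerRegistry` and the two `fire...` methods are hypothetical and not part of the proposal: they stand in for whatever mechanism a real backend (for example Zookeeper watchers) would use to detect changes. The callback traits are repeated only so the block compiles on its own.

```scala
import scala.collection.mutable

// Copies of the callback interfaces above, repeated so the sketch is self-contained.
trait ValueChangeListener { def valueChanged(newValue: Option[String]): Unit }
trait KeySetChangeListener { def keySetChanged(newKeySet: Set[String]): Unit }

// Hypothetical in-memory stand-in for a ListenerRegistry implementation.
class InMemoryListenerRegistry {
  private val valueListeners = mutable.Map.empty[String, Set[ValueChangeListener]]
  private val keySetListeners = mutable.Map.empty[String, Set[KeySetChangeListener]]

  def addValueChangeListener(key: String, l: ValueChangeListener): Unit = synchronized {
    valueListeners(key) = valueListeners.getOrElse(key, Set.empty[ValueChangeListener]) + l
  }

  def removeValueChangeListener(key: String, l: ValueChangeListener): Unit = synchronized {
    valueListeners(key) = valueListeners.getOrElse(key, Set.empty[ValueChangeListener]) - l
  }

  def addKeySetChangeListener(namespace: String, l: KeySetChangeListener): Unit = synchronized {
    keySetListeners(namespace) = keySetListeners.getOrElse(namespace, Set.empty[KeySetChangeListener]) + l
  }

  def removeKeySetChangeListener(namespace: String, l: KeySetChangeListener): Unit = synchronized {
    keySetListeners(namespace) = keySetListeners.getOrElse(namespace, Set.empty[KeySetChangeListener]) - l
  }

  // Hypothetical hook: the backing store would call this when the value under `key` changes.
  def fireValueChanged(key: String, newValue: Option[String]): Unit = {
    val targets = synchronized { valueListeners.getOrElse(key, Set.empty[ValueChangeListener]) }
    targets.foreach(_.valueChanged(newValue))
  }

  // Hypothetical hook: called when keys are added to or removed from `namespace`.
  def fireKeySetChanged(namespace: String, newKeySet: Set[String]): Unit = {
    val targets = synchronized { keySetListeners.getOrElse(namespace, Set.empty[KeySetChangeListener]) }
    targets.foreach(_.keySetChanged(newKeySet))
  }
}
```

Listeners are invoked outside the lock on a snapshot of the registered set, so a callback may add or remove listeners without deadlocking. Listeners stay registered until explicitly removed, matching the "permanent callback" wording of the interface.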

 

Compatibility, Deprecation, and Migration Plan

The shared interface for pluggable consensus and metadata storage systems should be compatible with a Zookeeper-based implementation. This implementation will likely be the default one.

As part of this KIP, some system and replication tools will have to be reworked. It will no longer be possible to rely on Zookeeper as the default metadata storage system, nor to use it to trigger particular administrative commands. Most of the affected tools are related to topic management (create topics, reassign partitions etc) and consumer group management (offset checker etc).

The approach to topic tools is covered in KIP-4: all administrative logic will move to the brokers. KIP-4 is currently under development and its Wire Protocol changes have been agreed.

The consumer group tools should be covered separately. Having the New Java Consumer with a server-side coordinator in the 0.9 release may let us deprecate the old consumer and thus all tools related to it. Consumer group tools should work as usual if brokers are run with the Zookeeper-based implementation of the shared interface. The use of zookeeper.connect in server.properties will still be honored. In the code we should have the default value of the new configuration.

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.

...