...

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

Zookeeper has advanced low-level primitives for coordinating distributed systems – ephemeral nodes, key-value storage, watchers. Such concepts may not be available in other consensus frameworks. At the same time, such low-level primitives (especially ephemeral nodes) are error-prone and are often a cause of subtle bugs in Kafka coordination code.

...

Below each module is presented by its interface.

(NOTE: The initial version of the interfaces is in Scala to keep it cleaner and shorter. The final version (the actual Kafka interfaces) is planned to be written in Java.)

 

Code Block: Group Membership Protocol (Scala)

/**
 * A connector for group membership protocol. Supports two modes:
 * 1) "joining" (becoming the member, leaving the group, subscribing to change notifications)
 * 2) "observing" (fetching group state, subscribing to change notifications)
 */
trait GroupMembershipClient {
  /**
   * Each instance of this class is tightly coupled with exactly one group,
   * once set (during initialization) cannot be changed
   * @return unique group identifier among all application groups
   */
  def group: String

  /**
   * Become a member of this group. Throw an exception in case of ID conflict
   * @param id unique member identifier among members of this group
   * @param data supplemental data to be stored along with member ID
   */
  def join(id: String, data: String): Unit

  /**
   * Stop membership in this group
   * @param id unique member identifier among members of this group
   */
  def leave(id: String): Unit

  /**
   * Fetch membership of this group
   * @return IDs of the members of this group
   */
  def membershipList(): Set[String]

  /**
   * Fetch detailed membership of this group
   * @return IDs and corresponding supplemental data of the members of this group
   */
  def membership(): Map[String, String]

  /**
   * Register permanent on group change listener.
   * There is no guarantee listener will be fired on ALL events (due to session reconnects etc)
   * @param listener see [[GroupChangeListener]]
   */
  def addListener(listener: GroupChangeListener): Unit

  /**
   * Deregister on group change listener
   * @param listener see [[GroupChangeListener]]
   */
  def removeListener(listener: GroupChangeListener): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code -
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}

/**
 * A callback fired on group change event
 */
trait GroupChangeListener {
  /**
   * Event fired when the group membership has changed (member(s) joined and/or left)
   * @param membership new membership of the group
   */
  def onGroupChange(membership: Set[String]): Unit
}
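As an illustration of the contract above, here is a minimal single-process sketch of the group membership interface. The class name InMemoryGroupMembershipClient, the choice of IllegalStateException for ID conflicts, and the omission of init/close are assumptions made for this example only; they are not part of the proposal.

```scala
import scala.collection.mutable

// Trimmed-down copies of the proposed interfaces (init/close omitted for brevity).
trait GroupChangeListener {
  def onGroupChange(membership: Set[String]): Unit
}

trait GroupMembershipClient {
  def group: String
  def join(id: String, data: String): Unit
  def leave(id: String): Unit
  def membershipList(): Set[String]
  def membership(): Map[String, String]
  def addListener(listener: GroupChangeListener): Unit
  def removeListener(listener: GroupChangeListener): Unit
}

// Hypothetical in-memory implementation, for illustration only.
class InMemoryGroupMembershipClient(val group: String) extends GroupMembershipClient {
  private val members = mutable.Map.empty[String, String]
  private val listeners = mutable.LinkedHashSet.empty[GroupChangeListener]

  def join(id: String, data: String): Unit = synchronized {
    // The interface asks for an exception on ID conflict; the exception type is an assumption.
    if (members.contains(id))
      throw new IllegalStateException(s"Member '$id' already joined group '$group'")
    members(id) = data
    notifyListeners()
  }

  def leave(id: String): Unit = synchronized {
    // Only notify when the membership actually changed.
    if (members.remove(id).isDefined) notifyListeners()
  }

  def membershipList(): Set[String] = synchronized { members.keySet.toSet }

  def membership(): Map[String, String] = synchronized { members.toMap }

  def addListener(listener: GroupChangeListener): Unit = synchronized { listeners += listener }

  def removeListener(listener: GroupChangeListener): Unit = synchronized { listeners -= listener }

  // Hands every registered listener an immutable snapshot of the current member IDs.
  private def notifyListeners(): Unit = {
    val snapshot: Set[String] = members.keySet.toSet
    listeners.foreach(_.onGroupChange(snapshot))
  }
}
```

In this sketch every successful join or leave hands listeners the full new member set, and a second join with an already used ID throws without notifying anyone.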
Code Block: Leader Election (Scala)

/**
 * A connector for leadership election protocol. Supports two modes:
 * 1) "running for election" (joining the candidates for leadership, resigning as a leader, subscribing to change notifications)
 * 2) "observing" (getting current leader, subscribing to change notifications)
 */
trait LeaderElectionClient {
  /**
   * Each instance of this class is tightly coupled with leadership over exactly one service (resource),
   * once set (during initialization) cannot be changed
   *
   * @return unique service identifier among all application services (resources)
   */
  def service: String

  /**
   * Get current leader of the resource (if any)
   * @return the leader id if it exists
   */
  def getLeader: Option[String]

  /**
   * Make this candidate eligible for leader election and try to obtain leadership for this service if it's vacant
   *
   * @param candidate ID of the candidate which is eligible for election
   * @return true if given candidate is now a leader
   */
  def nominate(candidate: String): Boolean

  /**
   * Voluntarily resign as a leader and initiate new leader election.
   * It's a client responsibility to stop all leader duties before calling this method to avoid more-than-one-leader cases
   *
   * @param leader current leader ID (will be ignored if not a leader)
   */
  def resign(leader: String): Unit

  /**
   * Register permanent on leader change listener
   * There is no guarantee listener will be fired on ALL events (due to session reconnects etc)
   * @param listener see [[LeaderChangeListener]]
   */
  def addListener(listener: LeaderChangeListener): Unit

  /**
   * Deregister on leader change listener
   * @param listener see [[LeaderChangeListener]]
   */
  def removeListener(listener: LeaderChangeListener): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code -
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}

/**
 * A callback fired on leader change event
 */
trait LeaderChangeListener {
  /**
   * Event fired when the leader has changed (resigned or acquired a leadership)
   * @param leader new leader for the given service if one has been elected, otherwise None
   */
  def onLeaderChange(leader: Option[String]): Unit
}
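The nominate/resign semantics above can be sketched with a hypothetical single-process implementation. The class name InMemoryLeaderElectionClient and the first-nominated-first-elected succession order are assumptions of this sketch; the proposal itself does not prescribe how the next leader is chosen, and init/close are left out.

```scala
import scala.collection.mutable

// Trimmed-down copies of the proposed interfaces (init/close omitted for brevity).
trait LeaderChangeListener {
  def onLeaderChange(leader: Option[String]): Unit
}

trait LeaderElectionClient {
  def service: String
  def getLeader: Option[String]
  def nominate(candidate: String): Boolean
  def resign(leader: String): Unit
  def addListener(listener: LeaderChangeListener): Unit
  def removeListener(listener: LeaderChangeListener): Unit
}

// Hypothetical in-memory implementation, for illustration only.
class InMemoryLeaderElectionClient(val service: String) extends LeaderElectionClient {
  private var currentLeader: Option[String] = None
  // Candidates waiting for leadership, in nomination order (an assumption of this sketch).
  private val waiting = mutable.Queue.empty[String]
  private val listeners = mutable.LinkedHashSet.empty[LeaderChangeListener]

  def getLeader: Option[String] = synchronized { currentLeader }

  def nominate(candidate: String): Boolean = synchronized {
    if (currentLeader.isEmpty) {
      // Leadership is vacant: the candidate obtains it immediately.
      currentLeader = Some(candidate)
      notifyListeners()
    } else if (!currentLeader.contains(candidate) && !waiting.contains(candidate)) {
      waiting += candidate
    }
    currentLeader.contains(candidate)
  }

  def resign(leader: String): Unit = synchronized {
    // Ignored when the caller is not the current leader, as the interface comment states.
    if (currentLeader.contains(leader)) {
      currentLeader = if (waiting.nonEmpty) Some(waiting.dequeue()) else None
      notifyListeners()
    }
  }

  def addListener(listener: LeaderChangeListener): Unit = synchronized { listeners += listener }

  def removeListener(listener: LeaderChangeListener): Unit = synchronized { listeners -= listener }

  private def notifyListeners(): Unit = listeners.foreach(_.onLeaderChange(currentLeader))
}
```

A caller that gets false from nominate stays in the waiting queue and learns about a later promotion through its LeaderChangeListener rather than by polling.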
Code Block: Storage (Scala)

import scala.concurrent.Future

/**
 * Interface to a (persistent) key value storage
 */
trait Storage {
  /**
   * Get data by its key
   * @param key data ID in this storage
   * @return future result of the value (if exists) associated with the key
   */
  def fetch(key: String): Future[Option[String]]

  /**
   * Persist value with its associated key. The contract is to throw an exception
   * if such key already exists
   *
   * @param key data ID in this storage
   * @param data value associated with the key
   */
  def put(key: String, data: String): Unit

  /**
   * Update value by its associated key. The contract is to throw an exception
   * if such key doesn't exist
   *
   * @param key data ID in this storage
   * @param data value associated with the key
   */
  def update(key: String, data: String): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code -
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}
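To make the put/update contract concrete, the following is a small in-memory sketch of the storage interface. The class name InMemoryStorage and the specific exception types are assumptions for this example; the proposal only says that an exception is thrown. Because the data is local, fetch returns an already completed future.

```scala
import scala.collection.mutable
import scala.concurrent.Future

// Trimmed-down copy of the proposed interface (init/close omitted for brevity).
trait Storage {
  def fetch(key: String): Future[Option[String]]
  def put(key: String, data: String): Unit
  def update(key: String, data: String): Unit
}

// Hypothetical in-memory implementation, for illustration only.
class InMemoryStorage extends Storage {
  private val entries = mutable.Map.empty[String, String]

  // No I/O is involved here, so the future is completed at the time it is returned.
  def fetch(key: String): Future[Option[String]] = synchronized {
    Future.successful(entries.get(key))
  }

  def put(key: String, data: String): Unit = synchronized {
    // put is create-only: an existing key is an error (exception type is an assumption).
    if (entries.contains(key))
      throw new IllegalStateException(s"Key '$key' already exists")
    entries(key) = data
  }

  def update(key: String, data: String): Unit = synchronized {
    // update is modify-only: a missing key is an error (exception type is an assumption).
    if (!entries.contains(key))
      throw new NoSuchElementException(s"Key '$key' doesn't exist")
    entries(key) = data
  }
}
```

Keeping put and update separate, as the interface does, lets callers distinguish "create" from "modify" without a prior fetch.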
Code Block: Listener Registry (Scala)

/**
 * A registry for async data change notifications
 */
trait ListenerRegistry {
  /**
   * Register permanent callback for data change event
   * @param key the listenable data identifier
   * @param eventListener see [[ValueChangeListener]]
   */
  def addValueChangeListener(key: String, eventListener: ValueChangeListener): Unit

  /**
   * Deregister permanent callback for data change event
   * @param key the listenable data identifier
   * @param eventListener see [[ValueChangeListener]]
   */
  def removeValueChangeListener(key: String, eventListener: ValueChangeListener): Unit

  /**
   * Register permanent callback for key-set change event
   * @param namespace the listenable key-set identifier (e.g. parent path in Zookeeper, table name in Database etc)
   * @param eventListener see [[KeySetChangeListener]]
   */
  def addKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit

  /**
   * Deregister permanent callback for key-set change event
   * @param namespace the listenable key-set identifier (e.g. parent path in Zookeeper, table name in Database etc)
   * @param eventListener see [[KeySetChangeListener]]
   */
  def removeKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit

  /**
   * Setup everything needed for concrete implementation
   * @param context TBD. Should be abstract enough to be used by different implementations and
   *                at the same time specific because will be uniformly called from the Kafka code,
   *                regardless of the implementation
   */
  def init(context: Any): Unit

  /**
   * Release all acquired resources
   */
  def close(): Unit
}

/**
 * Callback on value change event
 */
trait ValueChangeListener {
  def valueChanged(newValue: Option[String]): Unit
}

/**
 * Callback on key-set change event
 */
trait KeySetChangeListener {
  def keySetChanged(newKeySet: Set[String]): Unit
}
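The registry interface only covers registration, so how events reach it is left to each implementation. The sketch below assumes a hypothetical in-memory registry with two extra methods, fireValueChanged and fireKeySetChanged, that a backing store would call when data changes. Those two methods and the class name are inventions for this example and are not part of the proposed interface; init/close are omitted.

```scala
import scala.collection.mutable

// Trimmed-down copies of the proposed interfaces (init/close omitted for brevity).
trait ValueChangeListener { def valueChanged(newValue: Option[String]): Unit }
trait KeySetChangeListener { def keySetChanged(newKeySet: Set[String]): Unit }

trait ListenerRegistry {
  def addValueChangeListener(key: String, eventListener: ValueChangeListener): Unit
  def removeValueChangeListener(key: String, eventListener: ValueChangeListener): Unit
  def addKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit
  def removeKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit
}

// Hypothetical in-memory implementation, for illustration only.
class InMemoryListenerRegistry extends ListenerRegistry {
  private val valueListeners =
    mutable.Map.empty[String, mutable.LinkedHashSet[ValueChangeListener]]
  private val keySetListeners =
    mutable.Map.empty[String, mutable.LinkedHashSet[KeySetChangeListener]]

  def addValueChangeListener(key: String, eventListener: ValueChangeListener): Unit = synchronized {
    valueListeners.getOrElseUpdate(key, mutable.LinkedHashSet.empty[ValueChangeListener]) += eventListener
  }

  def removeValueChangeListener(key: String, eventListener: ValueChangeListener): Unit = synchronized {
    valueListeners.get(key).foreach(set => set -= eventListener)
  }

  def addKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit = synchronized {
    keySetListeners.getOrElseUpdate(namespace, mutable.LinkedHashSet.empty[KeySetChangeListener]) += eventListener
  }

  def removeKeySetChangeListener(namespace: String, eventListener: KeySetChangeListener): Unit = synchronized {
    keySetListeners.get(namespace).foreach(set => set -= eventListener)
  }

  // Hypothetical hook: called by the backing store when the value under `key` changes or is deleted.
  def fireValueChanged(key: String, newValue: Option[String]): Unit = synchronized {
    for (set <- valueListeners.get(key); listener <- set) listener.valueChanged(newValue)
  }

  // Hypothetical hook: called by the backing store when the key set under `namespace` changes.
  def fireKeySetChanged(namespace: String, newKeySet: Set[String]): Unit = synchronized {
    for (set <- keySetListeners.get(namespace); listener <- set) listener.keySetChanged(newKeySet)
  }
}
```

Listeners are scoped to the exact key or namespace they registered for, so a change under another key does not reach them.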

 

Compatibility, Deprecation, and Migration Plan

The shared interface for pluggable consensus and metadata storage systems should be compatible with a Zookeeper-based implementation. This implementation will also likely be the default one.

As part of this KIP, some system and replication tools will need to be reworked. It will no longer be possible to rely on Zookeeper as the default metadata storage system, nor to use it to trigger particular administrative commands. Most of the tools are related to topic management (create topics, reassign partitions, etc.) and consumer group management (offset checker, etc.).

The approach to topic tools is covered in KIP-4: all administrative logic will move to the brokers. KIP-4 is currently under development, and its Wire Protocol changes have been agreed on.

The consumer group tools should be covered separately. Having the new Java consumer in the 0.9 release with a server-side coordinator may let us deprecate the old consumer and thus all tools related to it. Consumer group tools should work as usual if brokers are run with the Zookeeper-based implementation of the shared interface. The use of zookeeper.connect in server.properties will still be honored. In the code we should have the default value of the new configuration.

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.

...