Status

Current state: "Under Discussion"

Discussion thread: TODO

JIRA:

Released:  

Motivation

Interaction with ZooKeeper is difficult, and it is spread all over the Solr code. This makes the code hard to test: there are currently few options other than integration tests, which are slow, fragile, and less versatile (certain scenarios are difficult to test).

Proposed Changes

The main goal of this change is to allow better testing of the different ZooKeeper interactions related to coordination (leader election, queues, etc.).
Some abstractions already exist for lower-level operations (set-data, get-data, etc.; see DistribStateManager), so the idea is to add a new, related abstraction named CoordinationManager, which would hold higher-level coordination-related classes such as LeaderRunner (Overseer) and LeaderLatch (for shard leaders).
Curator comes into play because refactoring the existing code into these new abstractions would require reworking much of it, so we could consider using Curator instead, a library that has been suggested many times in the past. While I don’t think this is required, it would make this transition and our code simpler, from what I could see; input from people with more Curator experience would be greatly appreciated.

While it is out of scope for this change, if the abstractions/interfaces are designed correctly, this could make it possible in the future to use something other than ZooKeeper for coordination, such as etcd or even an in-memory replacement for tests.
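
To make the idea more concrete, here is a minimal sketch of what such an abstraction and an in-memory replacement for tests might look like. Only the type names CoordinationManager and LeaderLatch come from this proposal; every method name and signature, the InMemoryCoordinationManager class, and the "first participant to start leads" rule are assumptions made for illustration, not a proposed API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical entry point; only the type name comes from the proposal. */
interface CoordinationManager {
  /** Returns a latch that competes for leadership of a group (e.g. a shard). */
  LeaderLatch createLeaderLatch(String group, String participantId);
}

/** Hypothetical shard-leader abstraction; the method names are assumptions. */
interface LeaderLatch extends AutoCloseable {
  /** Joins the election for the group this latch was created for. */
  void start();

  /** Returns true while this participant is the leader of its group. */
  boolean hasLeadership();

  /** Leaves the election, giving up leadership if held. */
  @Override
  void close();
}

/** Assumed in-memory stand-in for tests: the earliest started participant leads. */
final class InMemoryCoordinationManager implements CoordinationManager {
  // Per group, participants in the order they joined; the head is the leader.
  private final Map<String, Deque<String>> participants = new HashMap<>();

  @Override
  public LeaderLatch createLeaderLatch(String group, String participantId) {
    return new LeaderLatch() {
      @Override
      public void start() {
        join(group, participantId);
      }

      @Override
      public boolean hasLeadership() {
        return isLeader(group, participantId);
      }

      @Override
      public void close() {
        leave(group, participantId);
      }
    };
  }

  private synchronized void join(String group, String id) {
    Deque<String> queue = participants.computeIfAbsent(group, g -> new ArrayDeque<>());
    if (!queue.contains(id)) {
      queue.addLast(id);
    }
  }

  private synchronized boolean isLeader(String group, String id) {
    Deque<String> queue = participants.get(group);
    return queue != null && id.equals(queue.peekFirst());
  }

  private synchronized void leave(String group, String id) {
    Deque<String> queue = participants.get(group);
    if (queue != null) {
      queue.remove(id);
    }
  }
}
```

With something along these lines, a unit test could start two latches for the same group, check that only the first one reports leadership, close it, and check that the second one takes over, all without starting ZooKeeper or a Solr server. A ZooKeeper-backed or Curator-backed implementation would sit behind the same interface.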

Public Interfaces

Compatibility, Deprecation, and Migration Plan

WIP

Test Plan

The main goal of this change is to improve testing. The Coordination module should have extensive test coverage with no need to start Solr servers.

Rejected Alternatives

An alternative would be to have this "Coordination" module but not introduce Curator, and instead refactor our existing coordination code into the module. This could be an option if people are opposed to using Curator.
