Status

Current state: Under Discussion

Discussion thread: https://lists.apache.org/thread.html/r186364d4d22a6301887b54023cb3db48a5324f197590a3b3e95535fd%40%3Cdev.solr.apache.org%3E

JIRA: here (<- link to https://issues.apache.org/jira/browse/SOLR-XXXX)

Released: <Solr Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast). Confluence supports inline comments that can also be used.

Motivation

Many organizations are frustrated with SolrCloud deployments because of the perceived cost of managing a separate, dedicated Apache ZooKeeper ensemble. We can reduce this complexity by running our own embedded ZooKeeper ensemble, built on the embedded server API added in ZOOKEEPER-3874 and released with ZooKeeper 3.7.

This ensemble should be launched automatically by Solr processes and should configure its quorum membership dynamically.

There is some overlap between the motivations of this SIP and SIP-5 Coordination Module + Apache Curator, but the two approaches should be complementary.

Public Interfaces

We will need to create APIs for retrieving quorum status from a Solr node. These may include determining whether the node is serving as part of a quorum, identifying which quorum it is connected to, and getting information about other quorum members (ports, addresses) so that observers can join. We may also need APIs for instructing nodes to join or leave a particular quorum.

The full extent of the necessary APIs is not yet determined.
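
As a purely illustrative sketch of the information such an API might return, the following Java types model a node's quorum status. Every name here (`QuorumStatusSketch`, `Role`, `Member`, `quorumId`) is hypothetical and not part of Solr or ZooKeeper; only the comma-separated `host:port` connect string is standard ZooKeeper client syntax.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical data model for a quorum-status response; not an existing Solr API. */
final class QuorumStatusSketch {

    /** How this Solr node takes part in the embedded ensemble, if at all. */
    enum Role { PARTICIPANT, OBSERVER, NONE }

    /** Address information another node would need in order to connect or join. */
    static final class Member {
        final String host;
        final int clientPort;

        Member(String host, int clientPort) {
            this.host = host;
            this.clientPort = clientPort;
        }
    }

    final Role role;
    final String quorumId;      // hypothetical identifier for "which quorum"
    final List<Member> members;

    QuorumStatusSketch(Role role, String quorumId, List<Member> members) {
        this.role = role;
        this.quorumId = quorumId;
        this.members = members;
    }

    /** True if this node runs a ZK server for the quorum (voting or observing). */
    boolean hostsZkServer() {
        return role != Role.NONE;
    }

    /** Standard ZooKeeper connect string: comma-separated host:port pairs. */
    String connectString() {
        return members.stream()
                .map(m -> m.host + ":" + m.clientPort)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        QuorumStatusSketch status = new QuorumStatusSketch(
                Role.PARTICIPANT,
                "example-quorum",
                Arrays.asList(new Member("solr1", 9983), new Member("solr2", 9983)));
        System.out.println(status.connectString()); // prints solr1:9983,solr2:9983
    }
}
```

A join or depart API would presumably accept a similar member description, but its shape is left open here, as the APIs are not yet determined.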


We will need to expose additional ports from Solr nodes for ZK functionality. These will likely include the ZK secureClientPort, and possibly the serverPort, electionPort, and others.

Proposed Changes

The work would proceed in several phases.

Migrate Unit Tests to use ZooKeeperServerEmbedded (ZKSE)

Currently, our unit tests use a fragile construction for an embedded ZooKeeper. To build confidence in running an embedded ZooKeeper in production settings, we should ensure that our test framework uses the same APIs.

Migrate ZKRun implementation to use ZKSE

When we launch a Solr service in "cloud" mode without specifying a ZooKeeper host to connect to, it launches its own ZooKeeper service on a separate port.

This is the simplest usage of an embedded ZooKeeper server that we currently have. It does not use quorums, and its lifecycle is tied to that of the parent Solr node.
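
A minimal sketch of the single-node case follows. It only builds the configuration that would be handed to ZKSE, so it runs without ZooKeeper on the classpath; the hand-off itself is shown in comments. The helper name `StandaloneZkConfig` is hypothetical, the values are illustrative, and the port offset of 1000 is an assumption modeled on the embedded ZooKeeper conventionally listening at the Solr port plus 1000.

```java
import java.util.Properties;

/** Hypothetical helper: configuration for a single, non-quorum embedded ZooKeeper. */
final class StandaloneZkConfig {

    /** Assumed convention: ZK client port = Solr port + 1000. */
    static final int ZK_PORT_OFFSET = 1000;

    static Properties forSolrPort(int solrPort) {
        Properties props = new Properties();
        // Standard zoo.cfg keys; the values are illustrative.
        props.setProperty("clientPort", String.valueOf(solrPort + ZK_PORT_OFFSET));
        props.setProperty("tickTime", "2000");
        // Turn off ZooKeeper's admin server so it does not claim its default port, 8080.
        props.setProperty("admin.enableServer", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = forSolrPort(8983);
        System.out.println("clientPort=" + props.getProperty("clientPort")); // prints clientPort=9983

        // With ZooKeeper 3.7+ available, the properties would be passed to ZKSE
        // roughly as follows (not compiled here; zkHome is a hypothetical Path):
        //
        //   ZooKeeperServerEmbedded zk = ZooKeeperServerEmbedded.builder()
        //       .baseDir(zkHome)
        //       .configuration(props)
        //       .exitHandler(ExitHandler.LOG_ONLY)
        //       .build();
        //   zk.start();
        //   ...
        //   zk.close(); // when the parent Solr node shuts down
    }
}
```

Using `ExitHandler.LOG_ONLY` in the commented hand-off reflects the lifecycle point above: a failure in the embedded server should be logged rather than terminate the Solr JVM.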

Create an auto-clustering implementation for several ZKRun nodes

This approach may not be feasible for service discovery, but would be the ultimate goal of our efforts.

For example, we would start three Solr nodes each with ZKSE, and instruct all of the ZK servers to form a cluster. There may be ordering issues to resolve here, as well as concerns about service discovery for other Solr nodes.
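
The three-node example can be sketched as configuration generation. The `server.N` entry format (`host:peerPort:electionPort:role;clientPort`) and the `reconfigEnabled` and `standaloneEnabled` keys are standard ZooKeeper configuration; the class and field names, host names, and port offsets are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch: derive a shared quorum configuration for several Solr nodes. */
final class QuorumConfigSketch {

    /** One Solr node that also hosts a ZK server. */
    static final class Node {
        final int id;        // ZooKeeper server id; must also be written to the node's myid file
        final String host;
        final int solrPort;

        Node(int id, String host, int solrPort) {
            this.id = id;
            this.host = host;
            this.solrPort = solrPort;
        }
    }

    // Assumed port layout relative to the Solr port; not an established Solr convention.
    static final int CLIENT_OFFSET = 1000;
    static final int PEER_OFFSET = 1001;
    static final int ELECTION_OFFSET = 1002;

    /** Formats the value of one server.N entry in ZooKeeper's dynamic configuration syntax. */
    static String serverEntry(Node n) {
        return n.host
                + ":" + (n.solrPort + PEER_OFFSET)
                + ":" + (n.solrPort + ELECTION_OFFSET)
                + ":participant;" + (n.solrPort + CLIENT_OFFSET);
    }

    /** Every node must start with the same server list for the quorum to form. */
    static Properties quorumProperties(List<Node> nodes) {
        Properties props = new Properties();
        props.setProperty("tickTime", "2000");
        props.setProperty("initLimit", "10");
        props.setProperty("syncLimit", "5");
        props.setProperty("standaloneEnabled", "false"); // always run in quorum mode
        props.setProperty("reconfigEnabled", "true");    // allow membership changes later
        for (Node n : nodes) {
            props.setProperty("server." + n.id, serverEntry(n));
        }
        return props;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node(1, "solr1", 8983),
                new Node(2, "solr2", 8983),
                new Node(3, "solr3", 8983));
        Properties props = quorumProperties(nodes);
        System.out.println(props.getProperty("server.1")); // prints solr1:9984:9985:participant;9983
    }
}
```

Because the server list has to be known before any ZK server starts, this sketch sidesteps rather than solves the ordering and service discovery concerns: something outside ZooKeeper must still tell each Solr node who its peers are.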

Compatibility, Deprecation, and Migration Plan

  • Existing users will be able to continue to run SolrCloud with an external ZooKeeper quorum.

Security considerations

When running our own ZK services, the security of ZK becomes our responsibility instead of being something that we can delegate. The ZK servers that we start should be secure by default, using the available authentication methods and practices.

Test Plan

 [ TBD ]

Rejected Alternatives

Continue to launch the embedded ZK process the same way that we do now. This is unattractive because it would keep us tied to ZK internals, which are subject to change and are not part of ZooKeeper's public APIs.