Status

Current state: Under Discussion (SOLR-15694)

Motivation

So far, Solr has only a single type of node, one that is capable of taking on every kind of task. There are use cases where one would like dedicated nodes for specific types of workloads: for example, a dedicated overseer node, separate data and query nodes, or a node that hosts no data and is used for administrative tasks or for running plugins. Other systems such as Elasticsearch and Vespa have first-class support for node roles.

Proposal

Every node in Solr has one or more “roles”. The following roles are proposed:

  1. “data” role: A node with this role can host replicas. By default, this is the case for all nodes. However, if a node must not host any data, it needs to have a “!data” (or “no data”) role. The ! notation is needed to unset any role that is assumed to be true by default, i.e. the “data” role.
  2. “overseer” role: This role marks a node as a preferred overseer. When one or more such nodes are live, Solr guarantees that one of them becomes the overseer.
  3. “query” role [UPCOMING FEATURE]: A node with this role can receive all queries; it sends out the remote calls to the data-hosting nodes, aggregates the results, and sends them back to the user. This will also be useful for queries based on streaming expressions. See SOLR-15715.
  4. “zookeeper” role [UPCOMING FEATURE]: This role can be associated with nodes that can have embedded ZK nodes. See: https://cwiki.apache.org/confluence/display/SOLR/SIP-14+Embedded+Zookeeper
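
The guarantee stated for the “overseer” role can be illustrated with a small sketch. This is not Solr's election code, which this proposal does not describe; the class name, the method, and the fallback to the first live node when no preferred overseer is live are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration only; not part of Solr. */
final class OverseerChoiceSketch {

  /**
   * Picks an overseer from the live nodes. If any live node carries the
   * "overseer" role, one of those nodes is chosen. Falling back to the
   * first live node otherwise is an assumption, not something the
   * proposal specifies.
   */
  static Optional<String> choose(List<String> liveNodes, Set<String> preferredOverseers) {
    for (String node : liveNodes) {
      if (preferredOverseers.contains(node)) {
        return Optional.of(node); // a live preferred overseer always wins
      }
    }
    return liveNodes.stream().findFirst(); // assumed fallback; empty if no node is live
  }

  public static void main(String[] args) {
    System.out.println(choose(List.of("node1", "node2", "node3"), Set.of("node3")));
  }
}
```

The only property the sketch is meant to capture is that a non-preferred node is never chosen while a preferred one is live.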


Public Interfaces

There will be just one supported way to use the roles functionality:

Startup parameters

Examples:

  1. Data node that can act as overseer too:
        -Dnode.roles=overseer
        -Dnode.roles=overseer,data
  2. Dedicated overseer node: -Dnode.roles=overseer,!data
  3. Dedicated Query node: -Dnode.roles=query,!data
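
As a rough sketch of how the examples above could be interpreted, the following parses a node.roles value into the set of roles that are in effect. The class and method names are hypothetical, and the choice to resolve “!data” by removing “data” from the set (rather than storing “!data” as a role of its own) is an assumption, not something this proposal settles.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Solr. */
final class NodeRolesSketch {

  /** Returns the effective roles for a node.roles value such as "overseer,!data". */
  static Set<String> parse(String property) {
    Set<String> roles = new LinkedHashSet<>();
    roles.add("data"); // "data" is assumed by default for every node
    if (property == null || property.isBlank()) {
      return roles;
    }
    for (String raw : property.split(",")) {
      String token = raw.trim();
      if (token.isEmpty()) {
        continue;
      }
      if (token.startsWith("!")) {
        roles.remove(token.substring(1)); // "!data" unsets a default role
      } else {
        roles.add(token);
      }
    }
    return roles;
  }

  public static void main(String[] args) {
    // Run with, for example: java -Dnode.roles=overseer,!data NodeRolesSketch.java
    System.out.println(parse(System.getProperty("node.roles")));
  }
}
```

Under this reading, -Dnode.roles=overseer and -Dnode.roles=overseer,data produce the same set, which matches the first example above.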

Cluster API

As of today, there are ADDROLE and REMOVEROLE APIs for adding and removing node roles at run time. They support only the overseer role. We propose to deprecate these APIs and recommend that users use startup parameters to achieve the same result. Supporting both ways is tricky and would lead to a lot of confusion among users.

How to Retrieve Roles?

Public API

To read the values, use HTTP GET

...

“!data” : [“node1”, “node2”]

}
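
A minimal sketch of issuing that GET from Java follows. The path /api/cluster/roles is the one named under Compatibility, Deprecation, and Migration Plan; the base URL http://localhost:8983, the Accept header, and the class name are assumptions, and the response is simply printed because its exact shape is not fixed here.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical client sketch; not part of Solr or SolrJ. */
final class RolesRequestSketch {

  /** Builds the GET request for the proposed roles endpoint. */
  static HttpRequest build(String baseUrl) {
    return HttpRequest.newBuilder(URI.create(baseUrl + "/api/cluster/roles"))
        .header("Accept", "application/json") // assumed; the proposal does not name a format
        .GET()
        .build();
  }

  public static void main(String[] args) throws Exception {
    // Assumed default address of a locally running node.
    String baseUrl = args.length > 0 ? args[0] : "http://localhost:8983";
    HttpResponse<String> response =
        HttpClient.newHttpClient().send(build(baseUrl), HttpResponse.BodyHandlers.ofString());
    System.out.println(response.statusCode());
    System.out.println(response.body());
  }
}
```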

Other notes

  • Every time a node starts up with specified roles, it assumes those are the correct roles for that node and publishes them in ZK after successful startup.
  • If a node is assigned the !data role via startup parameter but already hosts replicas, the startup fails with an error (and a hint indicating how to move the replicas away from this node).
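
The second note could be sketched as a startup check along the following lines. The class, the method signature, and the wording of the error message are hypothetical; how a node discovers its local replicas is outside the scope of this sketch.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical startup check for illustration; not part of Solr. */
final class StartupRoleCheckSketch {

  /**
   * Fails if the node declares "!data" while replicas already live on it.
   * declaredRoles holds the raw tokens from the node.roles startup parameter.
   */
  static void validate(String nodeName, Set<String> declaredRoles, List<String> localReplicas) {
    if (declaredRoles.contains("!data") && !localReplicas.isEmpty()) {
      throw new IllegalStateException(
          "Node " + nodeName + " was started with the !data role but hosts "
              + localReplicas.size() + " replica(s): " + localReplicas
              + ". Move them to another node before restarting with this role.");
    }
  }

  public static void main(String[] args) {
    validate("node1", Set.of("overseer", "!data"), List.of()); // passes: no replicas hosted
    System.out.println("startup check passed");
  }
}
```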

Compatibility, Deprecation, and Migration Plan

  • Deprecate the ADDROLE and REMOVEROLE APIs.
  • Add a new V2 API, GET /api/cluster/roles, whose response is keyed by node.
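
To make the difference between a role-keyed and a node-keyed view concrete, here is a small sketch that inverts one into the other. It assumes the role-keyed form maps each role to a list of node names, as in the “!data” fragment under Public API; the class and method names are hypothetical, and the real response format may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical illustration; not part of Solr. */
final class RolesByNodeSketch {

  /** Turns a role -> nodes mapping into a node -> roles mapping. */
  static Map<String, Set<String>> byNode(Map<String, List<String>> byRole) {
    Map<String, Set<String>> result = new TreeMap<>();
    byRole.forEach((role, nodes) -> {
      for (String node : nodes) {
        // Create the node's role set on first sight, then record the role.
        result.computeIfAbsent(node, key -> new TreeSet<>()).add(role);
      }
    });
    return result;
  }

  public static void main(String[] args) {
    Map<String, List<String>> byRole =
        Map.of("!data", List.of("node1", "node2"), "overseer", List.of("node1"));
    System.out.println(byNode(byRole));
  }
}
```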

Major Risks

None

Security considerations

None

Test Plan

Testing should mainly focus on how the nodes behave when roles are added to and removed from the nodes.

Rejected Alternatives

There is no proper alternative today. 

...