Status

Current state: In Progress

Discussion thread: "[DISCUSS] SIP-16: Polish and Prepare v2 APIs for v1 Deprecation"

JIRA: SOLR-15734

Released: N/A

Motivation

Solr's current API situation is frustrating for users and developers alike.

Its primary API, the "v1" API, has grown organically over several decades.  From a user's perspective, the v1 API is often powerful and flexible, but is rarely intuitive and lacks cohesion and consistency, making it difficult for new users to adopt.  It's also less-than-ideal from a developer perspective, as API inputs and outputs both take the form of loosely-typed maps and "NamedLists" that make it difficult to understand what parameters an API accepts and what format it outputs. 

These limitations were acknowledged in 2015 when a new "v2" API was proposed and later introduced in Solr 6.5.  But after its initial introduction the v2 effort lost momentum and soon fell behind v1, which continued to grow and change.  Partially responsible were the duplication needed to maintain separate v1 and v2 declarations of the same functionality, and the lack of any tests or enforcement ensuring that changes to v1 were reflected in v2.  Today, v2 "coverage" is patchy at best, with many v1 APIs having no v2 equivalent.  What's more, industry best-practices and conventions around API design have progressed considerably in the intervening years, leaving even the v2 APIs in need of a refresh to be intuitive to today's user.

This pain is also felt by developers, who for the better part of a decade now have been supporting 2 disjoint API "surface areas", the frameworks behind each of those, etc.

This SIP aims to address these user and developer pain-points by putting forward a plan to: 

  1. Further modernize our v2 API to make it more coherent and intuitive, and
  2. Achieve and enforce ongoing v2 parity with v1, in order to ultimately...
  3. Deprecate the v1 API.

Observant readers might recognize (1) and (2) from JIRA and mailing list discussions held over the past year.  There's already been a certain amount of consensus and enthusiasm generated around the approaches for each of these, and initial work on them is underway.  Though of course, it's never too late to change course based on any feedback and discussion generated through this SIP. 

Proposed Changes

This SIP proposes a handful of related changes to Solr's APIs and related functionality.  These changes will be discussed as they relate to the 3 overarching goals described in the "Motivation" section above.  The SIP can be considered finished when each of these three goals has been met.

One note of clarification: this SIP isn't proposing new APIs or syntax for updates, queries, or any of Solr's other DSLs.  Query syntax is a complex topic, and one that's not taken up here.

Further modernize our v2 API to make it more coherent and intuitive

Prior discussions around the v2 API proposed a concrete set of changes that would bring some additional consistency to the v2 API.  These changes are outlined in the "Google Sheet" found here.  Of particular note are any cells highlighted in green (which indicate changes to the current v2 API) or red (which indicate places where no v2 API exists).  SOLR-15781 has been created as an umbrella ticket to implement this functionality.

At a high level, the net-effect of these changes is to organize the v2 API more fully around the "REST"-ful paradigm.  Why REST?  To quote from SOLR-15781:

REST in particular has come up a good bit in previous discussions as being more in keeping with best practices today, and being easier for users to pick up: both big wins for Solr. REST-ful APIs are also much easier to create OpenAPI documentation for, which opens some interesting doors for us like allowing us to have a Swagger UI exposing most/all endpoints, or even auto-generating request/response libraries for Java and other languages.

The Solr community has already given the v2 APIs an "experimental" designation, exempting them from any backcompat restrictions which would normally apply.  This greatly simplifies the work involved, and some progress has already been made in implementing the changes described on SOLR-15781 and its children.

Achieve and enforce ongoing v2 parity with v1

One cause at the root of the v2/v1 parity gap was that there was no mechanism in the build or in the structure of the code itself that prompted (or, more bluntly, forced) developers to keep the API definitions in sync.  The "meat" of the API logic lived in the v1 RequestHandler, and any v2 APIs existed primarily as a shim that reformatted inputs and called into the RequestHandler internally.  So any developer adding a new parameter or making some similar change was forced to update the v1 code because that's where "stuff actually happened". But nothing kept the v2 shim in sync.

This SIP proposes we reverse this situation and move the actual functionality underlying each API from the v1 RequestHandler and into the v2 API definition.  This would prevent v2 APIs from falling any further behind their v1 counterparts, as the v1 codepath would be forced to consume the v2 code under the hood.  It also has the nice side-effect of splitting up Solr's RequestHandler classes into smaller, more maintainable pieces.  In many instances, this refactoring effort is eased by migrating the v2 definition to use JAX-RS annotations when defining its API, instead of the custom annotations created for Solr's longstanding v2 framework.
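
A minimal sketch of this inversion is shown below.  The class names (`V1CollectionsHandler`, `ListCollectionsApi`, `ListCollectionsResponse`) are hypothetical and not taken from Solr's codebase, and an in-memory list stands in for real cluster state.  The logic lives in a typed v2 class, while the v1 entry point only translates loosely-typed parameters and delegates.  In Solr the v2 class would carry JAX-RS annotations; here they appear only in comments (with an illustrative path) so the example compiles without extra dependencies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical v1 entry point: keeps the loosely-typed map contract but holds no logic of its own.
class V1CollectionsHandler {
  private final ListCollectionsApi v2Api;

  V1CollectionsHandler(ListCollectionsApi v2Api) {
    this.v2Api = v2Api;
  }

  // Translates v1-style params into a v2 call, then flattens the typed response into a map.
  Map<String, Object> handleRequest(Map<String, String> params) {
    String action = params.get("action");
    if (!"LIST".equals(action)) {
      throw new IllegalArgumentException("Unsupported action: " + action);
    }
    ListCollectionsResponse typed = v2Api.listCollections();
    Map<String, Object> rsp = new LinkedHashMap<>();
    rsp.put("collections", typed.collections);
    return rsp;
  }

  public static void main(String[] args) {
    ListCollectionsApi api = new ListCollectionsApi(List.of("techproducts", "films"));
    V1CollectionsHandler v1 = new V1CollectionsHandler(api);
    // The v1 codepath can only produce output by going through the v2 class.
    System.out.println(v1.handleRequest(Map.of("action", "LIST")));
  }
}

// Hypothetical strongly-typed v2 response, replacing a loosely-typed NamedList.
class ListCollectionsResponse {
  final List<String> collections;

  ListCollectionsResponse(List<String> collections) {
    this.collections = Collections.unmodifiableList(collections);
  }
}

// Hypothetical v2 API class that owns the actual functionality.
// With JAX-RS this might be annotated @Path("/collections") on the class
// and @GET on the method (illustrative path, not confirmed by this SIP).
class ListCollectionsApi {
  private final List<String> clusterCollections; // stand-in for real cluster state

  ListCollectionsApi(List<String> clusterCollections) {
    this.clusterCollections = clusterCollections;
  }

  ListCollectionsResponse listCollections() {
    List<String> sorted = new ArrayList<>(clusterCollections);
    Collections.sort(sorted);
    return new ListCollectionsResponse(sorted);
  }
}
```

Because `V1CollectionsHandler` has no way to answer a request except by calling `ListCollectionsApi`, any new parameter or behavior has to be added on the v2 side first, which is the enforcement property this section is after.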

Like the v2 "modernization" discussed above these ideas all pre-exist this SIP and some work has already been done on them: finding and adding the "missing" v2 APIs has been discussed in SOLR-15737, moving API functionality over to the v2 side to enforce parity has been discussed on SOLR-15736 (and the dev@ mailing list), and bulk migration of v2 APIs over to JAX-RS definitions is discussed in SOLR-16370.

Deprecate the v1 API

 The goals above serve as prerequisites for deprecating the v1 API, but they're not the only changes needed to pave the way towards deprecation.  Deprecating the v1 API will require the following changes in addition to the goals above:

  1. All code maintained by the Solr community should be switched over to use the v2 API.  This includes the Admin UI (SOLR-15752), "bin/solr" and other scripts shipped with Solr, and of course Solr itself.
    1. Other community-owned components that aim to support a range of Solr versions simultaneously (e.g. the Solr Operator, some "sandbox" components) will also need to be updated to use the v2 API.  But it likely makes sense to deprecate v1 prior to switching these over to v2, due to the version-skew they operate under.
  2. The ref-guide and all other community-owned documentation should be updated to use v2 API syntax primarily. (SOLR-11646)
  3. SolrJ should provide an additional SolrRequest implementation for each v2 API (SOLR-15735)
    1. v2 APIs defined using the JAX-RS annotations can be detected and used to build an "OpenAPI spec" using the recently added "./gradlew resolve" command.  We should be able to use this spec in our build to auto-generate v2 SolrRequest implementations (SOLR-16380)

  4. Extend Solr's "paramsets" to allow setting request-body parameters in addition to query-parameters.  This is needed to allow PUT/POST based v2 APIs to make use of paramsets.
  5. Decide whether to keep or jettison the RequestHandler abstraction.  This, in my mind, is the biggest remaining question surrounding this SIP: does the RequestHandler abstraction provide any value, or make sense in a v2-only world, or should it be deprecated alongside Solr's v1 APIs?  There are at least two concerns and questions to weigh: 
    1. Customization: Their integration into solrconfig.xml allows RequestHandlers to be customized/configured in a way that v2 APIs currently can't be.  The Solr community needs to decide if this is functionality we want to carry forward into v2, and what it should look like (e.g. continue using the `<requestHandler>` tag?  create a new tag for v2?)
    2. Metrics: Request metrics are tracked at a RequestHandler granularity.  For example, the list-collections and split-shard APIs share the same request handler, and thus contribute to the same 'errors', 'totalTime', etc. metrics.  Further, the name used by these metrics includes the name/path of the request handler.  Do we want to preserve these for Solr's v2 APIs or have them reported at a different granularity and under different names?
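
To make the metrics question concrete, here is a rough sketch contrasting the two granularities.  The `RequestMetricsSketch` class and the per-API key names are hypothetical and only illustrate the trade-off; they are not Solr's metrics API or naming scheme.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical request counter keyed by an arbitrary metric name.
class RequestMetricsSketch {
  private final Map<String, Long> requestCounts = new TreeMap<>();

  // Adds one request to the counter stored under the given key.
  void record(String metricKey) {
    requestCounts.merge(metricKey, 1L, Long::sum);
  }

  // Returns the number of requests recorded under the key (0 if none).
  long count(String metricKey) {
    return requestCounts.getOrDefault(metricKey, 0L);
  }

  public static void main(String[] args) {
    // Handler granularity: list-collections and split-shard both land on
    // the one key derived from their shared request handler's path.
    RequestMetricsSketch perHandler = new RequestMetricsSketch();
    perHandler.record("/admin/collections"); // a list-collections call
    perHandler.record("/admin/collections"); // a split-shard call
    System.out.println(perHandler.count("/admin/collections"));

    // API granularity (assumed key names): each API gets its own counter.
    RequestMetricsSketch perApi = new RequestMetricsSketch();
    perApi.record("collections.list");
    perApi.record("collections.splitshard");
    System.out.println(perApi.count("collections.list"));
  }
}
```

With handler-level keys the two calls are indistinguishable after the fact; with API-level keys they stay separate, at the cost of changing metric names that existing dashboards may depend on.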

Only when all of these steps have been completed will we be ready to announce the v1 APIs as deprecated and this SIP as "finished".

Public Interfaces

Almost every aspect of the changes discussed in this SIP involve one public interface or another.  Many of these have already been discussed at length, but to enumerate different "types" of changes again here:

  1. Modification of existing v2 APIs
  2. Addition of net-new v2 APIs to fill current gaps
  3. Deprecation of all existing v1 APIs
  4. Introduction of new SolrRequest implementations which use v2 APIs internally
  5. [Maybe?] Addition of new solrconfig.xml syntax to supersede current `<requestHandler>` functionality
  6. [Maybe?] Metrics changes related to the v2 APIs

Compatibility, Deprecation, and Migration Plan

Many of the individual changes proposed by this SIP are immune from any particular migration or compatibility concerns.  The "API modernization", "reaching parity with v1", and potential "solrconfig.xml exposure" chunks are immune by virtue of the "experimental" designation previously given to the v2 API.  The ref-guide changes are similarly immune by virtue of being, well, doc changes.  And the "v2 SolrRequest implementation" changes are also immune by virtue of being purely additive.

These are the "easy parts" - at least as far as compatibility and migration are concerned.  This SIP proposes that all of these "immune" pieces be completed first, before starting on any of the other dependent pieces.  For simplicity in describing migration/compatibility concerns below, the term "v2-API-Complete" will be used to describe the first Solr release containing all of the "immune" work described above.

Somewhat trickier are the portions of this SIP that involve switching community-maintained code over to use the "final" v2 APIs, and handling user upgrades.  The concerns for each of these are treated below:

  • Admin UI: The current Admin UI makes API calls to the hosting node exclusively, so it should be immune from any complications.
  • bin/solr and Other Scripts: As far as I can tell, the Solr community hasn't been explicit anywhere about the compatibility that users should expect from `bin/solr` and other scripts.  The command-line syntax of the scripts themselves seems to aim for backcompat within a given major release line.  But there's no clear statement anywhere about whether the scripts that ship with 9.1 should be able to work with 9.2 or 9.0 for example.  Solr's standard backcompat policy would imply "yes", but the scripts do seem like they might be a reasonable exception to that, since they ship alongside each and every distribution of Solr.  This SIP proposes that we announce/codify the scripts as an "exception" in this way and treat them as "immune" from compatibility concerns for essentially the same reason as the Admin UI.
  • Solr Operator: Each Solr Operator release targets a particular range of Solr versions.  So the Solr Operator should remain using the v1 API until the minimum version it targets reaches the first "v2-API-Complete" version of Solr.
  • Internal Solr Usage: Solr's standard backcompat policy traditionally allows users to upgrade from any release in the current major line directly to any other release in the subsequent major line.  But there have been exceptions to this in the past.  This SIP proposes creating another of those exceptions here: asking/requiring users to upgrade to an interim version ("v2-API-Complete") before upgrading to a version that uses v2 internally.
    • The first upgrade to "v2-API-Complete" is safe because neither the "start" nor the "end" version would use the v2 API internally.  The second upgrade to "v2-API-used-internally" is also safe because the same v2 APIs are available on both the "start" and "end" versions.

An Example

An exact ordering of all these changes might be more easily understood via the sequence described below:

  1. Tomorrow, Acme Corp deploys two fresh 9.1 clusters on their servers.
  2. Much time passes, and after much development effort, v2 <-> v1 parity is reached on `main` and `branch_9x`, and the API modernization effort winds down.
  3. A month later, 9.5.0 is released, making it the first "v2-API-Complete" version of Solr.
  4. The Admin UI and `bin/solr` scripts are switched over to use v2 on `main` and `branch_9x`.
  5. Time passes: 9.5.1, and 9.6 releases come and go.
  6. `main` and `branch_9x` have finally been updated to use the v2 APIs internally.
  7. 9.7 is released, making it the first "v2-API-used-internally" version of Solr.  The v1 API appears as deprecated.
  8. Acme Corp wants to upgrade one of their 9.1 clusters to Solr 9.7.  To do so, they first upgrade to the latest "v2-API-Complete" version (9.6.0).  Once that upgrade is complete, they're able to take the final step and upgrade their cluster (now 9.6.0) to 9.7.0.
  9. Big news!  Solr 10 is released!
  10. Acme Corp wants to upgrade their remaining 9.1 cluster to Solr 10.  To do so, they take the same intermediate step they took for their previous 9.7 upgrade.  They first upgrade to 9.6.0, and then when that is finished they perform the major upgrade to 10.0.0.
  11. With 10.0 out, Solr Operator starts to shrink its compatibility on the 9.x line and eventually sets the minimum version to 9.5.0.
  12. Operator devs upgrade the operator to use the v2 API.
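
The upgrade rule that this example walks through can be sketched as a small planner.  The `UpgradePlanner` class and its method names are hypothetical, and the version thresholds (9.5 as the first "v2-API-Complete" release, 9.6 as the latest one, 9.7 as the first "v2-API-used-internally" release) come from the illustrative timeline above, not from any real release plan.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner for the two-step upgrade rule; versions are {major, minor}.
class UpgradePlanner {
  // Thresholds taken from the illustrative timeline, not from a real release plan.
  static final int[] FIRST_V2_COMPLETE = {9, 5};
  static final int[] LATEST_V2_COMPLETE = {9, 6};
  static final int[] FIRST_V2_INTERNAL = {9, 7};

  // Orders versions by major number, then by minor number.
  static int compare(int[] a, int[] b) {
    return a[0] != b[0] ? Integer.compare(a[0], b[0]) : Integer.compare(a[1], b[1]);
  }

  static String format(int[] version) {
    return version[0] + "." + version[1];
  }

  // Returns the versions to upgrade through, in order, ending with the target.
  static List<String> plan(int[] from, int[] to) {
    List<String> hops = new ArrayList<>();
    boolean startLacksFinalV2 = compare(from, FIRST_V2_COMPLETE) < 0;
    boolean targetUsesV2Internally = compare(to, FIRST_V2_INTERNAL) >= 0;
    // An interim stop is needed only when the starting version predates the
    // final v2 APIs and the target version relies on them internally.
    if (startLacksFinalV2 && targetUsesV2Internally) {
      hops.add(format(LATEST_V2_COMPLETE));
    }
    hops.add(format(to));
    return hops;
  }

  public static void main(String[] args) {
    System.out.println(plan(new int[] {9, 1}, new int[] {9, 7}));  // [9.6, 9.7]
    System.out.println(plan(new int[] {9, 1}, new int[] {10, 0})); // [9.6, 10.0]
    System.out.println(plan(new int[] {9, 1}, new int[] {9, 6}));  // [9.6]
    System.out.println(plan(new int[] {9, 5}, new int[] {9, 7}));  // [9.7]
  }
}
```

The first two calls correspond to steps 8 and 10 of the example; the last two show that upgrades which start from, or stop at, a "v2-API-Complete" version need no interim hop.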

Rejected Alternatives

Several aspects of this SIP could be done differently, but were rejected for various reasons.  These aren't alternatives to the SIP as a whole per-se, but they did seem relevant to call out here.

  • The two-step-upgrade requirement this SIP would place on users could be alleviated by aligning "v2-API-Complete" and "v2-API-used-internally" along major version lines.  That is: since Solr only supports upgrades across a single major version at a time, "v2-API-Complete" and "v2-API-used-internally" could be delayed to occur on different major lines (e.g. "v2-API-Complete" on 10.0 and "v2-API-used-internally" on 11.0).  This would enforce the underlying purpose of the two-step-upgrade implicitly.  The obvious downside of this, and the reason it was rejected, is that it would introduce a huge delay into the v2 API work.  v1 deprecation couldn't happen until 11.0 - likely several years distant - and removal would be even further away, at 12.0.
  • Rather than leaning on the client auto-generation that JAX-RS and OpenAPI provide us to create new SolrRequest implementations, we could attempt to swap out the internals of the existing SolrRequest implementations "in place".  This was rejected for the sheer amount of manual work involved, the difficulty in testing, and for conceptual issues that it would introduce (e.g. some SolrJ classes like "CollectionAdminRequest" don't map cleanly onto the v2 API).