...

Lucene 9.8 Release Highlights

Optimizations

  • Faster computation of top-k hits on boolean queries. Lucene's nightly benchmarks report a 20%-30% speedup for disjunctive queries and an 11%-13% speedup for conjunctive queries since Lucene 9.7. Disjunctive queries with many and/or high-frequency terms should see even higher speedups.
  • Faster computation of top-k hits when sorting by field. Lucene's nightly benchmarks report speedups between 7% and 33% since 9.7 depending on the type and cardinality of the field that is used for sorting.
  • Faster counts on disjunctive queries by taking advantage of the new LeafCollector#collect(DocIdStream) API. Nightly benchmarks report a ~2.5x speedup since 9.7.
  • Faster indexing of numeric doc values when index sorting is turned on.
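
To illustrate why a bulk collect(DocIdStream) hook can speed up counting, here is a minimal sketch. It does not use the Lucene classes: `DocStream`, `Collector`, `BitSetStream` and `CountingCollector` are hypothetical stand-ins defined only for this example, and the bit-set representation of a window of matches is an assumption. The idea shown is that a collector that only needs a count can ask the stream for its size instead of visiting each document.

```java
import java.util.function.IntConsumer;

public class DocIdStreamSketch {

    /** Hypothetical stand-in for a stream of matching doc IDs (not the Lucene class). */
    interface DocStream {
        void forEach(IntConsumer consumer);

        // Default: count by visiting every doc; implementations may override with something cheaper.
        default int count() {
            int[] n = {0};
            forEach(doc -> n[0]++);
            return n[0];
        }
    }

    /** Hypothetical stand-in for a per-segment collector (not the Lucene interface). */
    interface Collector {
        void collect(int doc);

        // Default bulk path falls back to per-document collection.
        default void collect(DocStream stream) {
            stream.forEach(doc -> collect(doc));
        }
    }

    /** A window of matches stored as a bit set; bit i of the window means doc (base + i) matched. */
    static final class BitSetStream implements DocStream {
        private final long[] words;
        private final int base;

        BitSetStream(long[] words, int base) {
            this.words = words;
            this.base = base;
        }

        @Override
        public void forEach(IntConsumer consumer) {
            for (int i = 0; i < words.length; i++) {
                long bits = words[i];
                while (bits != 0) {
                    int bit = Long.numberOfTrailingZeros(bits);
                    consumer.accept(base + (i << 6) + bit);
                    bits &= bits - 1; // clear the lowest set bit
                }
            }
        }

        @Override
        public int count() {
            // Counting needs one popcount per 64 docs instead of one callback per match.
            int n = 0;
            for (long word : words) {
                n += Long.bitCount(word);
            }
            return n;
        }
    }

    /** A collector that only counts hits, so it can use the bulk count directly. */
    static final class CountingCollector implements Collector {
        int count;

        @Override
        public void collect(int doc) {
            count++;
        }

        @Override
        public void collect(DocStream stream) {
            count += stream.count();
        }
    }

    public static void main(String[] args) {
        DocStream window = new BitSetStream(new long[] {0b1011L, 1L << 63}, 128);
        CountingCollector collector = new CountingCollector();
        collector.collect(window);
        System.out.println("hits in window: " + collector.count);
    }
}
```

Collectors that need each document (for example, to score it) keep the default per-document path in this sketch; only the counting collector overrides the bulk method.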

API Changes

  • Move max vector dims limit to Codec (Mayya Sharipova)

...

  • Introduced LeafCollector#finish, a hook that runs once collection on a leaf has finished.
  • Add `KnnCollector` to `LeafReader` and `KnnVectorsReader` so that custom collection of vector search results can be provided.
    The first custom collector provides `ToParentBlockJoin[Float|Byte]KnnVectorQuery` joining child vector documents with their parent documents.
  • Add support for recursive graph bisection (also called bipartite graph partitioning, often abbreviated BP), an algorithm for
      reordering doc IDs that results in more compact postings and faster queries, especially conjunctions.
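
As a rough illustration of why reordering doc IDs can make postings more compact, the sketch below compares the storage cost of one term's postings under two doc ID assignments. It is not the BP algorithm and not Lucene's postings format: it assumes a simple variable-length byte encoding of the gaps between consecutive doc IDs, and the class name, helper methods and doc ID values are made up for the example.

```java
public class PostingsGapSketch {

    /** Bytes needed to write a non-negative int with 7 data bits per byte. */
    static int vIntBytes(int value) {
        int bytes = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            bytes++;
        }
        return bytes;
    }

    /** Total bytes to store a sorted doc ID list as variable-length gaps. */
    static int encodedSize(int[] sortedDocIds) {
        int total = 0;
        int previous = 0;
        for (int doc : sortedDocIds) {
            total += vIntBytes(doc - previous); // store the gap, not the absolute ID
            previous = doc;
        }
        return total;
    }

    public static void main(String[] args) {
        // The same five documents containing a term, before and after a hypothetical reordering
        // that places similar documents next to each other.
        int[] scattered = {1_000, 250_000, 500_000, 750_000, 999_000};
        int[] clustered = {500_000, 500_001, 500_002, 500_003, 500_004};

        System.out.println("scattered: " + encodedSize(scattered) + " bytes");
        System.out.println("clustered: " + encodedSize(clustered) + " bytes");
    }
}
```

When documents that share terms receive adjacent IDs, the gaps shrink and each one needs fewer bytes, which is the effect a reordering algorithm aims for.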

Optimizations

  • Speed up NumericDocValuesWriter with index sorting.

  • Faster computation of top-k hits on boolean queries.

...