07 December 2021, Apache Lucene™ 9.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.

This release contains numerous features, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.0 Release Highlights

...

  • Support for indexing high-dimensionality numeric vectors to perform nearest-neighbor search, using the Hierarchical Navigable Small World graph algorithm.

  • New Analyzers for Serbian, Nepali, and Tamil languages
  • IME-friendly autosuggest for Japanese
  • Snowball 2, adding Hindi, Indonesian, Nepali, Serbian, Tamil, and Yiddish stemmers
  • New normalization/stemming for Swedish and Norwegian

Optimizations

  • Up to 400% faster taxonomy faceting
  • 10-15% faster indexing of multi-dimensional points
  • Several times faster sorting on fields that are indexed with points. This optimization used to be an opt-in in late 8.x releases and is now opt-out as of 9.0
  • Lucene 9 takes advantage of Java VarHandles, introduced in Java 9, to speed up indexing and some queries.
  • Lucene now enforces that a field has the same schema across all documents in order to enable optimizations that take advantage of the index by default.
  • ConcurrentMergeScheduler now assumes fast I/O, likely improving indexing speed in cases where heuristics would incorrectly detect whether the system had fast modern I/O or not.
  • Encoding of postings lists changed from FOR-delta to PFOR-delta to save further disk space

Other

  • File formats have all been changed from big-endian order to little endian order
  • Lucene 9 no longer has split packages. This required renaming some packages outside of the lucene-core JAR, so you will need to adjust some imports accordingly. See https.
  • Using Lucene 9 with the module system should be considered experimental. We expect to make progress on this in future 9.x releases.
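
To make two of the points above concrete (the switch to little-endian file formats and the use of Java VarHandles), here is a minimal sketch that reads and writes a little-endian int in a byte array through a VarHandle. It is not Lucene's own code: the class name EndianSketch and its helper methods are hypothetical, and only the standard java.lang.invoke API from Java 9 is assumed.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical illustration only; not taken from the Lucene code base.
public class EndianSketch {
    // VarHandles that view a byte[] as ints in a given byte order (Java 9+).
    private static final VarHandle LE_INT =
        MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);
    private static final VarHandle BE_INT =
        MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.BIG_ENDIAN);

    // Writes value at the given byte offset, least significant byte first.
    static void writeIntLE(byte[] buf, int offset, int value) {
        LE_INT.set(buf, offset, value);
    }

    // Reads four bytes at the given byte offset as a little-endian int.
    static int readIntLE(byte[] buf, int offset) {
        return (int) LE_INT.get(buf, offset);
    }

    // Reads the same four bytes as a big-endian int, for comparison.
    static int readIntBE(byte[] buf, int offset) {
        return (int) BE_INT.get(buf, offset);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        writeIntLE(buf, 0, 0x01020304);
        // Byte layout in little-endian order: 04 03 02 01
        System.out.printf("%02x %02x %02x %02x%n", buf[0], buf[1], buf[2], buf[3]);
        System.out.println(Integer.toHexString(readIntLE(buf, 0))); // 1020304
        System.out.println(Integer.toHexString(readIntBE(buf, 0))); // 4030201
    }
}
```

Reading the bytes with the wrong byte order yields a different value, which is why an index written in one order cannot simply be reinterpreted in the other.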

Further details of changes are available in the change log at https://lucene.apache.org/core/9_0_0/changes/Changes.html and in the migration guide at https://lucene.apache.org/core/9_0_0/MIGRATE.html.

Please report any feedback to the mailing lists (http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation now uses a content distribution network (CDN) for distributing releases. We think it's unlikely, but the system is new, and it is possible that some glitch may cause the mirror you are using to lack the new release. If that is the case, please try another mirror. This also applies to Maven access.