...
It has the following constructs:
Name | Symbol | Example | Explanation |
---|---|---|---|
SHOULD | ~ | ~foo | Explicitly overrides the default operator to enforce SHOULD logic. |
MUST_NOT | ! | !foo | Chosen over '-' to reduce conflicts with hyphenated words. It is not an operator at the end of a token, so it does not conflict with exclamations (Spanish opens an exclamation with an inverted '¡', which is a different character). |
MUST | + | +foo | Similar to the standard query parser. |
ANALYZED_PHRASE | "" | "foo" | Phrase search with synonyms and full analysis. |
LITERAL_PHRASE | '' | 'foo' | Phrase search with reduced analysis (see below for details). |
GROUP | () | (foo bar) | Applies the default operator (or another specified operator) to the terms within the parentheses and causes them to be treated as a unit. |
DISTANCE | n/#() | n/3(foo bar) | Specifies a span query where foo and bar occur (in either order) within 3 tokens of each other. |
ORDERED_DISTANCE | w/#() | w/4(foo bar) | Specifies a span query where foo and bar occur within 4 tokens of each other, with foo occurring before bar. |
PREFIX | * | foo* | Specifies a prefix search matching any token starting with 'foo'. The default settings require at least 3 prefix characters. |
FIELD | : | title:foo | Searches the title field for foo. |
RANGE | :[ TO ] | votes:[0 TO 10} | Typical Lucene range searches on text, date, or numeric data. Inclusive and exclusive bounds are supported as in the standard parser. |
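The rationale given for MUST_NOT (the symbol only counts at the start of a token) can be illustrated with a small sketch. This is not the parser's actual implementation; the function name `classify` and the `DEFAULT` label are hypothetical and exist only for this illustration.

```python
# Hypothetical illustration of leading-symbol operators; not the real parser.
PREFIX_OPERATORS = {"+": "MUST", "!": "MUST_NOT", "~": "SHOULD"}


def classify(token):
    """Return (occurrence, term) for one whitespace-delimited token.

    A symbol is treated as an operator only when it is the first character
    of the token, so hyphens inside words and trailing '!' are left alone.
    """
    if len(token) > 1 and token[0] in PREFIX_OPERATORS:
        return PREFIX_OPERATORS[token[0]], token[1:]
    # No leading operator: the default operator decides how the term is used.
    return "DEFAULT", token


if __name__ == "__main__":
    for tok in ["+foo", "!foo", "~foo", "well-known", "wow!", "¡hola!"]:
        print(tok, classify(tok))
```

Under these assumptions, `well-known`, `wow!`, and `¡hola!` all pass through as ordinary terms, which is the conflict the choice of '!' over '-' is meant to avoid.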
Several elements of other syntaxes are intentionally omitted:
...
One of the major goals of this parser is to enable a configuration that can apply synonyms to punctuated constructs that have significance to the user but are typically destroyed by the existing parsers. An example configuration of a field type to achieve this (anticipating the use of this parser) looks like this:

    <fieldType name="text_aqp" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.PatternTypingFilterFactory" patternFile="patterns.txt"/>
        <filter class="solr.TokenAnalyzerFilterFactory" asType="text_general" preserveType="true"/>
        <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__" synFlagsMask="0" ignore="word"/>
        <filter class="solr.DropIfFlaggedFilterFactory" dropFlags="2"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/> <!-- query parser already handles splitting -->
        <filter class="solr.PatternTypingFilterFactory" patternFile="patterns.txt"/>
        <filter class="solr.TokenAnalyzerFilterFactory" asType="text_en_aqp" preserveType="true"/>
        <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__" synFlagsMask="0" ignore="word"/>
        <filter class="solr.DropIfFlaggedFilterFactory" dropFlags="2"/>
      </analyzer>
    </fieldType>

    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
    </fieldType>

    <fieldType name="text_general_lit" class="solr.TextField" positionIncrementGap="100" multiValued="true">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    ---- patterns.txt ----
    2 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
    2 (\d+)\(?([a-z])\)?\(?(\d+)\)? ::: legal3_$1_$2_$3
    2 C\+\+ ::: c_plus_plus

There's a lot to unpack there, so starting from the top:
...
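The patterns.txt rules in the configuration above can be read as "flags, regex, type template". The following Python sketch mimics that mapping so the effect on tokens such as `401(k)` is easier to see. It is an approximation, not Solr's code: it assumes each regex must match the whole token, that the first matching rule wins, and that unmatched tokens keep the type `word` (the value the configuration tells the synonym filter to ignore). The names `RULES`, `type_token`, and `type_synonym` are hypothetical.

```python
import re

# Rules transcribed from patterns.txt: (flags, regex, type template).
# Java-style $1 group references are written as \1 for Python.
RULES = [
    (2, r"(\d+)\(?([a-z])\)?", r"legal2_\1_\2"),
    (2, r"(\d+)\(?([a-z])\)?\(?(\d+)\)?", r"legal3_\1_\2_\3"),
    (2, r"C\+\+", "c_plus_plus"),
]

DEFAULT_TYPE = "word"  # assumed type of tokens that match no rule
PREFIX = "__TAS__"     # prefix configured on the TypeAsSynonym filter


def type_token(token):
    """Return (type, flags) for a token; assumes whole-token, first-match-wins."""
    for flags, pattern, template in RULES:
        match = re.fullmatch(pattern, token)
        if match:
            return match.expand(template), flags
    return DEFAULT_TYPE, 0


def type_synonym(token):
    """Return the prefixed type that would be emitted as a synonym, or None."""
    token_type, _ = type_token(token)
    if token_type == DEFAULT_TYPE:  # mirrors ignore="word" in the config
        return None
    return PREFIX + token_type


if __name__ == "__main__":
    for tok in ["401(k)", "401k", "501(c)(3)", "C++", "hello"]:
        print(tok, type_token(tok), type_synonym(tok))
```

In this sketch `401(k)` and `401k` receive the same type, so both punctuated and unpunctuated forms end up sharing one synonym token, which is the goal the configuration is described as serving.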