...

It has the following constructs:

| Name | Symbol | Example | Explanation |
| SHOULD | ~ | ~foo | Explicitly overrides the default operator to enforce SHOULD logic. |
| MUST_NOT | ! | !foo | Chosen over '-' to reduce conflicts with hyphenated words. It is not an operator at the end of a token, so it does not conflict with exclamations (Spanish puts an inverted exclamation mark in front). |
| MUST | + | +foo | Similar to the standard query parser. |
| ANALYZED_PHRASE | "" | "foo" | Phrase search with full analysis, including synonyms. |
| LITERAL_PHRASE | '' | 'foo' | Phrase search with reduced analysis (see below for details). |
| GROUP | () | (foo bar) | Applies the default operator (or another specified operator) to the terms within the parentheses and causes them to be treated as a unit. |
| DISTANCE | n/#() | n/3(foo bar) | Specifies a span query where foo and bar occur, in either order, within 3 tokens of each other. |
| ORDERED_DISTANCE | w/#() | w/4(foo bar) | Specifies a span query where foo and bar occur within 4 tokens of each other, with foo occurring before bar. |
| PREFIX | * | foo* | Specifies a prefix search matching any token starting with 'foo'. Default settings require at least 3 prefix characters. |
| FIELD | : | title:foo | Searches the title field for foo. |
| RANGE | :[ TO ] or :{ TO } | votes:[0 TO 10} | Typical Lucene range searches on text, date, or numeric data. Inclusive and exclusive bounds are supported as in the standard parser. |
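To make the constructs listed above concrete, here is a small illustrative sketch in Python that maps a single clause to the construct name it would fall under. It is not the parser's implementation: the function name `classify`, the fallback label `TERM`, and the simplification that each clause arrives already separated are all assumptions made for the example.

```python
import re

# n/3(foo bar) or w/4(foo bar): operator letter, distance, grouped terms.
DISTANCE_RE = re.compile(r"([nw])/(\d+)\((.*)\)")

# Single-character operators that prefix a term.
PREFIX_OPERATORS = {"~": "SHOULD", "!": "MUST_NOT", "+": "MUST"}


def classify(clause):
    """Return (construct_name, payload) for one already-separated clause.

    Hypothetical helper for illustration only; "TERM" is an invented label
    for a clause that uses none of the listed constructs.
    """
    m = DISTANCE_RE.fullmatch(clause)
    if m:
        name = "DISTANCE" if m.group(1) == "n" else "ORDERED_DISTANCE"
        return name, (int(m.group(2)), m.group(3).split())
    if clause[:1] in PREFIX_OPERATORS:
        return PREFIX_OPERATORS[clause[0]], clause[1:]
    if len(clause) >= 2 and clause[0] == clause[-1] == '"':
        return "ANALYZED_PHRASE", clause[1:-1]
    if len(clause) >= 2 and clause[0] == clause[-1] == "'":
        return "LITERAL_PHRASE", clause[1:-1]
    if clause.startswith("(") and clause.endswith(")"):
        return "GROUP", clause[1:-1].split()
    if len(clause) > 1 and clause.endswith("*"):
        return "PREFIX", clause[:-1]
    if ":" in clause:
        field, _, value = clause.partition(":")
        kind = "RANGE" if value.startswith(("[", "{")) else "FIELD"
        return kind, (field, value)
    return "TERM", clause


if __name__ == "__main__":
    for example in ["~foo", "!foo", "n/3(foo bar)", "foo*", "votes:[0 TO 10}"]:
        print(example, "->", classify(example))
```

Note that `!` is only checked at the start of the clause, which mirrors the point that a trailing exclamation mark is not treated as an operator.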





Several elements of other syntaxes are intentionally omitted:

...

One of the major goals of this parser is to enable a configuration that can apply synonyms to punctuated constructs that have significance to the user but are typically destroyed by the existing parsers. An example configuration of a field type to achieve this (anticipating the use of this parser) looks like this:

<fieldType name="text_aqp" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PatternTypingFilterFactory" patternFile="patterns.txt"/>
    <filter class="solr.TokenAnalyzerFilterFactory" asType="text_general" preserveType="true"/>
    <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__" synFlagsMask="0" ignore="word"/>
    <filter class="solr.DropIfFlaggedFilterFactory" dropFlags="2"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/> <!-- query parser already handles splitting -->
    <filter class="solr.PatternTypingFilterFactory" patternFile="patterns.txt"/>
    <filter class="solr.TokenAnalyzerFilterFactory" asType="text_en_aqp" preserveType="true"/>
    <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__" synFlagsMask="0" ignore="word"/>
    <filter class="solr.DropIfFlaggedFilterFactory" dropFlags="2"/>
  </analyzer>
</fieldType>
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_general_lit" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

---- patterns.txt ----

...

 

2 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
2 (\d+)\(?([a-z])\)?\(?(\d+)\)? ::: legal3_$1_$2_$3
2 C\+\+ ::: c_plus_plus
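
As a rough illustration of what the patterns.txt rules are meant to do, the following Python sketch applies the three rules to a few punctuated tokens. It rests on several assumptions that are not confirmed here: that the leading number on each line is the flag value later checked by `dropFlags="2"`, that a rule applies only when it matches the whole token, and that the first matching rule wins. The helper names `parse_rule` and `type_for` are hypothetical, and Python's `re` module stands in for the Java regex engine the real filter would use.

```python
import re

# The three rules from patterns.txt, copied verbatim.
PATTERN_LINES = [
    r"2 (\d+)\(?([a-z])\)? ::: legal2_$1_$2",
    r"2 (\d+)\(?([a-z])\)?\(?(\d+)\)? ::: legal3_$1_$2_$3",
    r"2 C\+\+ ::: c_plus_plus",
]


def parse_rule(line):
    """Split 'flags pattern ::: template' into its three parts."""
    left, template = line.split(" ::: ", 1)
    flags, pattern = left.split(" ", 1)
    return int(flags), re.compile(pattern), template


def type_for(token, rules):
    """Return (type_name, flags) for the first rule matching the whole token.

    Assumption: whole-token matching and first-match-wins; the real filter
    may behave differently.
    """
    for flags, regex, template in rules:
        m = regex.fullmatch(token)
        if m:
            # Substitute $1, $2, ... in the template with the captured groups.
            type_name = re.sub(
                r"\$(\d+)", lambda ref: m.group(int(ref.group(1))), template
            )
            return type_name, flags
    return None


RULES = [parse_rule(line) for line in PATTERN_LINES]

if __name__ == "__main__":
    for token in ["401(k)", "501(c)(3)", "C++", "foo"]:
        print(token, "->", type_for(token, RULES))
```

Under these assumptions, `401(k)` and `401k` both map to the type `legal2_401_k`, which is what would let a single synonym entry cover both spellings once the type is emitted as a token.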


There's a lot to unpack there, so starting from the top:

...