...

It has the following constructs:

Name	Symbol	Example	Explanation
SHOULD	~	~foo	Explicitly overrides the default operator to enforce SHOULD logic.
MUST_NOT	!	!foo	Chosen over '-' to reduce conflicts with hyphenated words. It is not an operator at the end of a token, so it does not conflict with exclamations (Spanish places an inverted '¡' in front).
MUST	+	+foo	Similar to the standard query parser.
ANALYZED_PHRASE	""	"foo"	Phrase search with full analysis, including synonyms.
LITERAL_PHRASE	''	'foo'	Phrase search with reduced analysis (see below for details).
GROUP	()	(foo bar)	Applies the default operator (or another specified operator) to the terms within the parentheses, and causes them to be considered as a unit.
DISTANCE	n/#()	n/3(foo bar)	Specifies a span query where foo and bar occur (in either order) within 3 tokens of each other.
ORDERED_DISTANCE	w/#()	w/4(foo bar)	Specifies a span query where foo and bar occur within 4 tokens of each other, with foo occurring before bar.
PREFIX	*	foo*	Specifies a prefix search matching any token starting with 'foo'. Default settings require at least 3 prefix characters.
FIELD	:	title:foo	Searches the title field for foo.
RANGE	:[ TO ] :{ TO }	votes:[0 TO 10}	Typical Lucene range searches on text, date, or numeric data. Inclusive and exclusive bounds are supported as in the standard parser.
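
The leading-operator rules in the table can be illustrated with a small sketch. This is a toy classifier written for this page, not the parser's actual implementation: the function names and the "DEFAULT" label are hypothetical, and it only covers SHOULD, MUST_NOT, MUST, and the 3-character minimum for PREFIX.

```python
# Illustrative only: a toy classifier for the leading-operator rules above.
# Names here are hypothetical and do not come from the parser itself.
OPERATORS = {"~": "SHOULD", "!": "MUST_NOT", "+": "MUST"}


def classify_token(token, default="DEFAULT"):
    """Return (occur, term) for a single whitespace-delimited token."""
    # An operator only counts at the start of a token and must be followed
    # by a term, so a trailing '!' (as in "wow!") is left untouched.
    if len(token) > 1 and token[0] in OPERATORS:
        return OPERATORS[token[0]], token[1:]
    return default, token


def is_allowed_prefix(token, min_chars=3):
    """True for a prefix query such as 'foo*' with enough leading characters."""
    return token.endswith("*") and len(token) - 1 >= min_chars


if __name__ == "__main__":
    for tok in ["~foo", "!foo", "+foo", "wow!", "well-known", "foo*", "fo*"]:
        print(tok, classify_token(tok), is_allowed_prefix(tok))
```

Because '-' is not an operator in this syntax, a hyphenated token such as well-known passes through unchanged and falls under the default operator.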

Several elements of other syntaxes are intentionally omitted:

...

One of the major goals of this parser is to enable a configuration that can apply synonyms to punctuated constructs that have significance to the user but are typically destroyed by the existing parsers. An example configuration of a field type to achieve this (anticipating the use of this parser) looks like this:

<fieldType name="text_aqp" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PatternTypingFilterFactory" patternFile="patterns.txt"/>
    <filter class="solr.TokenAnalyzerFilterFactory" asType="text_general" preserveType="true"/>
    <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__" synFlagsMask="0" ignore="word"/>
    <filter class="solr.DropIfFlaggedFilterFactory" dropFlags="2"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/> <!-- query parser already handles splitting -->
    <filter class="solr.PatternTypingFilterFactory" patternFile="patterns.txt"/>
    <filter class="solr.TokenAnalyzerFilterFactory" asType="text_en_aqp" preserveType="true" />
    <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__" synFlagsMask="0" ignore="word"/>
    <filter class="solr.DropIfFlaggedFilterFactory" dropFlags="2"/>
  </analyzer>
</fieldType>
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_general_lit" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

---- patterns.txt ----

...

 

2 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
2 (\d+)\(?([a-z])\)?\(?(\d+)\)? ::: legal3_$1_$2_$3
2 C\+\+ ::: c_plus_plus
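
To make the effect of the patterns.txt rules concrete, here is a minimal Python sketch of how such rules might map a token to a type string. It reads each rule as a single line of the form `flags pattern ::: template`. Several things are assumptions rather than confirmed behavior of PatternTypingFilterFactory: that a pattern must match the whole token, that the first matching rule wins, and that the leading number is a flag value carried along with the type. The helper names (`parse_rules`, `type_for`) are hypothetical, and the Java-style `$1` references are translated to Python's `\g<1>` form.

```python
import re

# The three rules from patterns.txt, one rule per line.
RULES_TEXT = r"""
2 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
2 (\d+)\(?([a-z])\)?\(?(\d+)\)? ::: legal3_$1_$2_$3
2 C\+\+ ::: c_plus_plus
"""


def parse_rules(text):
    """Parse 'flags pattern ::: template' lines into (flags, regex, template)."""
    rules = []
    for line in text.strip().splitlines():
        left, template = line.split(" ::: ", 1)
        flags, pattern = left.split(" ", 1)
        # Translate Java-style $1 group references to Python's \g<1>.
        py_template = re.sub(r"\$(\d+)", r"\\g<\1>", template.strip())
        rules.append((int(flags), re.compile(pattern), py_template))
    return rules


def type_for(token, rules):
    """Return (type, flags) for the first rule matching the whole token, else None."""
    for flags, pattern, template in rules:
        match = pattern.fullmatch(token)  # assumption: whole-token match
        if match:
            return match.expand(template), flags
    return None


if __name__ == "__main__":
    rules = parse_rules(RULES_TEXT)
    for tok in ["401(k)", "401k", "501(c)(3)", "C++", "foo"]:
        print(tok, "->", type_for(tok, rules))
```

Under these assumptions, 401(k) and 401k both map to the same type, legal2_401_k, which is what lets one synonym-style token stand in for either spelling.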

There's a lot to unpack there, so starting from the top:

...
