Code Block

public interface LuceneIndexFactory {
  /**
   * Configure the way objects are converted to Lucene documents for this Lucene index.
   * @param luceneSerializer A callback which converts a region value to a
   * Lucene document or documents to be stored in the index.
   * @return this factory, so that calls can be chained
   */
  LuceneIndexFactory setLuceneSerializer(LuceneSerializer luceneSerializer);
}
  
/**
 * An interface for writing the fields of an object into a Lucene document.
 * The region key will be added as a field to the returned documents.
 */
public interface LuceneSerializer {
  /**
   * @param index lucene index
   * @param value user object to be serialized into index
   * @return the documents to be stored in the index
   */
  Collection<Document> toDocuments(LuceneIndex index, Object value);
}

XML Configuration

<cache
    xmlns="http://geode.apache.org/schema/cache"
    xmlns:lucene="http://geode.apache.org/schema/lucene"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://geode.apache.org/schema/cache
        http://geode.apache.org/schema/cache/cache-1.0.xsd
        http://geode.apache.org/schema/lucene
        http://geode.apache.org/schema/lucene/lucene-1.0.xsd"
    version="1.0">

    <region name="region" refid="PARTITION">
        <lucene:index name="index">
           <lucene:field name="a" analyzer="org.apache.lucene.analysis.core.KeywordAnalyzer"/>
           <lucene:field name="b" analyzer="org.apache.lucene.analysis.core.SimpleAnalyzer"/>
           <lucene:field name="c" analyzer="org.apache.lucene.analysis.standard.ClassicAnalyzer"/>
           <lucene:serializer>
             <class-name>org.apache.geode.cache.lucene.FlatFormatSerializer</class-name>
           </lucene:serializer>
        </lucene:index>
    </region>
</cache>

If a serializer is not specified, the index will use the default HeterogeneousLuceneSerializer.

We will also provide a built-in implementation of LuceneSerializer called FlatFormatSerializer. With this serializer, users can specify nested fields using the syntax fieldnameAtLevel1.fieldnameAtLevel2 for both indexing and querying.

For example, in the following data model a Customer object contains both a collection of Person objects (contacts) and an array of Page objects (myHomePages). Each Person object also contains a Page field (homepage).

Code Block

public class Customer implements Serializable {
  private String name;
  private Collection<String> phoneNumbers;
  private Collection<Person> contacts; // search nested object
  private Page[] myHomePages;
  ......
}
public class Person implements Serializable {
  private String name;
  private String email;
  private int revenue;
  private String address;
  private String[] phoneNumbers;
  private Page homepage;
  .......
}
public class Page implements Serializable {
  private int id; // search integer in int format
  private String title;
  private String content;
  ......
}

The following example demonstrates how to index the nested fields contacts.name, contacts.email, contacts.address, and contacts.homepage.title.

Note: each segment is a field name, not a field type, because the Customer class could have more than one field of type Person, e.g. Person contacts and Person deliveryman. The field name is used to identify the parent field.
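The note above can be illustrated with a minimal reflection sketch. This is not the FlatFormatSerializer implementation; FieldPathDemo, DemoCustomer, and resolve are hypothetical names introduced only for illustration, and the lookup relies solely on java.lang.reflect. It shows why a path segment has to be a field name: two fields share the type Person, and only the name selects between them.

```java
import java.lang.reflect.Field;

public class FieldPathDemo {
  // Hypothetical stand-ins for the data model; not part of Geode.
  static class Person {
    String name;
    Person(String name) { this.name = name; }
  }

  static class DemoCustomer {
    Person contact;      // two fields of the same type...
    Person deliveryman;  // ...distinguished only by their field names
    DemoCustomer(Person contact, Person deliveryman) {
      this.contact = contact;
      this.deliveryman = deliveryman;
    }
  }

  /** Walks a dotted path such as "deliveryman.name", one field name per segment. */
  static Object resolve(Object root, String path) {
    Object current = root;
    for (String segment : path.split("\\.")) {
      if (current == null) {
        return null; // a null parent has no nested value
      }
      try {
        Field field = current.getClass().getDeclaredField(segment);
        field.setAccessible(true);
        current = field.get(current);
      } catch (ReflectiveOperationException e) {
        throw new IllegalArgumentException("No field '" + segment + "' in path " + path, e);
      }
    }
    return current;
  }

  public static void main(String[] args) {
    DemoCustomer customer = new DemoCustomer(new Person("tzhou11"), new Person("driver7"));
    System.out.println(resolve(customer, "contact.name"));
    System.out.println(resolve(customer, "deliveryman.name"));
  }
}
```

A type-based path such as "Person.name" would be ambiguous here, whereas "contact.name" and "deliveryman.name" each identify exactly one parent field.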


Code Block

// Get LuceneService
LuceneService luceneService = LuceneServiceProvider.get(cache);

// Create Index on fields, some are fields in nested objects:
luceneService.createIndexFactory()
      .setLuceneSerializer(new FlatFormatSerializer()) /* an out-of-box LuceneSerializer implementation */
      .addField("name")
      .addField("contacts.name")
      .addField("contacts.email")
      .addField("contacts.address")
      .addField("contacts.homepage.title")
      .create("customerIndex", "Customer");

// Now to create region
Region CustomerRegion = ((Cache)cache).createRegionFactory(shortcut).create("Customer");

gfsh command line:

Code Block

gfsh create lucene index --name=customerIndex --region=/Customer --field=name,contacts.name,contacts.email,contacts.address,contacts.homepage.title --serializer=org.apache.geode.cache.lucene.FlatFormatSerializer

The syntax for querying a nested field is the same as for a top-level field, but qualified with the parent field name, such as "contacts.name:tzhou11*". The qualifier identifies which "name" field is meant when fields with the same name exist at different levels of the object hierarchy.

Code Block

LuceneQuery query = luceneService.createLuceneQueryFactory().create("customerIndex", "Customer", "contacts.name:tzhou11*", "name");
 
PageableLuceneQueryResults<K,Object> results = query.findPages();

Out-Of-Box implementation

We will provide an out-of-box implementation of LuceneSerializer: FlatFormatSerializer.

...

For example, the FlatFormatSerializer will convert a Customer object into a document such as:

(name:John11), (contacts.name:tzhou11), (contacts.email:tzhou11@gmail.com), (contacts.address:15220 Wall St), (contacts.homepage.id:11), (contacts.homepage.title: Mr. tzhou11), (contacts.homepage.content: xxx)
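One way to picture this flattening is the sketch below. It is an illustration under stated assumptions, not the actual FlatFormatSerializer code: FlattenSketch, flatten, and collect are hypothetical names, the data classes are trimmed stand-ins for the model above, and the result is a plain map rather than a Lucene Document. The sketch assumes that each element of a collection or array contributes its own value to the same flattened field name.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlattenSketch {
  // Trimmed, hypothetical stand-ins for the data model; not part of Geode.
  static class Page {
    int id;
    String title;
    Page(int id, String title) { this.id = id; this.title = title; }
  }

  static class Person {
    String name;
    Page homepage;
    Person(String name, Page homepage) { this.name = name; this.homepage = homepage; }
  }

  static class Customer {
    String name;
    Collection<Person> contacts;
    Customer(String name, Collection<Person> contacts) { this.name = name; this.contacts = contacts; }
  }

  /** Maps each indexed field path to the values found for it in the object graph. */
  static Map<String, List<Object>> flatten(Object value, List<String> indexedFields) {
    Map<String, List<Object>> doc = new LinkedHashMap<>();
    for (String path : indexedFields) {
      List<Object> values = new ArrayList<>();
      collect(value, path.split("\\."), 0, values);
      doc.put(path, values);
    }
    return doc;
  }

  private static void collect(Object current, String[] segments, int depth, List<Object> out) {
    if (current == null) {
      return; // nothing to index below a null parent
    }
    if (current instanceof Collection) {
      // Fan out: every element is visited with the same remaining path.
      for (Object element : (Collection<?>) current) {
        collect(element, segments, depth, out);
      }
      return;
    }
    if (current.getClass().isArray()) {
      for (int i = 0; i < Array.getLength(current); i++) {
        collect(Array.get(current, i), segments, depth, out);
      }
      return;
    }
    if (depth == segments.length) {
      out.add(current); // reached the leaf value for this path
      return;
    }
    try {
      Field field = current.getClass().getDeclaredField(segments[depth]);
      field.setAccessible(true);
      collect(field.get(current), segments, depth + 1, out);
    } catch (ReflectiveOperationException e) {
      // Objects that do not declare this field contribute no value.
    }
  }

  public static void main(String[] args) {
    Customer customer = new Customer("John11", Arrays.asList(
        new Person("tzhou11", new Page(11, "Mr. tzhou11")),
        new Person("tzhou12", new Page(12, "Mr. tzhou12"))));
    System.out.println(flatten(customer,
        Arrays.asList("name", "contacts.name", "contacts.homepage.title")));
  }
}
```

Only the paths named in the index definition are visited, which mirrors the way the index is created with an explicit list of fields.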

Risks and Mitigations

...
