...
Lucene has four underlying types that a docvalues field can have. Currently Solr uses three of these:
- NUMERIC: a single-valued per-document numeric type. This is like having a large long[] array for the whole index, though the data is compressed based upon the values that are actually used.
  - For example, consider 3 documents with these values:
      doc[0] = 1005
      doc[1] = 1006
      doc[2] = 1005
    In this example the field would use around 1 bit per document, since that is all that is needed.
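The "1 bit per document" figure for NUMERIC can be checked with a small sketch. It assumes the simplest possible scheme, storing each value as an offset from the smallest value in the field; Lucene's actual encoding is more elaborate, and the function name `encode_numeric` is hypothetical.

```python
def encode_numeric(values):
    """Store each value as an offset from the smallest value in the field.

    Illustration only: assumes a non-empty list and plain minimum-offset
    encoding, which is a simplification of what Lucene really does.
    """
    base = min(values)
    offsets = [v - base for v in values]
    # Bits needed to represent the largest offset (0 when all values are equal).
    bits = max(offsets).bit_length()
    return base, offsets, bits


base, offsets, bits = encode_numeric([1005, 1006, 1005])
print(base, offsets, bits)  # 1005 [0, 1, 0] 1
```

Only the offsets 0 and 1 need to be stored per document, which is why a single bit is enough here.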
- SORTED: a single-valued per-document string type. This is like having a large String[] array for the whole index, but with an additional level of indirection. Each unique value is assigned a term number that represents its ordinal value. So each document really stores a compressed integer, and separately there is a "dictionary" mapping these term numbers back to term values.
  - For example, consider 3 documents with these values:
      doc[0] = "aardvark"
      doc[1] = "beaver"
      doc[2] = "aardvark"
    Value "aardvark" will be assigned ordinal 0, and "beaver" 1, creating these two data structures:
      doc[0] = 0
      doc[1] = 1
      doc[2] = 0
      term[0] = "aardvark"
      term[1] = "beaver"
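The ordinal assignment described for SORTED can be reproduced in a few lines. This is a minimal sketch of the idea rather than Lucene's implementation; the name `build_sorted` is hypothetical, and Python's default string ordering stands in for Lucene's term ordering.

```python
def build_sorted(doc_values):
    """Return (per-document ordinals, term dictionary) for single-valued strings."""
    # The dictionary holds each unique value once, in sorted order.
    terms = sorted(set(doc_values))
    # A value's ordinal is its position in the sorted dictionary.
    ordinal_of = {term: i for i, term in enumerate(terms)}
    doc_ords = [ordinal_of[value] for value in doc_values]
    return doc_ords, terms


doc_ords, terms = build_sorted(["aardvark", "beaver", "aardvark"])
print(doc_ords)  # [0, 1, 0]
print(terms)     # ['aardvark', 'beaver']
```

Because ordinals follow the sorted order of the terms, comparing two documents by this field only requires comparing two integers.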
- SORTED_SET: a multi-valued per-document string type. It is similar to SORTED, except each document has a "set" of values (in increasing sorted order). So it intentionally discards duplicate values (frequency) within a document and loses order within the document.
  - For example, consider 3 documents with these values:
      doc[0] = "cat", "aardvark", "beaver", "aardvark"
      doc[1] =
      doc[2] = "cat"
    Value "aardvark" will be assigned ordinal 0, "beaver" 1, and "cat" 2, creating these two data structures:
      doc[0] = [0, 1, 2]
      doc[1] = []
      doc[2] = [2]
      term[0] = "aardvark"
      term[1] = "beaver"
      term[2] = "cat"
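The SORTED_SET behaviour (duplicates dropped, original order lost) can be sketched the same way. The name `build_sorted_set` is hypothetical, and the code illustrates only the resulting data structures, not how Lucene stores them.

```python
def build_sorted_set(doc_values):
    """Return (per-document sorted ordinal lists, term dictionary)."""
    # One shared dictionary of unique terms across all documents, in sorted order.
    terms = sorted({value for values in doc_values for value in values})
    ordinal_of = {term: i for i, term in enumerate(terms)}
    # Converting to a set drops duplicates; sorting discards the original order.
    doc_ords = [sorted({ordinal_of[v] for v in values}) for values in doc_values]
    return doc_ords, terms


docs = [["cat", "aardvark", "beaver", "aardvark"], [], ["cat"]]
doc_ords, terms = build_sorted_set(docs)
print(doc_ords)  # [[0, 1, 2], [], [2]]
print(terms)     # ['aardvark', 'beaver', 'cat']
```

Note that the second "aardvark" in the first document leaves no trace in the output, and a document with no values simply gets an empty list.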
- BINARY: a single-valued per-document byte[] array. This can be used for encoding custom per-document data structures.
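As one illustration of a custom per-document structure that could be carried in a BINARY value, the sketch below packs a pair of doubles into a fixed-width byte string and reads it back. The choice of a (latitude, longitude) pair, the big-endian layout, and the function names are all assumptions made for this example; nothing here is prescribed by Lucene or Solr.

```python
import struct

# Hypothetical layout: two big-endian 64-bit floats, 16 bytes per document.
POINT_FORMAT = ">2d"


def encode_point(lat, lon):
    """Serialize a (lat, lon) pair into the bytes stored for one document."""
    return struct.pack(POINT_FORMAT, lat, lon)


def decode_point(payload):
    """Recover the (lat, lon) pair from a document's stored bytes."""
    return struct.unpack(POINT_FORMAT, payload)


payload = encode_point(1.5, -2.25)
print(len(payload))           # 16
print(decode_point(payload))  # (1.5, -2.25)
```

The field itself only sees opaque bytes, so whatever reads the value must know the layout used to write it.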