ID	IEP-71
Author	Maksim Timonin
Sponsor	Maksim Timonin Andrey Mashenkov Taras Ledkov Anton Vinogradov Aleksey Plekhanov Ivan Daschinsky Maxim Muzafarov Nikolay Izhikov
Created	06.04.2021
Status	ACTIVE

Motivation

Proposal

Ignite will provide an API for querying and creating secondary (non-SQL) indexes with next features:

Public Ignite Java API (IgniteCache, ClientCache) for direct querying secondary indexes:
1. Query result is Iterator<Cache.Entry<K.V>>, result data is sorted by an index conditions;
2. Decrease an overhead on SQL stuff like query parsing, query planning, etc;
3. Clear Java syntax instead of sometimes non-intuitive Ignite SQL;
4. Flexibility in returned fields (cache key, cache val, both, only indexed fields). It helps improve performance for some cases even more.
Improved public Ignite Java API for creating secondary indexes:
1. Functional indexes with Java functional interfaces;
2. New Java API for dynamic creation of secondary indexes;
3. Remove SQL dependency (annotation QuerySqlField) if only index feature is required and SQL is not needed.
Transactional support for "select for update" index queries:
1. User will able to update cache entries for result of an index query within same transaction.

Current state

Ignite has limitations in search by non-key fields. The proposal's goal is to overcome the issues described below.

Syntax limitations for SqlFieldsQuery.
ScanQuery may be slow in case of full cache scan for a data with high selectivity.
Usage of affinity keys does not provide well performance for arbitrary range queries.
No way to provide a functional index.
No transaction guarantees ("select for update" query).

Show this limitations with an example entity: Good(long uid, @Index long ts). Task is to get all cache entries that are created since the last midnight.

1) Using SqlFieldsQuery.

FieldsQueryCursor<List<?>> result = cache.query(
	new SqlFieldsQuery("select _val from Good where ts > ?").setArgs(lastMidnightTs));

List<Good> goods = (List<Good>) result.getAll().get(0);

Pros are:

SQL query uses embedded fast B+Tree index;
Flexibility in returned fields (key only, val only, both, or only indexed fields).

Cons are:

SQL query syntax:
1. User may not use SQL anywhere in a project. So this query can be the only SQL query within a project, so it leads to bad UX.
2. SQL may be unclear for a user as it contains "_val" as column to return. Also it doesn't provide a type check, due to Java generics runtime erasion.
3. Result of a query is not a list of Cache.Entry, but a list of rows, where cache value is just one of column.
SQL does not give guarantee of using index for a query (no hints, also H2 does not automatically analyze selectivity of a data for a GridH2Table).
SQL does not support functional indexes.
SQL does not support transactions, no way to implement "select for update" query.

2) Using ScanQuery.

QueryCursor<Cache.Entry<Long, Good>> cursor = cache.query(new ScanQuery<>((k, v) -> v.ts > lastMidnightTs));

List<Cache.Entry<Long, Good>> result = cursor.getAll();

Pros are:

Clear Java syntax (generics, BiPredicate);
Returns Cache.Entry cursor.

Cons are:

Scan query may be slow as traverses all data in a cache for high selectivity data;
Scan queries does not support transactional updates, no way to implement "select for update" query;
No flexibility in returning fields (it returns just cache key-val pairs).

3) Leverage on partitioning.

Ignite provides an opportunity to define AffinityKey for a cache and optionally custom AffinityFunction.

Pros is:

Speed up queries that conditions match to an affinity:
1. ScanQuery have an attribute of partition number.
2. SqlFieldsQuery calculates required partition by self.
Custom affinity function can provide functional condition for data distribution.

Cons are:

Cache can have only one affinity function.
Affinity function maps cache entries to partitions, so range queries won't win there.
Custom affinity function can unevenly distribute data between nodes. For our example, custom affinity will distribute Goods data by date. But for some days there can be less data. So this function should be tested well before usage.

Description

It is suggested to implement the proposal step by step in 4 sequential phases.

Phase 1 (Index Queries API). User will get:

Create fast index queries with the IndexQuery class in public API (IgniteCache, ClientCache);
Possibility to query indexes created with: QueryEntity.setIndexes, or QuerySqlField.index withCacheConfiguration.setIndexedTypes.

Phase 2 (Index Creation API). User will get:

Improved a public Java API for creating indexes (annotation SecondaryIndex instead of SqlQueryFields);
New Java API for dynamic index creation;
Opportunity to create functional indexes.

Phase 3 (Return Flexibility). User will get:

Improved the IndexQuery API to get only required cache parts (cache key, cache value, both, or indexed fields only). It will make queries even more fast if only partial info is required.

Phase 4 (Select for update support):

Transactional support for "select for update" queries

Phase 1. Index Queries API.

1) Queries API

In this step we will provide Index Query API for existing indexes. Currently Ignite supports 3 ways to declare indexes:

Java API: CacheConfiguration.setQueryEntity.setIndexes(QueryIndex idxs... );
Java API: CacheConfiguration.setIndexedTypes + annotation QuerySqlField.index;
SQL API: Create index with SQL query.

Every index created with those API can be described with: CacheContext, cache value class / type, index name, fields

Index Query API will support those different index descriptions. API provides:

Public class IndexQuery for building index query:
1. it inherits public class Query<R> with R = Cache.Entry<K, V>;
2. it accepts index descriptions at a constructor;
3. it accepts index conditions (joint with AND) at object methods.
User provides index conditions that matches index fields to leverage on an index structure.

Internal changes for the query are:

Internal class IndexQueryDesc for transferring query between nodes.
Internal class IndexCondition for building index queries
New query type INDEX in the enum InternalGridCacheQueryType.

So this is how user will specify an index query:

// Creates an index query for an index, specifying value class or value type: 
// 1. Specify index description at constructor.
// 2. Index name is optional param, can try do best to find right index basing on specified criteria fields.
// 3. Index criteria (joint with AND operation) with methods. Order of fields in criteria doesn't matter

QueryCursor<Cache.Entry<Long, Good>> cursor = ignite.cache("GOOD").query(
	new IndexQuery<Long, Good>(Good.class, idxName?)  // idxName is optional.
		.setCriteria(gt("ts", lastMidnightTs), lt("price", 123.0))
);

QueryCursor<Cache.Entry<Long, Good>> cursor = ignite.cache("GOOD").query(
	new IndexQuery<Long, Good>("GOOD_TYPE", idxName?)  // idxName is optional.
		.setCriteria(gt("ts", lastMidnightTs), lt("price", 123.0))
);

Classes to implement the API:

// Public packages.

// IndexQuery extends
public IndexQuery<K, V> extends Query<Cache.Entry<K, V>> {

	private List<IndexCriteria> criteria = new ArrayList<>();

	// Index description.
	private @Nullable String idxName;
	private String valType;

	public IndexQuery lt(String field, Object val) {
		criteria.add(IndexCriteriaBuilder.lt(field, val));

		return this;
	}

	// Other methods are:
	// eq, notEq, gt, gte, lt, lte, between, in, notIn, min, max, predicate
}

// Internal packages.

class IndexCriteriaBuilder {
	public static IndexCriteria lt(String field, Object val);
}

abstract class IndexCriteria extends Serializable {
	private final List<String> fields;
}

// min, max
class MinMaxIndexCriteria extends IndexCondition {
	private final boolean max;
}

// gt, gte, lt, lte, between
class RangeIndexCriteria extends IndexCriteria {
	private final @Nullable Object lower;
	private final @Nullable Object upper;

	private final boolean lowerInclusive;
	private final boolean upperInclusive;
}

// in, notIn, eq, notEq
class InIndexCriteria extends IndexCriteria {
	private final Object[] vals;

	// Flag for not-in condition.
	private final boolean inverse;
}

// predicate
class PredicateIndexCriteria extends IndexCriteria {
	private IgnitePredicate<?> predicate;
}

2) Query Processing

Currently Ignite have infrastructure for processing index queries: GridIndexingManager, IndexingSpi, SpiQuery. So it is possible to use this code to implement processing of index queries: provide a default implementation for IndexingSpi and make IndexQuery as a wrapper on SpiQuery. But there are some issues:

GridIndexingManager (and IndexingSpi) supports 3 methods: store, remove, query.
1. Store/remove methods work with cache (K, V). Looks like it is implemented for storing the data in user defined structures, independent to Ignite storages. So SpiQuery aimed to query those user defined storages.
2. Ignite has the IndexProcessor that is responsible for storing / removing index entries, and it works directly with CacheDataRow.
3. Our goal is run queries on indexes are handled with Ignite storages (including already existing indexes).
Some users may provide a custom implementation of IndexingSpi by some reasons. Then user will have to choose between own implementation and new one.
Name "SpiQuery" isn't obvious for a goal of querying secondary indexes.
IndexingSpi.query function has the "cache" param. But cache actually is only a one of attributes describes an index. Also SQL indexes are not described with cache at all, but schema/table instead. Such method doesn't fit us well.

Then proposal is:

Introduce new type of query - INDEX;
Final query processing is performed within IndexQueryProcessor;
Entrypoint for distributed index queries is the IndexQueryProcessor.queryDistributed method, that leverage on other cache queries (ScanQuery, TextQuery):
1. the Map phase is node-local querying of index, it returns sorted data by definition;
2. filtering with IndexQuery.setFilter is analogue of ScanQuery.setFilter;
3. the Reduce phase is responsible for the Merge Sort of data from multiple nodes;
Entrypoint for local query is the IndexQueryProcessor.queryLocal method. It accepts users IndexQueryand IndexingQueryFilter for filtering result cache entries (primary partition, MVCC, etc).

Rules to write criteria:

Order of field in criteria doesn't matter.
Fields can be used any times, Ignite uses AND to build final index range.

Rules to choose index:

If index name specified - then get index with this name and check that criteria fields match it.
If index name isn't specified - then find index that matches all specified criteria fields. Fields have to be prefix of an index.

// Internal package.

public class IndexQueryProcessor extends GridProcessorAdapter {
	// Provides access to indexes.
	private final IndexProcessor processor;

	// Entrypoint for local query.
    public Iterator<Cache.Entry<?,?>> queryLocal(IndexQuery idxQuery, @Nullable IndexingQueryFilter filters) throws IgniteException {
		// 1. If user specified index name, then check a query:
		// - sort query fields in index key order;
		// - check that it's a valid index query: fields covers index keys (from left to right)
		// - fail otherwise.
		// 2. If user doesn't specified index name:
		// - get all indexes for specified cache and Value.class;
		// - find index by filtering by a query fields (index keys must be in the same order as specified query fields, try sort fields to match any index).
		// - validate index query (see validation steps from 1.)
		Index idx = index(idxQuery.desc());

		// 1. Parse index conditions.
		// 2. Validate index condition, index type.
		// 3. Maps index conditions to an index query methods.
		// 4. Perform index operations, get Cursor.
		GridCursor<IndexRow> cursor = query(idx, idxQry.conditions());

		// 1. Map IndexRow to Cache entry.
		// 2. Apply specified cache filters (primary partitions, MVCC versions, etc.)
		// 3. Wrap cursor to iterator and return.
		return map_and_filter(cursor, filters);
	}

	private GridCursor<IndexRow> query(Index idx, List<IndexCondition> conditions) {
		// eq(key) -> idx.findOne(key)
		// notEq(key) -> idx.find(null, null, current -> current != key)
		// gt(key) -> idx.find(key, null, current -> current != key)
		// gte(key) -> idx.find(key, null)
		// lt(key) -> idx.find(null, key, current -> current != key)
		// lte(key) -> idx.find(null, key)
		// between(left, right) -> idx.find(left, right), inclusive only
		// in(keys...) -> idx.findOne(keys[0]) && idx.findOne(keys[1]) ...
		// notIn(keys...) -> idx.find(null, null, current -> !keys.contains(current))
		// min() -> idx.findFirst()
		// max() -> idx.findLast()
		// predicate(p) -> idx.find(null, null, p)
	}
}

Phase 2. IndexScan.

For queries that doesn't match index Ignite will run the index scan find(null, null) and apply user conditions on it. Cases are:

index conditions joint with OR;
different sort order: idx(ts desc, price asc) but query(ts desc, price desc);
Cache is wide and IndexScan on few fields is much cheaper than ScanQuery;
setFilter operation with a custom user condition.

IndexName is optional. If it's not specified than take PK index of specified table.

// The "predicate" operation can't find best index for a user query. So user must provide an index name for this operation. 
QueryCursor<Cache.Entry<Long, Good>> cursor = ignite.cache("GOOD").query(
	new IndexQuery<>("GOOD_TBL", "GOOD_TS_DESC_PRICE_ASC_IDX")
		.setFilter((good) -> good.ts > lastMidnightTs || price > 100)
);

Phase 3. Index creation API.

Currently an index can be declared statically (with CacheConfiguration) or created dynamically only with SQL ddl query. Both ways doesn't support functional indexes. Then, implementation will have next steps:

Provide a mechanism of creating index on distributed nodes (from CacheConfiguration, or dynamically):
1. There is the function IgniteH2Indexing.dynamicIndexCreate that handles the SchemaIndexCreateOperation operation;
2. Looks like non-SQL it can be done in this way too with a slight refactoring, and handling SQL and non-SQL indexes differently. Condition (QueryIndex.tableName == null) can be used to separate SQL and non-SQL indexes;
Introduce Functional indexes. Some assumptions are:
1. Add new QueryIndexType - FUNCTIONAL_SORTED;
2. Then we can use QueryIndex class with set function name as a field name in QueryIndex definition;
3. Function is distributed between nodes with the PeerClassLoading feature as IgniteCallable;
Provide a new IndexFactory, IndexDefinition for such indexes.

API for creating non-SQL indexes:

On cache start: Introduce annotation SecondaryIndex:
1. Actually it is alias for QuerySqlField to avoid the "Sql" prefix.
2. But it can be used to annotated a function (the SQL annotation does not support it) in user class to declare functional index.
Dynamically:
1. Extend IgniteCache, ClientCache with method "createIndex" that accepts QueryIndex object.

User will apply SecondaryIndex on POJO field or function:

// Using annotation @SecondaryIndex that is actually alias for QuerySqlField.

public class Good {
	private long uid;
	@SecondaryIndex(descending = true)
	private long ts;
	private float price;
	private String title;

	@SecondaryIndex(name = "GOOD_DISCOUNT_PRICE_IDX")
	public float discontPrice() {
		return price * discont();
	}

	/** Function that depends on attributes like count of goods, geography, etc. */
	protected float discont();
}

Dynamic index creation is a new method in IgniteCache, ClientCache:

// Dynamic index creation.

QueryIndex qryIdx = new QueryIndex()
	.setName("GOOD_DISCOUNT_PRICE_IDX")
	.setFieldNames(Collections.singleton("discountPrice"))
	.setIndexType(QueryIndexType.FUNCTIONAL_SORTED);

cache.createIndex(qryIdx)

Phase 4. Flexibility of return.

Goal is to provide opportunity to a user to result partially and skip unneeded hops to a Cache:

Cache Key only:
1. If inline is enabled then cache key is stored within inlined fields. So there is no need to ask a Cache for entries at all;
2. if inlined is disabled then there will be only CacheDataRow.key() call without triggering CacheDataRow.value().
Cache Value only: if only value is required then trigger only CacheDataRow.value().
Indexed fields only:
1. if inline is enabled then just construct a new object (using reflection for user class), and fill it with inlined values.

User will specify a flag in IndexQuery: KeyOnly, ValueOnly, IndexFieldsOnly. Result for queries are:

KeyOnly: QueryCursor<Cache.Entry<K, V>> where value is always NULL;
ValueOnly: QueryCursor<Cache.Entry<K, V>> where key is always NULL;
IndexFieldsOnly: QueryCursor<Cache.Entry<K, V>> where key is always NULL, value is an object with NULL in non-indexed fields.

Phase 5. Support of SELECT FOR UPDATE for Index queries.

Goal is to provide "select for update" query with transactional support for index queries:

User provide:
1. an index query
2. an update function
3. transaction parameters (concurrency, isolation);
Ignite:
1. start a transaction
2. finds all cache entries matches for a query
3. apply the update function for every cache row
4. close transaction
5. return to user cursor with updated cache rows.

Deep design of solution is TBD.

Risks and Assumptions

Supporting SQL and non-SQL indexes on the same field - they must work equally.
Rebuild, clear, defragmentation tasks, control.sh, system views, IndexRowCache, security checks, etc(?) for non-SQL indexes - all of the tooling should work correctly with non-SQL indexes.
Need support for tracking index queries (security checks, tracing, query views).
Transactional support for "update for select" should be discovered.
There can be performance issues with IndexFieldsOnly queries due to reflection, or wrong inlining.

Dev list discussion

http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSS-IEP-71-Public-API-for-secondary-index-search-td52177.html

Jira Tickets

IGNITE-14699

IGNITE-14703

Page tree

IEP-71: Public API for secondary index search

Motivation

Proposal

Current state

1) Using SqlFieldsQuery.

2) Using ScanQuery.

3) Leverage on partitioning.

Description

Phase 1. Index Queries API.

1) Queries API

2) Query Processing

Phase 2. IndexScan.

Phase 3. Index creation API.

Phase 4. Flexibility of return.

Phase 5. Support of SELECT FOR UPDATE for Index queries.

Risks and Assumptions

Dev list discussion

Jira Tickets