IDIEP-31
Author:
Sponsor:
Created: 4 March 2019
Status: ACTIVE

Motivation

Apache Ignite provides a consistency guarantee: each backup should contain the same value for the same key, at least eventually.

But this guarantee can be violated because of ...

  • bugs,
  • crashes,
  • non-idempotent operations executed on data more than once locally,
  • cosmic rays,
  • hackers,
  • a cleaning lady with a mop,

but ... most likely because of bugs.

And yes, we have had some (e.g. IGNITE-10078) and fixed them.

But what are the chances that we have fixed them all and will never write new ones? As long as Ignite is distributed ... such bugs are possible.

Description

Cache.withReadRepair()

The main idea is to provide a special read mode that reads the value from the primary and all backups and checks that the values are the same.

If the values differ, they should be repaired according to the chosen strategy.

So the final goal is the ability to detect and fix consistency issues.
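
A minimal usage sketch of the proposed mode (assuming a proxy-style API per the heading above; the cache name and types are illustrative):

    IgniteCache<Integer, String> cache = ignite.cache("myCache");

    // Hypothetical: obtain a proxy that checks consistency on every read.
    IgniteCache<Integer, String> checked = cache.withReadRepair();

    // A regular get(): the value is read from the primary and all backups;
    // if the copies differ, they are repaired according to the chosen strategy.
    String val = checked.get(42);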

Case #1 - "offline check with eventual fix"

Currently, we are able to use the "idle_verify" feature to detect broken partitions.

Once we have detected them, we should be able to fix them.
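
A hypothetical repair pass for this case (the partition number comes from the "idle_verify" report; iterating via ScanQuery is just one possible approach):

    IgniteCache<Integer, String> fixer = ignite.<Integer, String>cache("myCache").withReadRepair();

    // Re-read every key of the broken partition through the read-repair proxy,
    // so each mismatch gets detected and repaired.
    for (Cache.Entry<Integer, String> e : fixer.query(new ScanQuery<Integer, String>().setPartition(brokenPart)))
        fixer.get(e.getKey());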

Case #2 - "online check"

Another way is to use this feature on each get request.

Case #3 - "background check"

One more way is to permanently check all entries in a background loop.
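
A naive sketch of such a background guard (a daemon thread endlessly re-reading all entries through the read-repair proxy; the sleep is a primitive throttle, see risk 2 below):

    Thread guard = new Thread(() -> {
        IgniteCache<Integer, String> checked = ignite.<Integer, String>cache("myCache").withReadRepair();

        while (!Thread.currentThread().isInterrupted()) {
            // One full pass over the cache; each get() verifies all copies of the key.
            for (Cache.Entry<Integer, String> e : checked.query(new ScanQuery<Integer, String>())) {
                checked.get(e.getKey());

                try {
                    Thread.sleep(10); // Primitive throttle to limit cluster load.
                }
                catch (InterruptedException ignored) {
                    return;
                }
            }
        }
    });

    guard.setDaemon(true);
    guard.start();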

Possible strategies

Quorum (When majority wins)

But what if we have 3+ different values for the same key across the topology?
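
A resolution sketch that makes the problem visible: with 3+ distinct values no copy may reach a majority, so the strategy yields no winner (pure-Java sketch, not part of any agreed API):

    static <V> Optional<V> quorum(Collection<V> copies) {
        Map<V, Long> votes = copies.stream()
            .collect(Collectors.groupingBy(v -> v, Collectors.counting()));

        // A value wins only if it is present on more than half of the copies.
        return votes.entrySet().stream()
            .filter(e -> e.getValue() > copies.size() / 2)
            .map(Map.Entry::getKey)
            .findFirst();
    }

    // quorum(List.of("a", "a", "b")) -> Optional[a]
    // quorum(List.of("a", "b", "c")) -> Optional.empty, repair is impossible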

Primary or oldest node always wins

It's not necessarily true.

Bugs are able to make every node, even the primary, outdated.

LWW (Last Write Wins)

Not a 100% guarantee, but simple!

It seems suitable because each value is associated with a GridCacheVersion, which is comparable.
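
A minimal LWW resolution sketch under that assumption (GridCacheVersion is an internal class implementing Comparable; the generic signature here is illustrative):

    static <V> V lww(Map<GridCacheVersion, V> copies) {
        // The copy written under the greatest version is treated as the latest write.
        return copies.entrySet().stream()
            .max(Map.Entry.comparingByKey())
            .map(Map.Entry::getValue)
            .orElse(null);
    }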

The strategy provided by the user

Best case.

Can be implemented as an add-on.
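
One possible shape for such an add-on, as a user-facing SPI (fully hypothetical; none of these names are part of any agreed API):

    /** Decides which value is correct when the copies of a key disagree. */
    public interface ReadRepairStrategy<K, V> {
        /**
         * @param key Conflicting key.
         * @param copies Observed value per node ({@code null} for a missing copy).
         * @return Value to be written to all copies of the key.
         */
        V resolve(K key, Map<ClusterNode, V> copies);
    }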

Risks and Assumptions

1) LWW and any other strategy do not guarantee that the correct value will be chosen.

We have to record an event that contains all the observed values and the chosen one.

The event will allow us to

- notice that we have hit an inconsistent state,

- investigate manually which value is correct and re-repair if necessary (see the listener sketch below).
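
A listener sketch for consuming such an event (the event type constant EVT_CONSISTENCY_VIOLATION is hypothetical here, pending implementation):

    ignite.events().localListen(evt -> {
        // The event is expected to carry all observed copies and the chosen value,
        // so an operator can verify the automatic choice and re-repair manually.
        System.err.println("Consistency violation repaired: " + evt);

        return true; // Keep listening.
    }, EVT_CONSISTENCY_VIOLATION);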


1.1) It seems it is not possible to fix every case.

For transactional caches we are able to perform a pessimistic serializable transaction per key.

But atomic caches cannot be fixed this way.
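
A per-key repair sketch for the transactional case (resolveCopies() is a hypothetical helper that reads all copies and applies the chosen strategy):

    // A pessimistic SERIALIZABLE transaction locks the key, so the repair
    // cannot race with concurrent updates of the same key.
    try (Transaction tx = ignite.transactions().txStart(
        TransactionConcurrency.PESSIMISTIC, TransactionIsolation.SERIALIZABLE)) {
        String fixed = resolveCopies(key); // Hypothetical: apply the chosen strategy.

        cache.put(key, fixed); // The transactional put updates every copy consistently.

        tx.commit();
    }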


2) The consistency guard is able to produce the following problems:

- replacing hot data with cold data

We have to have a special configuration property that allows or restricts checking (reading) data available only at the persistence layer.

- decreased throughput/latency metrics

Some throttling feature should be implemented (a configuration sketch follows below).
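
Hypothetical configuration knobs addressing both problems (the class and property names are purely illustrative, not an existing API):

    ConsistencyGuardConfiguration guardCfg = new ConsistencyGuardConfiguration()
        .setCheckPersistenceOnlyData(false)        // Do not pull cold, disk-only data into RAM.
        .setThrottleBytesPerSec(10 * 1024 * 1024); // Cap the background read rate.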

Discussion Links

Initial review request - http://apache-ignite-developers.2346864.n4.nabble.com/Consistency-check-and-fix-review-request-td41629.html

"Idle verify" to "Online verify" discussion - http://apache-ignite-developers.2346864.n4.nabble.com/quot-Idle-verify-quot-to-quot-Online-verify-quot-td41928.html

Second review request - http://apache-ignite-developers.2346864.n4.nabble.com/Read-Repair-ex-Consistency-Check-review-request-2-td42421.html

Reference Links

Cassandra's Read Repair feature - https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesReadRepair.html

Idle_verify - https://apacheignite-tools.readme.io/docs/control-script#section-verification-of-partition-checksums

Tickets
