Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.


IDIEP-[97]
Author
Sponsor
Created

 

Status

Status
titleDRAFT


Table of Contents

Motivation

...

Code Block
languagejava
titleTable
CREATE ZONE 
	{ database_name.schema_name.distribution_zone_name | schema_name.distribution_zone_name | distribution_zone_name }
	[WITH 
		[
			<data_nodes_auto_adjust> |
			DATA_NODES_FILTER = filter |
			(<data_nodes_auto_adjust>, DATA_NODES_FILTER = filter)
		],
		[PARTITIONS = partitions],
    	[REPLICAS = replicas],
		[AFFINITY_FUNCTION = function]
	]
[;]

<data_nodes_auto_adjust> ::= [
	DATA_NODES_AUTO_ADJUST_SCALE_UP = scale_up_value |
	DATA_NODES_AUTO_ADJUST_SCALE_DOWN = scale_down_value |
	(DATA_NODES_AUTO_ADJUST_SCALE_UP = scale_up_value & DATA_NODES_AUTO_ADJUST_SCALE_DOWN = scale_down_value) | DATA_NODES_AUTO_ADJUST  = auto_adjust_value
]

...

The easiest way to understand auto_adjust semantics is to take a look at a few examples. Let’s start with a simplest one - scale_up example:

-1

start Node A;

start Node B;

start Node C;

CREATE ZONE zone1 WITH   DATA_NODES_AUTO_ADJUST_SCALE_UP = 300;

CREATE TABLE Accounts … WITH  PRIMARY_ZONE = zone1 

User starts three Ignite nodes A, B, C and creates table Accounts specifying scale up auto adjust timeout as 300 seconds. Accounts table is created on current topology, meaning that <Transaction>.dataNodes = [A,B,C]

0

start Node D ->

    Node D join/validation ->

    D enters logical topology ->

    logicalTopology.onNodeAdded(Node D) ->

    scale_up_auto_adjust(300) timer is

    scheduled for the <Accounts> table.

At time 0 seconds the user starts one more Ignite node D, that joins the cluster. On entering logical topology the onNodeAdded event is fired. This event schedules a timer of 300 seconds for table Accounts after which the dataNodes of that table transitively through the distribution zone will be recalculated from [A,B,C] to [A,B,C,D]



250

start Node E -> 

    scale_up_auto_adjust(300) is

    rescheduled for the <Accounts> table.

At 250 seconds one more node is added, that action reschedules scale_up_auto_adjust timer for another 300 seconds.

550

scale_up_auto_adjust fired ->

    set table.<Accounts>.dataNodes = 

    [NodeA, NodeB, NodeC, Node D, Node E]

At 550 seconds scale_up_time is fired, that leads to <Transaction>dataNodes recalculation by attaching the nodes that were added to logical topology - Nodes D and E in the given example.

600

start Node F ->

     <Accounts> table schedules   

     scale_up_auto_adjust(300);

At 600 seconds one more node is added, there are no active scale_up_auto_adjust timers, so given events schedules new one.

Now it’s time to expand the example above with node that exits the cluster topology:

-1

start Node A;

start Node B;

start Node C;

CREATE ZONE zone1 WITH   DATA_NODES_AUTO_ADJUST_SCALE_UP = 300, DATA_NODES_AUTO_ADJUST_SCALE_DOWN= 300_000;

CREATE TABLE Accounts … WITH  PRIMARY_ZONE = zone1 

User starts three Ignite nodes A, B, C and creates table Accounts specifying scale up auto adjust timeout as 300 seconds. Accounts table is created on current topology, meaning that <Transaction>.dataNodes = [A,B,C]

0

start Node D ->

    Node D join/validation ->

    D enters logical topology ->

    logicalTopology.onNodeAdded(Node D) ->

    scale_up_auto_adjust(300) timer is

    scheduled for the <Accounts> table.

At time 0 seconds the user starts one more Ignite node D, that joins the cluster. On entering logical topology the onNodeAdded event is fired. This event, schedules a timer of 300 seconds for table Accounts after which the dataNodes of that table will be recalculated from [A,B,C] to [A,B,C,D]

100

stop Node C -> 

    scale_down_auto_adjust(300_000) timer

    is scheduled for the <Accounts> table.

At 100 seconds the user stops Node C (or it painfully dies). TableManager detects onNodeLeft(Node C) event and starts scale_down time for 300_000 seconds for table <Accounts>. Please pay attention that the node left doesn’t affect the scale_up timer.

250

start Node E ->

    scale_up_auto_adjust(300) timer is

    re-scheduled for the <Accounts> table.

At 250 seconds Node E is added, that re-schedules scale_up_auto_adjust timer for another 300 seconds. The important part here is that adding the node doesn’t change scale_down time only scale_up one.

550

scale_up_auto_adjust fired ->

    set table.<Accounts>.dataNodes = 

    [NodeA, NodeB, NodeC, Node D, Node E]

At 550 seconds scale_up_time is fired, that leads to <Transaction>dataNodes recalculation by attaching the nodes that were added to logical topology - Nodes D and E in the given example. Please pay attention that despite the fact there's no Node C in logical topology it still takes its place in <Transaction>.dataNodes. 

300100

scale_down_auto_adjust fired -> 

    set table.<Accounts>.dataNodes = 

    [NodeA, NodeB, Node D, Node E]

At 300_100 seconds scale_down_auto_adjust timer is fired, that leads to removing Node C from <Transaction>.dataNodes.

At this point we’ve covered DATA_NODES_AUTO_ADJUST_SCALE_UP DATA_NODES_AUTO_ADJUST_SCALE_DOWNand its combination. Let's take a look at the last remaining auto adjust property - DATA_NODES_AUTO_ADJUST. The reason we have one is to eliminate excessive rebalance in case of users intention on having the same value for both scale_up and scale_down. As we saw in the example above, the events of adding and removing nodes fall into the corresponding frames with a dedicated timer each: one for expanding the topology (adding nodes), and another for narrowing it (removing nodes), which in turn leads to two rebalances - one per each frame. If the user however wants to put both types of events (adding and removing nodes) in one frame with only one dataNodes recalculation and one rebalance, he should use the DATA_NODES_AUTO_ADJUST property.

...

Data nodes evaluation rules

Compatibility Matrix

AUTO_ADJUST_SCALE_UPAUTO_ADJUST_SCALE_DOWNAUTO_ADJUSTFILTER
AUTO_ADJUST_SCALE_UP



AUTO_ADJUST_SCALE_DOWN



AUTO_ADJUST



FILTER



As you can see, the only properties that are incompatible with each other are DATA_NODES_AUTO_ADJUST with any of the properties DATA_NODES_AUTO_ADJUST_SCALE_UP or DATA_NODES_AUTO_ADJUST_SCALE_DOWN or a combination of them.

...