
...

The order in which nodes join is not relevant. It is possible that the oldest node has an older partition state while the joining node has a higher partition counter. In this case rebalancing is triggered by the coordinator, and rebalancing is performed from the newly joined node to the existing one (note that this behaviour may change under IEP-4 Baseline topology for caches).


Advanced Configuration

WAL History Size

In the corner case, keeping WAL for only one checkpoint into the past is sufficient for successful recovery (PersistentStoreConfiguration#walHistSize).
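As an illustration only (a minimal sketch; setter names may differ between Ignite versions), the WAL history size could be set like this:

    // Sketch: setting the WAL history size (number of checkpoints kept in WAL history).
    // Assumes the legacy PersistentStoreConfiguration API referenced above.
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.configuration.PersistentStoreConfiguration;

    public class WalHistorySizeExample {
        public static IgniteConfiguration configure() {
            PersistentStoreConfiguration psCfg = new PersistentStoreConfiguration();

            // Keep WAL segments covering the last 20 checkpoints (the value here is arbitrary).
            psCfg.setWalHistorySize(20);

            return new IgniteConfiguration().setPersistentStoreConfiguration(psCfg);
        }
    }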

...

This type of I/O implementation operates with files using the standard Java interface; java.nio.channels.FileChannel is used.

This implementation was used by default before Ignite 2.4.

To switch to this implementation, set the factory in the configuration (DataStorageConfiguration#setFileIOFactory) or use the system property IgniteSystemProperties.IGNITE_USE_ASYNC_FILE_IO_FACTORY = "false".
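A minimal sketch of both approaches, assuming Ignite 2.4+ (RandomAccessFileIOFactory is an internal class and is shown here only for illustration):

    // Sketch: forcing the synchronous, FileChannel-based I/O implementation.
    import org.apache.ignite.IgniteSystemProperties;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIOFactory;

    public class SyncFileIoExample {
        public static IgniteConfiguration configure() {
            // Option 1: set the factory explicitly in the storage configuration.
            DataStorageConfiguration dsCfg = new DataStorageConfiguration();
            dsCfg.setFileIOFactory(new RandomAccessFileIOFactory());

            // Option 2: disable the async factory via the system property
            // (equivalent to passing -DIGNITE_USE_ASYNC_FILE_IO_FACTORY=false to the JVM).
            System.setProperty(IgniteSystemProperties.IGNITE_USE_ASYNC_FILE_IO_FACTORY, "false");

            return new IgniteConfiguration().setDataStorageConfiguration(dsCfg);
        }
    }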

 

This type of I/O is always used for WAL files.

Async I/O

This option is the default since Ignite 2.4.

It was introduced to protect the I/O module and underlying files from the close-by-interrupt problem.

...

To set this implementation, set the factory in the configuration (DataStorageConfiguration#setFileIOFactory) or use the system property IgniteSystemProperties.IGNITE_USE_ASYNC_FILE_IO_FACTORY = "true".
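Likewise, a sketch for explicitly selecting the asynchronous factory (AsyncFileIOFactory is an internal class, shown only for illustration):

    // Sketch: explicitly selecting the asynchronous I/O factory (the default since 2.4).
    import org.apache.ignite.IgniteSystemProperties;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory;

    public class AsyncFileIoExample {
        public static DataStorageConfiguration configure() {
            DataStorageConfiguration dsCfg = new DataStorageConfiguration();

            // Either set the factory directly...
            dsCfg.setFileIOFactory(new AsyncFileIOFactory());

            // ...or request it via the system property
            // (equivalent to -DIGNITE_USE_ASYNC_FILE_IO_FACTORY=true on the JVM command line).
            System.setProperty(IgniteSystemProperties.IGNITE_USE_ASYNC_FILE_IO_FACTORY, "true");

            return dsCfg;
        }
    }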

Direct I/O

Introduction and Requirements

Since Ignite 2.4 there is a plugin for enabling Direct I/O mode.

...

The plugin works on Linux with kernel version 2.4.2 or newer. It switches the input/output for durable (page) memory to use Direct I/O mode for files. If an incompatible OS or file system is used, the plugin has no effect and falls back to the regular I/O implementation. The durable memory page size must be not less than the physical device block size and the Linux system page size, and must be divisible by the underlying OS and FS block sizes. Usually both sizes are 4 KB, so the default page size is usually sufficient to enable the plugin.
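For example (a sketch; 4096 is only the typical block size mentioned above, check your device and file system):

    // Sketch: making the durable memory page size explicit so it satisfies the
    // block-size requirements described above.
    import org.apache.ignite.configuration.DataStorageConfiguration;

    public class PageSizeExample {
        public static DataStorageConfiguration configure() {
            DataStorageConfiguration dsCfg = new DataStorageConfiguration();

            // Page size must be >= the device block size and the Linux page size,
            // and divisible by the underlying OS/FS block sizes (usually 4 KB).
            dsCfg.setPageSize(4096);

            return dsCfg;
        }
    }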

Configuration

There is no need for additional configuration of the plugin. It is sufficient to add ignite-direct-io.jar and the Java Native Access (JNA) library jar (jna-xxx.jar) to the classpath.

...

However, disabling the plugin's function is possible through a system property. To disable Direct I/O, set IgniteSystemProperties#IGNITE_DIRECT_IO_ENABLED to false.
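For instance (assuming the IGNITE_DIRECT_IO_ENABLED constant is present in IgniteSystemProperties for your Ignite version):

    // Sketch: disabling the Direct I/O plugin even when ignite-direct-io.jar is on the classpath.
    // The same effect can be achieved with -DIGNITE_DIRECT_IO_ENABLED=false on the JVM command line.
    import org.apache.ignite.IgniteSystemProperties;

    public class DisableDirectIoExample {
        public static void main(String[] args) {
            System.setProperty(IgniteSystemProperties.IGNITE_DIRECT_IO_ENABLED, "false");

            // ... start the Ignite node afterwards.
        }
    }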

When Direct I/O mode is enabled, Ignite bypasses the Linux page cache: the management of pages is fully conveyed to Ignite.

WAL and Native I/O

 

The write-ahead log does not have blocks (chunks/pages) as the durable memory page store does, so Direct I/O mode currently cannot be enabled for WAL files. WAL logging always goes through conventional system I/O calls.

However, when the ignite-direct-io plugin is configured successfully, WAL logging still obtains a benefit.

The ignite-direct-io plugin allows the WAL manager to advise the Linux system that a file's data is no longer needed (native call posix_fadvise with the POSIX_FADV_DONTNEED flag).

This advice is only a recommendation to the Linux system: the WAL manager asks it not to keep the file's data in the page cache, as the data is not required anymore. As a result, WAL data still goes to the page cache first, but following this advice Linux will flush and evict it during the next page cache scan.
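As a rough, illustrative sketch (not the plugin's actual code; it assumes JNA 5+ and a hypothetical binding named CLib), such an advise call could look like this:

    // Sketch: telling the kernel that a range of written WAL data need not stay in the page cache.
    import com.sun.jna.Library;
    import com.sun.jna.Native;
    import com.sun.jna.NativeLong;

    public class FadviseSketch {
        /** Minimal, illustrative libc binding for posix_fadvise(2). */
        public interface CLib extends Library {
            CLib INSTANCE = Native.load("c", CLib.class);

            int posix_fadvise(int fd, NativeLong offset, NativeLong len, int advice);
        }

        /** POSIX_FADV_DONTNEED value on Linux. */
        private static final int POSIX_FADV_DONTNEED = 4;

        /** Advises the kernel that the given byte range of the file descriptor can be evicted. */
        public static void dontNeed(int fd, long offset, long len) {
            int rc = CLib.INSTANCE.posix_fadvise(fd, new NativeLong(offset), new NativeLong(len), POSIX_FADV_DONTNEED);

            if (rc != 0)
                System.err.println("posix_fadvise failed with error code " + rc);
        }
    }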

Direct I/O & Performance

Direct I/O can have a negative effect on page read performance: in Direct I/O mode all pages are read directly from disk, bypassing the Linux page cache.

...

Direct I/O can still be used in production with careful configuration, but currently Direct I/O mode is not enabled by default and is provided as a plugin.

A benefit of using Direct I/O is a more predictable duration of the fdatasync (fsync) operation. Since data is not accumulated in RAM and goes directly to disk, each fsync of the page store requires less time than fsync'ing all data from memory. Direct I/O does not guarantee fsync() immediately after a write, so checkpoint writers still call fsync at the end of a checkpoint.