...
Sharp Checkpointing has a side effect when the throughput of data updates is greater than the throughput of the physical storage device: under a heavy write load, the operations-per-second rate periodically drops to zero.
When offheap memory accumulates too many dirty pages (pages with data that is not yet written to disk), the Ignite node initiates a checkpoint: the process of writing a consistent snapshot of all pages to disk storage. If a dirty page is changed during an ongoing checkpoint before it is written to disk, its previous state is copied to a special data region, the checkpoint buffer.
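The checkpoint buffer size can be tuned per data region. A minimal sketch (the region name and sizes are illustrative, not recommendations):

    import org.apache.ignite.configuration.DataRegionConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;

    // Sketch: configure a persistent data region with an explicit checkpoint
    // buffer. If the buffer fills up during a checkpoint, updates stall, so a
    // slow device may call for a larger buffer.
    DataRegionConfiguration regionCfg = new DataRegionConfiguration()
        .setName("persistent-region")                      // illustrative name
        .setPersistenceEnabled(true)
        .setMaxSize(8L * 1024 * 1024 * 1024)               // 8 GB offheap region
        .setCheckpointPageBufferSize(1024L * 1024 * 1024); // 1 GB checkpoint buffer

    DataStorageConfiguration storageCfg = new DataStorageConfiguration()
        .setDefaultDataRegionConfiguration(regionCfg);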
Slow storage devices cause long-running checkpoints, and if the load stays high while a checkpoint is slow, two bad things can happen:
...
Either of the two events above causes the Ignite node to freeze all updates until the end of the current checkpoint. That is why the operations/sec graph drops to zero.
Since Ignite 2.3, the data storage configuration has the writeThrottlingEnabled property. If it is enabled, two situations can trigger throttling: the checkpoint buffer comes close to overflow, or the number of dirty pages approaches the limit that forces the next checkpoint.
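Enabling the property is a one-line change in the node configuration. A minimal sketch, assuming Ignite 2.4+ where the flag lives on DataStorageConfiguration:

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    // Sketch: start a node with Pages Write Throttling enabled.
    DataStorageConfiguration storageCfg = new DataStorageConfiguration()
        .setWriteThrottlingEnabled(true);

    IgniteConfiguration cfg = new IgniteConfiguration()
        .setDataStorageConfiguration(storageCfg);

    Ignition.start(cfg);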
If throttling is triggered, threads that generate dirty pages are slowed down with LockSupport.parkNanos(). Throttling stops when neither of the two conditions above holds (or when the checkpoint finishes). As a result, the node provides a steady operations/sec rate at the speed of the storage device instead of an initial burst followed by a long freeze.
There are two approaches to calculating how long to park a thread: exponential backoff (start with an ultra-short park; every subsequent park is <factor> times longer) and speed-based (collect a history of disk write speed measurements, extrapolate it to calculate an "ideal" speed, and bound the threads that generate dirty pages to that "ideal" speed). The Ignite node chooses between the two adaptively. A sketch of the exponential-backoff idea follows.
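A minimal sketch of the exponential-backoff variant (the constants are assumed values for illustration, not Ignite's actual defaults):

    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.locks.LockSupport;

    // Sketch of exponential backoff: each consecutive park while throttling
    // is needed lasts <factor> times longer than the previous one; the
    // counter resets as soon as the triggering condition clears.
    class ExponentialBackoffThrottle {
        private static final long STARTING_PARK_NS = 4_000; // assumed value
        private static final double FACTOR = 1.05;          // assumed value

        private final AtomicInteger exponent = new AtomicInteger();

        /** Invoked by a thread that has just marked a page dirty. */
        void onMarkDirty(boolean throttlingNeeded) {
            if (throttlingNeeded) {
                long parkNs = (long)(STARTING_PARK_NS * Math.pow(FACTOR, exponent.getAndIncrement()));
                LockSupport.parkNanos(parkNs);
            }
            else
                exponent.set(0); // Condition cleared: restart from a short park.
        }
    }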
There are two ways to find out whether Pages Write Throttling is affecting data update operations.
Take a thread dump: some threads will be waiting in LockSupport#parkNanos with "throttle" classes in the stack trace. Example stack trace:
    "data-streamer-stripe-4-#14%pagemem.PagesWriteThrottleSandboxTest0%@2035" prio=5 tid=0x1e nid=NA waiting
      java.lang.Thread.State: WAITING
          at sun.misc.Unsafe.park(Unsafe.java:-1)
          at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
          at org.apache.ignite.internal.processors.cache.persistence.pagemem.PagesWriteSpeedBasedThrottle.doPark(PagesWriteSpeedBasedThrottle.java:232)
          at org.apache.ignite.internal.processors.cache.persistence.pagemem.PagesWriteSpeedBasedThrottle.onMarkDirty(PagesWriteSpeedBasedThrottle.java:220)
          at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.allocatePage(PageMemoryImpl.java:463)
          at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.allocateDataPage(AbstractFreeList.java:463)
          at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:501)
          at org.apache.ignite.internal.processors.cache.persistence.RowStore.addRow(RowStore.java:102)
          at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.createRow(IgniteCacheOffheapManagerImpl.java:1300)
          at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.createRow(GridCacheOffheapManager.java:1438)
          at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:4338)
          at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:4296)
          at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3051)
          at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$6200(BPlusTree.java:2945)
          at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1717)
          ...
If throttling is applied, related statistics are dumped to the log from time to time:
    [2018-03-29 21:36:28,581][INFO ][data-streamer-stripe-0-#10%pagemem.PagesWriteThrottleSandboxTest0%][PageMemoryImpl] Throttling is applied to page modifications [percentOfPartTime=0,92, markDirty=9905 pages/sec, checkpointWrite=6983 pages/sec, estIdealMarkDirty=41447 pages/sec, curDirty=0,07, maxDirty=0,26, avgParkTime=741864 ns, pages: (total=169883, evicted=0, written=112286, synced=0, cpBufUsed=15936, cpBufTotal=241312)]
The most relevant part of this message is the percentOfPartTime metric. In the example it is 0.92: writing threads were stuck in LockSupport.parkNanos() for 92% of the time, which means very heavy throttling.
The message appears in the log once percentOfPartTime reaches the 20% threshold.
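Since this log line is the main throttling signal, it can also be watched programmatically. A minimal sketch (the 0.5 alert threshold is an arbitrary choice for illustration):

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch: extract percentOfPartTime from an Ignite throttling log line.
    // The value may be printed with a locale-specific decimal separator
    // (e.g. "0,92"), so both '.' and ',' are accepted.
    class ThrottleLogWatcher {
        private static final Pattern PERCENT =
            Pattern.compile("percentOfPartTime=(\\d+[.,]\\d+)");

        /** Returns the parked-time fraction, or -1 if the line does not match. */
        static double percentOfPartTime(String logLine) {
            Matcher m = PERCENT.matcher(logLine);
            return m.find() ? Double.parseDouble(m.group(1).replace(',', '.')) : -1;
        }

        public static void main(String[] args) {
            String line = "[PageMemoryImpl] Throttling is applied to page modifications [percentOfPartTime=0,92, ...]";
            double p = percentOfPartTime(line);
            if (p > 0.5) // arbitrary alert threshold
                System.out.println("Heavy throttling: threads parked " + (int)(p * 100) + "% of the time");
        }
    }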
...