...

  • The percentage of dirty pages is a trigger for checkpointing (e.g. 75%).
  • A timeout is also a trigger: a checkpoint runs every N seconds (see the configuration sketch below).
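
The timeout-based trigger can be tuned through the checkpoint frequency setting of the data storage configuration. A minimal sketch, assuming the standard DataStorageConfiguration API (the class name is illustrative; 180 000 ms is also the default value):

Code Block
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CheckpointFrequencyExample {
    public static IgniteConfiguration configure() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Persistence must be enabled for checkpointing to take place.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Timeout-based trigger: run a checkpoint at least every 3 minutes.
        storageCfg.setCheckpointFrequency(3 * 60 * 1000L);

        return new IgniteConfiguration().setDataStorageConfiguration(storageCfg);
    }
}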

 

Pages Write Throttling

 

Sharp Checkpointing has a side effect when the throughput of data updates is greater than the throughput of the physical storage device. Under a heavy write load, the operations-per-second rate periodically drops to zero:

 

 

 

(image: operations/sec rate over time under heavy write load, periodically dropping to zero during checkpoints)

 

When offheap memory accumulates too many dirty pages (pages with data not yet written to disk), the Ignite node initiates a checkpoint: the process of writing a consistent snapshot of all pages to disk storage. If a dirty page is changed during an ongoing checkpoint before it has been written to disk, its previous state is copied to a special data region called the checkpoint buffer:

 

 

 

(image: a dirty page modified during an ongoing checkpoint has its previous state copied to the checkpoint buffer)

 

 

 

Slow storage devices cause long-running checkpoints, and if the load stays high while a checkpoint is slow, two bad things can happen:

 

  • The checkpoint buffer can overflow.
  • The dirty-pages threshold for the next checkpoint can be reached during the current checkpoint.

 

Either of the two events above causes the Ignite node to freeze all updates until the end of the current checkpoint. That is why the operations/sec graph falls to zero.
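
One way to lower the risk of checkpoint buffer overflow is to give the buffer more room. A hedged sketch, assuming the checkpointPageBufferSize property of DataRegionConfiguration (the 1 GB value and class name are illustrative, not a recommendation from this page):

Code Block
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;

public class CheckpointBufferExample {
    public static DataStorageConfiguration configure() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        DataRegionConfiguration regionCfg = storageCfg.getDefaultDataRegionConfiguration();
        regionCfg.setPersistenceEnabled(true);

        // Reserve 1 GB for the checkpoint buffer of this data region
        // (illustrative value; size it according to the expected write load).
        regionCfg.setCheckpointPageBufferSize(1024L * 1024 * 1024);

        return storageCfg;
    }
}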

 

Since Ignite 2.3, the data storage configuration has a writeThrottlingEnabled property (a configuration sketch follows the list below). When it is enabled, two situations can trigger throttling:

 

  • The checkpoint buffer is being filled too fast. A fill ratio above 66% triggers throttling.
  • The percentage of dirty pages grows too rapidly.
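
A minimal configuration sketch enabling the property (the class name and workload placeholder are illustrative; the property itself belongs to DataStorageConfiguration since Ignite 2.3):

Code Block
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class WriteThrottlingExample {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Slow down threads that dirty pages too fast instead of freezing all updates at once.
        storageCfg.setWriteThrottlingEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration().setDataStorageConfiguration(storageCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.cluster().active(true);
            // ... run the write-heavy workload here ...
        }
    }
}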

 

If throttling is triggered, the threads that generate dirty pages are slowed down with LockSupport.parkNanos(). Throttling stops when neither of the two conditions above holds (or when the checkpoint finishes). As a result, the node provides a constant operations/sec rate at the speed of the storage device instead of an initial burst followed by a long freeze.
There are two approaches to calculating the necessary park time for a thread: exponential backoff (start with an ultra-short park; every next park is <factor> times longer) and speed-based (collect a history of disk write speed measurements, extrapolate it to calculate an "ideal" speed, and bound the threads that generate dirty pages to that "ideal" speed). The Ignite node chooses between them adaptively.
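
A toy sketch of the exponential backoff idea (not Ignite's actual implementation; the starting park time and factor are made-up values):

Code Block
import java.util.concurrent.locks.LockSupport;

public class ExponentialBackoffSketch {
    private static final long STARTING_PARK_NANOS = 4_000; // made-up ultra-short initial park
    private static final double BACKOFF_FACTOR = 1.05;     // made-up growth factor

    private int throttleCounter;

    /** Called by a thread that dirtied a page while throttling is required. */
    void throttle() {
        long parkNanos = (long) (STARTING_PARK_NANOS * Math.pow(BACKOFF_FACTOR, throttleCounter++));
        LockSupport.parkNanos(parkNanos); // each consecutive park is <factor> times longer
    }

    /** Called when neither throttling condition holds any more (or the checkpoint has finished). */
    void reset() {
        throttleCounter = 0;
    }
}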

 

How to detect that throttling is applied

 

There are two ways to find out whether Pages Write Throttling is affecting data update operations.

 

  1. Take a thread dump: some threads will be waiting in LockSupport#parkNanos with "throttle" classes in the stack trace. Example stack trace:

    Code Block
    "data-streamer-stripe-4-#14%pagemem.PagesWriteThrottleSandboxTest0%@2035" prio=5 tid=0x1e nid=NA waiting
      java.lang.Thread.State: WAITING
    	  at sun.misc.Unsafe.park(Unsafe.java:-1)
    	  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
    	  at org.apache.ignite.internal.processors.cache.persistence.pagemem.PagesWriteSpeedBasedThrottle.doPark(PagesWriteSpeedBasedThrottle.java:232)
    	  at org.apache.ignite.internal.processors.cache.persistence.pagemem.PagesWriteSpeedBasedThrottle.onMarkDirty(PagesWriteSpeedBasedThrottle.java:220)
    	  at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.allocatePage(PageMemoryImpl.java:463)
    	  at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.allocateDataPage(AbstractFreeList.java:463)
    	  at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:501)
    	  at org.apache.ignite.internal.processors.cache.persistence.RowStore.addRow(RowStore.java:102)
    	  at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.createRow(IgniteCacheOffheapManagerImpl.java:1300)
    	  at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.createRow(GridCacheOffheapManager.java:1438)
    	  at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:4338)
    	  at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:4296)
    	  at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3051)
    	  at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$6200(BPlusTree.java:2945)
    	  at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1717)
    	  ...


  2. If throttling is applied, related statistics are dumped to the log from time to time:

    Code Block
    [2018-03-29 21:36:28,581][INFO ][data-streamer-stripe-0-#10%pagemem.PagesWriteThrottleSandboxTest0%][PageMemoryImpl] Throttling is applied to page modifications [percentOfPartTime=0,92, markDirty=9905 pages/sec, checkpointWrite=6983 pages/sec, estIdealMarkDirty=41447 pages/sec, curDirty=0,07, maxDirty=0,26, avgParkTime=741864 ns, pages: (total=169883, evicted=0, written=112286, synced=0, cpBufUsed=15936, cpBufTotal=241312)]
    
    

    The most relevant part of this message is the percentOfPartTime metric. In the example it is 0.92: writing threads are stuck in LockSupport.parkNanos() for 92% of the time, which means very heavy throttling.
    The message appears in the log once percentOfPartTime reaches the 20% threshold.


CRC validation

For each page, a control checksum (CRC32) is calculated. The CRC calculation is performed before each page is written to the page store. When a page is read from the page store, the CRC is calculated again over the read data and validated against the CRC field stored in the page.
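
A minimal illustration of the idea using java.util.zip.CRC32 (the actual Ignite CRC implementation may differ; the page layout below is invented for the example):

Code Block
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class PageCrcSketch {
    /** Computes a CRC32 checksum over the page contents (illustration only). */
    static int pageCrc(ByteBuffer page) {
        CRC32 crc = new CRC32();
        crc.update(page.duplicate()); // duplicate() so the original buffer position is untouched
        return (int) crc.getValue();
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(4096);
        page.putLong(0, 42L); // some page payload

        // Computed right before the page is written to the page store.
        int storedCrc = pageCrc(page);

        // After the page is read back, the CRC is recomputed and compared with the stored value.
        if (pageCrc(page) != storedCrc)
            throw new IllegalStateException("Page is corrupted: CRC mismatch");
    }
}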

...