...

The Interim Caching Layer

As of v6.0.0, the interim caching feature has been removed; please consider other options.

When we think about the storage of ATS, the original design supports many disks of the same size (it works best with raw block devices, without RAID), builds up the volumes (partitions), and then assigns each of the volumes to some domain (hostname). We found that without a big change to this storage design there is no easy way to achieve multi-tiered caching storage, so we came to the following INTERIM solution:
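
The idea, sketched below, is that the interim (SSD) device sits between the RAM cache and the slow disks: a read checks RAM first, then the interim device, then the disks, and a hot block served from the slow disks is copied onto the interim device. All names in this sketch are hypothetical; it only illustrates the intended lookup and promotion order, not the actual ATS read path.

Code Block

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical key type and tier helpers -- illustration only. */
typedef struct { unsigned long long hash; } CacheKey;

static bool ram_cache_lookup(const CacheKey *k)   { (void)k; return false; }
static bool interim_lookup(const CacheKey *k)     { (void)k; return false; }
static bool disk_lookup(const CacheKey *k)        { (void)k; return true;  }
static void promote_to_interim(const CacheKey *k) { (void)k; }

/* Lookup order: RAM cache -> interim (SSD) device -> slow disks.
 * A block served from the slow disks is copied to the interim
 * device, so later reads of that hot block hit the SSD instead. */
static bool cache_read(const CacheKey *key) {
  if (ram_cache_lookup(key))
    return true;
  if (interim_lookup(key))
    return true;
  if (disk_lookup(key)) {
    promote_to_interim(key);
    return true;
  }
  return false; /* miss: fetch from origin */
}

int main(void) {
  CacheKey k = { 0x5eedULL };
  printf("hit: %d\n", cache_read(&k));
  return 0;
}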

...

  • pros:
      • a solid solution without a big change to the storage architecture
      • the LRU helps us find the hot blocks, and it is efficient
      • the block-layer interim caching helps with small objects and big ones too
  • cons:
      • data is lost if the server process crashes (fixed in TS-2275)
      • only block devices can be used as the interim caching device
      • the interim caching device space is not an add-on, but a copy of the hot data on the slow devices
      • the maximum disk size of the storage is lowered from 0.5PB to 32TB
      • the interim caching function is not enabled in the default configuration (see the example after this list)
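
For reference, on the versions that shipped this feature, enabling it meant pointing the interim storage setting in records.config at the fast raw device. A minimal example, assuming the proxy.config.cache.interim.storage setting from the pre-6.0.0 line (the device path here is illustrative):

Code Block

CONFIG proxy.config.cache.interim.storage STRING /dev/sdd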

...

the change to the Dir struct:

Code Block

@@ -155,15 +157,42 @@ struct FreeDir
   unsigned int reserved:8;
   unsigned int prev:16;         // (2)
   unsigned int next:16;         // (3)
+#if TS_USE_INTERIM_CACHE == 1
+  unsigned int offset_high:12;   // 8GB * 4K = 32TB
+  unsigned int index:3;          // interim index
+  unsigned int ininterim:1;          // in interim or not
+#else
   inku16 offset_high;           // 0: empty
+#endif
 #else
   uint16_t w[5];
   FreeDir() { dir_clear(this); }
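
The "0.5PB to 32TB" note in the cons list above follows directly from the four bits borrowed from offset_high. A quick back-of-the-envelope check, assuming the 24-bit low offset field elsewhere in the Dir struct and the stock 512-byte cache block granularity (both assumptions on our part):

Code Block

#include <stdio.h>
#include <stdint.h>

int main(void) {
  const uint64_t block = 512; /* assumed cache block granularity */
  /* before: offset:24 + offset_high:16 = 40 offset bits */
  uint64_t before = (1ULL << 40) * block; /* 2^49 B = 512TB, i.e. ~0.5PB */
  /* after:  offset:24 + offset_high:12 = 36 offset bits */
  uint64_t after  = (1ULL << 36) * block; /* 2^45 B = 32TB */
  printf("before: %llu TB\n", (unsigned long long)(before >> 40));
  printf("after:  %llu TB\n", (unsigned long long)(after >> 40));
  return 0;
}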

we split the read.success stat into disk, interim, and ram variants:

Code Block

@@ -2633,6 +2888,11 @@ register_cache_stats(RecRawStatBlock *rsb, const char *prefix)
   REG_INT("read.active", cache_read_active_stat);
   REG_INT("read.success", cache_read_success_stat);
   REG_INT("read.failure", cache_read_failure_stat);
+  REG_INT("interim.read.success", cache_interim_read_success_stat);
+  REG_INT("disk.read.success", cache_disk_read_success_stat);
+  REG_INT("ram.read.success", cache_ram_read_success_stat);
   REG_INT("write.active", cache_write_active_stat);
   REG_INT("write.success", cache_write_success_stat);
   REG_INT("write.failure", cache_write_failure_stat);

...

We have a test system with a 160G SSD + 3 * 500G SAS disks, 16G RAM, and 4 cores; here is the output of tsar and iostat -x:

Code Block

Time           --------------------ts------------------ -------------ts_cache-----------
Time              qps    cons     Bps      rt     rpc      hit  ramhit    band  ssdhit
24/06/13-10:30 901.83   18.89   22.6M   17.36   47.74    87.30   68.08   88.90   22.49
24/06/13-10:35 934.12   18.88   22.0M   14.34   49.47    87.60   68.53   90.70   22.21
24/06/13-10:40 938.14   18.92   21.7M   15.36   49.58    87.70   68.02   89.50   22.45


avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.47    0.00   15.62   25.09    0.00   53.82

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     7.33   25.67    3.33  1600.00  1438.00   104.76     0.45   15.46  12.17  35.30
sdb               0.00     0.00   28.67   11.33  1461.00  8723.00   254.60     0.74   18.47  11.21  44.83
sdc               0.00     0.00   25.67    2.00  2178.00  1373.33   128.36     0.40   14.05  11.04  30.53
sdd               0.00     0.00  196.00    4.00 14790.00  2823.00    88.06     0.13    0.66   0.41   8.30
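
Assuming sdd is the 160G SSD acting as the interim device, the numbers match the intent: it absorbs most of the read traffic (196 r/s versus roughly 26 r/s per SAS disk) at a much lower await (0.66ms versus 14-18ms) and utilization (8.3% versus 30-45%), and the ssdhit column from tsar shows the interim device serving about 22% of requests.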

...