| Compression method | GZIP | LZ4 | SNAPPY | UNCOMPRESSED |
|---|---|---|---|---|
| dataset | synthetic | synthetic | synthetic | synthetic |
| pagePointNum(ppn) | 10000 | 10000 | 10000 | 10000 |
| numOfPagesInChunk(pic) | 1000 | 1000 | 1000 | 1000 |
| chunksWritten(cw) | 10 | 10 | 10 | 10 |
| timeEncoding(te) | TS_2DIFF | TS_2DIFF | TS_2DIFF | TS_2DIFF |
| valueDataType(vt) | INT64 | INT64 | INT64 | INT64 |
| valueEncoding(ve) | PLAIN | PLAIN | PLAIN | PLAIN |
| compression(co) | GZIP | LZ4 | SNAPPY | UNCOMPRESSED |
| totalPointNum | 100000000 | 100000000 | 100000000 | 100000000 |
| tsfileSize(MB) | 767.1312866 | 770.8444319 | 767.9423904 | 781.4226151 |
| chunkDataSize_stats_mean(MB) | 76.71300761 | 77.08436436 | 76.7941264 | 78.14216614 |
| compressedPageSize_stats_mean(B) | 80375.41867 | 80764.81444 | 80460.47789 | 81874 |
| uncompressedPageSize_stats_mean(B) | 81874 | 81874 | 81874 | 81874 |
| timeBufferSize_stats_mean(B) | 1872 | 1872 | 1872 | 1872 |
| valueBufferSize_stats_mean(B) | 80000 | 80000 | 80000 | 80000 |
| [2] category: (A)get ChunkStatistic->(B)load on-disk Chunk->(C)get PageStatistics->(D)load in-memory PageData | | | | |
| [Avg&Per] (A)get_chunkMetadatas | 101654.7566 us - 0.9318259204986696% | 86735.1136 us - 1.1689451651044902% | 82918.49919999999 us - 1.0656107165107132% | 93896.1409 us - 1.2531889263513831% |
| [Avg&Per] (B)load_on_disk_chunk | 5949593.8917000005 us - 54.53739687304131% | 4849630.197000001 us - 65.35936296194448% | 5086618.179 us - 65.36966894765757% | 4851386.484999999 us - 64.74924062031137% |
| [Avg&Per] (C)get_pageHeader | 6613.381399999991 us - 0.060622054656159316% | 7120.158900000014 us - 0.09595969816001626% | 7692.346500000014 us - 0.09885667184764595% | 7605.696300000007 us - 0.10150975630087579% |
| [Avg&Per] (D_1)decompress_pageData | 2859428.0031000106 us - 26.211160404158996% | 521804.2723000004 us - 7.032452670194604% | 605202.8143000009 us - 7.777644443672276% | 498170.42259999976 us - 6.648853201570805% |
| [Avg&Per] (D_2)decode_pageData | 1991910.319299996 us - 18.25899474764487% | 1954657.4198999994 us - 26.343279504596413% | 1998880.5966999987 us - 25.68821922031179% | 2041517.9070999995 us - 27.247207495465567% |
| [3] D_1 compare each step inside | | | | |
| [Avg&Per] (D-1)7_1_data_ByteBuffer_to_ByteArray(us) | 65952.37819999998 us - 2.5136549346918415% | 108809.34350000018 us - 59.6896105506002% | 108132.35939999981 us - 43.622622294156905% | 110765.11740000003 us - 63.813731447511664% |
| [Avg&Per] (D-1)7_2_data_decompress_PageDataByteArray(us) | 2554687.926599999 us - 97.36728361519128% | 68904.38600000006 us - 37.79892271446546% | 135345.91170000008 us - 54.601079805416944% | 57547.215800000035 us - 33.15396273496119% |
| [Avg&Per] (D-1)7_3_data_ByteArray_to_ByteBuffer(us) | 811.8335000000624 us - 0.03094155721322126% | 1239.949800000022 us - 0.680200047933345% | 1184.7460000000272 us - 0.47794876167766753% | 1229.1439000000355 us - 0.7081314098345313% |
| [Avg&Per] (D-1)7_4_data_split_time_value_Buffer(us) | 2312.0582000000422 us - 0.08811989290364733% | 3338.2513999999974 us - 1.8312666870009688% | 3218.3657999999955 us - 1.29834913874849% | 4034.201500000018 us - 2.3241744076926314% |
| [3] D_2 compare each step inside | | | | |
| [Avg&Per] (D-2)8_1_createBatchData(us) | 5384.7852 us - 0.053292019060348375% | 5848.7599 us - 0.05759123169122766% | 5913.4963 us - 0.058362326692940975% | 6019.3023 us - 0.05943520403215091% |
| [Avg&Per] (D-2)8_2_timeDecoder_hasNext(us) | 1859842.2956 us - 18.406444711361424% | 1862234.7849 us - 18.336946086748988% | 1864092.3926 us - 18.397368271414525% | 1857778.6739 us - 18.343895858133802% |
| [Avg&Per] (D-2)8_3_timeDecoder_readLong(us) | 2074757.7936 us - 20.533415498567617% | 2084700.4377 us - 20.527508047369906% | 2063043.8916 us - 20.360888969091857% | 2069930.4964 us - 20.43870456313607% |
| [Avg&Per] (D-2)8_4_valueDecoder_read(us) | 1876012.952 us - 18.56648209392724% | 1881471.5433999998 us - 18.526365490982297% | 1877809.2412 us - 18.532744562964893% | 1876843.1276 us - 18.53214021585961% |
| [Avg&Per] (D-2)8_5_checkValueSatisfyOrNot(us) | 1780379.6374 us - 17.620020492363725% | 1780782.3133 us - 17.534904586680103% | 1781949.2049 us - 17.586668929952697% | 1780599.5789 us - 17.5818216126948% |
| [Avg&Per] (D-2)8_6_putIntoBatchData(us) | 2507922.0072 us - 24.82034518471963% | 2540605.1784 us - 25.016684556527476% | 2539577.912 us - 25.063966939883077% | 2536332.2055 us - 25.044002546143567% |

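B-class operations take longer than D-class operations. Likely cause: the synthetic values are random integers of type INT64 stored with PLAIN encoding, and none of the four compression methods achieves a high compression ratio on them, so the amount of data on disk is relatively large.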
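The low compression ratio can be checked from the page-size statistics in the synthetic-data table. The following is a minimal sketch that uses the reported mean compressed and uncompressed page sizes; the names `compressed_page_size` and `compression_ratio` are illustrative and not part of the benchmark scripts.

```python
# Mean page sizes in bytes, copied from the synthetic-data table.
UNCOMPRESSED_PAGE_SIZE = 81874.0
compressed_page_size = {
    "GZIP": 80375.41867,
    "LZ4": 80764.81444,
    "SNAPPY": 80460.47789,
    "UNCOMPRESSED": 81874.0,
}


def compression_ratio(compressed: float, uncompressed: float) -> float:
    """Return the compressed size as a fraction of the uncompressed size."""
    return compressed / uncompressed


# Hypothetical helper structure: one ratio per compression method.
ratios = {
    name: compression_ratio(size, UNCOMPRESSED_PAGE_SIZE)
    for name, size in compressed_page_size.items()
}

for name, ratio in ratios.items():
    print(f"{name}: compressed/uncompressed = {ratio:.4f}")
```

Every method leaves the pages at more than 98% of their uncompressed size, which is consistent with the on-disk chunk load (B) dominating the read time in this experiment.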

CRRC data experiment results

RLCompressionRealExpScripts.sh

The ZT11529 sensor data is shown in the figure below; it contains 12,780,287 points.

[Figure: ZT11529 sensor data]

| Compression method | GZIP | LZ4 | SNAPPY | UNCOMPRESSED |
|---|---|---|---|---|
| dataset | /disk/rl/zc_data/ZT11529.csv | /disk/rl/zc_data/ZT11529.csv | /disk/rl/zc_data/ZT11529.csv | /disk/rl/zc_data/ZT11529.csv |
| pagePointNum(ppn) | 10000 | 10000 | 10000 | 10000 |
| numOfPagesInChunk(pic) | 100 | 100 | 100 | 100 |
| chunksWritten(cw) | 13 | 13 | 13 | 13 |
| timeEncoding(te) | TS_2DIFF | TS_2DIFF | TS_2DIFF | TS_2DIFF |
| valueDataType(vt) | DOUBLE | DOUBLE | DOUBLE | DOUBLE |
| valueEncoding(ve) | GORILLA | GORILLA | GORILLA | GORILLA |
| compression(co) | GZIP | LZ4 | SNAPPY | UNCOMPRESSED |
| totalPointNum | 12780287 | 12780287 | 12780287 | 12780287 |
| tsfileSize(MB) | 19.34862614 | 23.77741051 | 23.15641212 | 36.30773735 |
| chunkDataSize_stats_mean(MB) | 1.515139659 | 1.86007007 | 1.813184341 | 2.837062438 |
| compressedPageSize_stats_mean(B) | 15824.16667 | 19440.275 | 18948.69 | 29684.7575 |
| uncompressedPageSize_stats_mean(B) | 29684.7575 | 29684.7575 | 29684.7575 | 29684.7575 |
| timeBufferSize_stats_mean(B) | 11461.4625 | 11461.4625 | 11461.4625 | 11461.4625 |
| valueBufferSize_stats_mean(B) | 18221.26833 | 18221.26833 | 18221.26833 | 18221.26833 |
| [2] category: (A)get ChunkStatistic->(B)load on-disk Chunk->(C)get PageStatistics->(D)load in-memory PageData | | | | |
| [Avg&Per] (A)get_chunkMetadatas | 87316.85010000001 us - 8.19883087518919% | 100289.6416 us - 12.59103802007131% | 89466.6045 us - 11.199530364576052% | 88760.777 us - 10.335699109318087% |
| [Avg&Per] (B)load_on_disk_chunk | 160699.6025 us - 15.089285299443361% | 176784.88239999997 us - 22.19476647997349% | 105802.9384 us - 13.244531050378352% | 191518.85280000002 us - 22.301305860612082% |
| [Avg&Per] (C)get_pageHeader | 2436.5129999999995 us - 0.22878239411203669% | 2198.3668000000007 us - 0.27599779517870476% | 2319.9517000000005 us - 0.29041416798711567% | 3134.228800000001 us - 0.3649635223062446% |
| [Avg&Per] (D_1)decompress_pageData | 356587.5454999999 us - 33.482666568996265% | 31158.160799999983 us - 3.911805656191469% | 115629.7179 us - 14.474658381255693% | 29640.983400000016 us - 3.4515277590088287% |
| [Avg&Per] (D_2)decode_pageData | 457950.96670000016 us - 43.000434862259155% | 486085.0215000002 us - 61.02639204858503% | 485623.25309999986 us - 60.790866035802786% | 545723.8052999998 us - 63.54650374875476% |
| [3] D_1 compare each step inside | | | | |
| [Avg&Per] (D-1)7_1_data_ByteBuffer_to_ByteArray(us) | 1269.7615999999996 us - 0.36377892330196115% | 1808.5776999999991 us - 7.207505165727991% | 1687.9916000000003 us - 1.412377327892232% | 3197.6313 us - 41.86667970732365% |
| [Avg&Per] (D-1)7_2_data_decompress_PageDataByteArray(us) | 345856.23130000004 us - 99.08569249502276% | 21247.39440000001 us - 84.67466169480036% | 116100.69619999993 us - 97.14384305311928% | 2432.4552000000012 us - 31.84820675254647% |
| [Avg&Per] (D-1)7_3_data_ByteArray_to_ByteBuffer(us) | 374.18470000000025 us - 0.10720162531460037% | 442.10720000000003 us - 1.761876157051776% | 424.26030000000026 us - 0.35498732863644405% | 421.1930000000002 us - 5.51469221169019% |
| [Avg&Per] (D-1)7_4_data_split_time_value_Buffer(us) | 1547.4220999999993 us - 0.443326956360674% | 1594.8989000000004 us - 6.355956982419887% | 1301.2614999999998 us - 1.0887922903520593% | 1586.372500000001 us - 20.770421328439692% |
| [3] D_2 compare each step inside | | | | |
| [Avg&Per] (D-2)8_1_createBatchData(us) | 3430.8672 us - 0.22855030377902266% | 4174.8958 us - 0.275155541260775% | 3438.1465 us - 0.22874736939899784% | 3730.6205 us - 0.24721094086707432% |
| [Avg&Per] (D-2)8_2_timeDecoder_hasNext(us) | 234016.77980000002 us - 15.589238228946503% | 236135.9462 us - 15.56305048087338% | 235278.4135 us - 15.65358490817499% | 234469.6457 us - 15.537217392727715% |
| [Avg&Per] (D-2)8_3_timeDecoder_readLong(us) | 357893.9434 us - 23.841426880277478% | 360253.1425 us - 23.743262865502565% | 358341.1524 us - 23.84121675993312% | 363063.0738 us - 24.058508247673558% |
| [Avg&Per] (D-2)8_4_valueDecoder_read(us) | 353821.6809 us - 23.570149451806063% | 359477.8899 us - 23.692168165422427% | 356440.4841 us - 23.714761161335133% | 355773.1939 us - 23.5754416723178% |
| [Avg&Per] (D-2)8_5_checkValueSatisfyOrNot(us) | 223758.3096 us - 14.905861011513531% | 224938.2721 us - 14.825043539994219% | 224282.1638 us - 14.921980483485841% | 226053.6587 us - 14.979528915812129% |
| [Avg&Per] (D-2)8_6_putIntoBatchData(us) | 328221.5562 us - 21.864774123677403% | 332305.597 us - 21.90131940694663% | 325251.7878 us - 21.639709317671908% | 325993.7043 us - 21.602092830601723% |


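  • Compared with the other compression methods, GZIP achieves the highest compression ratio, and its D-1 decompression step also takes a larger share of the total time.
  • The real dataset compresses well, so less data is stored on disk. D-class operations take longer than B-class operations, i.e. the overall bottleneck is in the D-class operations.
  • With the highly compressible real dataset, the main bottleneck inside step D-1 is the sub-step 7_2_data_decompress_PageDataByteArray(us).
    • In the synthetic-data experiment, another sub-step, 7_1_data_ByteBuffer_to_ByteArray(us), also had a high share. The main reason is that the synthetic data barely compresses, so 7_2_data_decompress_PageDataByteArray(us) takes little time, which makes the relative share of 7_1_data_ByteBuffer_to_ByteArray(us) higher.
  • In this experiment, no sub-step stands out as a bottleneck within the D-2 operations.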
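The comparisons in the list above can be reproduced from the averages in the real-data table. The sketch below assumes that each percentage is the step's time divided by the sum of the listed times in the same group (categories A to D_2, or sub-steps 7_1 to 7_4); this assumption reproduces the reported percentages, but it is inferred from the numbers rather than taken from the benchmark scripts. The names `CATEGORY_US`, `D1_SUBSTEPS_US` and `percent_shares` are illustrative.

```python
# Average time per category in microseconds, copied from the real-data table.
CATEGORY_US = {
    "GZIP": {"A": 87316.85010000001, "B": 160699.6025, "C": 2436.5129999999995,
             "D_1": 356587.5454999999, "D_2": 457950.96670000016},
    "LZ4": {"A": 100289.6416, "B": 176784.88239999997, "C": 2198.3668000000007,
            "D_1": 31158.160799999983, "D_2": 486085.0215000002},
    "SNAPPY": {"A": 89466.6045, "B": 105802.9384, "C": 2319.9517000000005,
               "D_1": 115629.7179, "D_2": 485623.25309999986},
    "UNCOMPRESSED": {"A": 88760.777, "B": 191518.85280000002,
                     "C": 3134.228800000001, "D_1": 29640.983400000016,
                     "D_2": 545723.8052999998},
}

# D-1 sub-step times (7_1 to 7_4) for two of the methods, from the same table.
D1_SUBSTEPS_US = {
    "GZIP": {"7_1": 1269.7615999999996, "7_2": 345856.23130000004,
             "7_3": 374.18470000000025, "7_4": 1547.4220999999993},
    "UNCOMPRESSED": {"7_1": 3197.6313, "7_2": 2432.4552000000012,
                     "7_3": 421.1930000000002, "7_4": 1586.372500000001},
}


def percent_shares(times_us):
    """Return each entry's share of the summed time, in percent."""
    total = sum(times_us.values())
    return {name: 100.0 * t / total for name, t in times_us.items()}


for method, times in CATEGORY_US.items():
    s = percent_shares(times)
    print(f"{method}: B={s['B']:.2f}%  D={s['D_1'] + s['D_2']:.2f}%")

for method, times in D1_SUBSTEPS_US.items():
    s = percent_shares(times)
    print(f"{method} D-1: 7_1={s['7_1']:.2f}%  7_2={s['7_2']:.2f}%")
```

Because the D-1 percentages are relative to the D-1 total only, a sub-step such as 7_1 can show a large share simply because decompression (7_2) is cheap, even though its absolute time is small.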


Follow-up: rerun the experiment after increasing the size of the real dataset.


Changing the encoding method

Synthetic data experiment results

RLValueEncodingSynExpScripts.sh
| Compression method | GZIP | LZ4 | SNAPPY | UNCOMPRESSED |
|---|---|---|---|---|
| dataset | | | | |
| pagePointNum(ppn) | | | | |
| numOfPagesInChunk(pic) | | | | |
| chunksWritten(cw) | | | | |
| timeEncoding(te) | | | | |
| valueDataType(vt) | | | | |
| valueEncoding(ve) | | | | |
| compression(co) | | | | |
| totalPointNum | | | | |
| tsfileSize(MB) | | | | |
| chunkDataSize_stats_mean(MB) | | | | |
| compressedPageSize_stats_mean(B) | | | | |
| uncompressedPageSize_stats_mean(B) | | | | |
| timeBufferSize_stats_mean(B) | | | | |
| valueBufferSize_stats_mean(B) | | | | |
| [2] category: (A)get ChunkStatistic->(B)load on-disk Chunk->(C)get PageStatistics->(D)load in-memory PageData | | | | |
| [Avg&Per] (A)get_chunkMetadatas | | | | |
| [Avg&Per] (B)load_on_disk_chunk | | | | |
| [Avg&Per] (C)get_pageHeader | | | | |
| [Avg&Per] (D_1)decompress_pageData | | | | |
| [Avg&Per] (D_2)decode_pageData | | | | |
| [3] D_1 compare each step inside | | | | |
| [Avg&Per] (D-1)7_1_data_ByteBuffer_to_ByteArray(us) | | | | |
| [Avg&Per] (D-1)7_2_data_decompress_PageDataByteArray(us) | | | | |
| [Avg&Per] (D-1)7_3_data_ByteArray_to_ByteBuffer(us) | | | | |
| [Avg&Per] (D-1)7_4_data_split_time_value_Buffer(us) | | | | |
| [3] D_2 compare each step inside | | | | |
| [Avg&Per] (D-2)8_1_createBatchData(us) | | | | |
| [Avg&Per] (D-2)8_2_timeDecoder_hasNext(us) | | | | |
| [Avg&Per] (D-2)8_3_timeDecoder_readLong(us) | | | | |
| [Avg&Per] (D-2)8_4_valueDecoder_read(us) | | | | |
| [Avg&Per] (D-2)8_5_checkValueSatisfyOrNot(us) | | | | |
| [Avg&Per] (D-2)8_6_putIntoBatchData(us) | | | | |


CRRC data experiment results

RLValueEncodingRealExpScripts.sh
| Compression method | GZIP | LZ4 | SNAPPY | UNCOMPRESSED |
|---|---|---|---|---|
| dataset | | | | |
| pagePointNum(ppn) | | | | |
| numOfPagesInChunk(pic) | | | | |
| chunksWritten(cw) | | | | |
| timeEncoding(te) | | | | |
| valueDataType(vt) | | | | |
| valueEncoding(ve) | | | | |
| compression(co) | | | | |
| totalPointNum | | | | |
| tsfileSize(MB) | | | | |
| chunkDataSize_stats_mean(MB) | | | | |
| compressedPageSize_stats_mean(B) | | | | |
| uncompressedPageSize_stats_mean(B) | | | | |
| timeBufferSize_stats_mean(B) | | | | |
| valueBufferSize_stats_mean(B) | | | | |
| [2] category: (A)get ChunkStatistic->(B)load on-disk Chunk->(C)get PageStatistics->(D)load in-memory PageData | | | | |
| [Avg&Per] (A)get_chunkMetadatas | | | | |
| [Avg&Per] (B)load_on_disk_chunk | | | | |
| [Avg&Per] (C)get_pageHeader | | | | |
| [Avg&Per] (D_1)decompress_pageData | | | | |
| [Avg&Per] (D_2)decode_pageData | | | | |
| [3] D_1 compare each step inside | | | | |
| [Avg&Per] (D-1)7_1_data_ByteBuffer_to_ByteArray(us) | | | | |
| [Avg&Per] (D-1)7_2_data_decompress_PageDataByteArray(us) | | | | |
| [Avg&Per] (D-1)7_3_data_ByteArray_to_ByteBuffer(us) | | | | |
| [Avg&Per] (D-1)7_4_data_split_time_value_Buffer(us) | | | | |
| [3] D_2 compare each step inside | | | | |
| [Avg&Per] (D-2)8_1_createBatchData(us) | | | | |
| [Avg&Per] (D-2)8_2_timeDecoder_hasNext(us) | | | | |
| [Avg&Per] (D-2)8_3_timeDecoder_readLong(us) | | | | |
| [Avg&Per] (D-2)8_4_valueDecoder_read(us) | | | | |
| [Avg&Per] (D-2)8_5_checkValueSatisfyOrNot(us) | | | | |
| [Avg&Per] (D-2)8_6_putIntoBatchData(us) | | | | |