实验目标
- 测试不同数据结构的内存占用情况
- 测试不同读写比下不同数据结构的读写性能
实验环境
- OS: mac OS
- memory:8G
- CPU:双核 2.7 GHz Intel Core i5
- java version: 1.8.074
- 数据乱序比例:20%
实验数据结构
(1)数组实现(Array)
实现数据结构:int[][]
写入:追加写入数组的最后位置
查询:先拷贝一份数组,做排序后做查询
(2)跳表实现(SkipList)
实现数据结构:ConcurrentSkipList<Integer>
写入:通过跳表插入到正确的有序位置(写入排序)
查询:直接查询跳表的引用
实验设置
(1)不同数据结构的内存占用
(2)不同数据结构写入及查询延迟
(3)不同数据结构读写混合负载下写入及查询吞吐量
不同数据结构内存占用
负载 / 数据结构 | Array | SkipList |
---|---|---|
单序列一百万个点 | 4 MB | 52 MB |
十万序列每序列十个点 | 5.6 MB | 59.6 MB |
不同数据结构写入及查询延迟
写入延迟
负载 / 数据结构 | Array | SkipList |
---|---|---|
单序列一百万个点 | 360 ms | 8921 ms |
一千序列每序列一万个点 | 428 ms | 9138 ms |
十万序列每序列一百个点 | 429 ms | 10129 ms |
查询延迟
负载 / 数据结构 | Array | SkipList |
---|---|---|
单序列一百万个点 | 449ms | 76ms |
一千序列每序列一万个点 | 393ms | 92ms |
十万序列每序列一百个点 | 172ms | 119ms |
不同数据结构读写混合负载下写入及查询吞吐量
测试方法
单线程写入,同时有 n 个线程查询,测试在有查询的负载下,写入线程最终完成写入的总时间
查询线程是一个无限循环,每次随机选择一个时间序列查询全部数据
写入吞吐
负载 / 查询线程数 | n = 1 (points / ms) | n = 3 (points / ms) | n = 5 (points / ms) | |||
---|---|---|---|---|---|---|
数据结构 | Array | SkipList | Array | SkipList | Array | SkipList |
单序列一千万个点 | 18.93 K | 0.81 K | 14.25 K | 0.70 K | 8.84 K | 0.61 K |
一千序列每序列一万个点 | 16.58 K | 0.36 K | 9.91 K | 0.25 K | 8.19 K | 0.19 K |
十万序列每序列一百个点 | 9.47 K | 0.97 K | 6.50 K | 0.86 K | 4.85 K | 0.72 K |
查询吞吐
负载 / 查询线程数 | n = 1 (points / ms) | n = 3 (points / ms) | n = 5 (points / ms) | |||
---|---|---|---|---|---|---|
数据结构 | Array | SkipList | Array | SkipList | Array | SkipList |
单序列一千万个点 | 28.40 K | 40.76 K | 40.01 K | 114.63 K | 42.44 K | 162.16 K |
一千序列每序列一万个点 | 27.31 K | 42.31 K | 66.73 K | 96.48 K | 89.31 K | 112.91 K |
十万序列每序列一百个点 | 32.20 K | 8.40 K | 60.43 K | 28.92 K | 82.36 K | 35.22 K |
Array/Skiplist查询延迟比随序列中的点数变化趋势
Array/Skiplist查询吞吐比随序列中的点数变化趋势
结论
(1)内存占用:skiplist 的内存占用为 array 的10倍左右
(2)写延迟:array 的写性能大约为 skiplist 的20倍
(3)读延迟:由于内存拷贝及排序,内存中的点数越多,array查询的性能越差,在1000万点时,skiplist 的查询性能约为array的5倍,100点时,skiplist 的查询性能约为array的1.5倍
(4)读写混合负载查询吞吐:skiplist的随着每序列的点数减少,skiplist的查询性能越来越低,最终会低于Array的查询性能
关键实验代码
单线程:
package data_structure; import java.util.Random; public class Main { private static int timeseriesNum = 1; private static int size = 10000000; static Random random = new Random(); public static void main(String[] args) throws InterruptedException { multiWriteAndRead(); // for memory calculate Thread.sleep(100000000); } private static void multiWriteAndRead() throws InterruptedException { MemTable[] arrayTable = new MemTable[timeseriesNum]; MemTable[] skipListTable = new MemTable[timeseriesNum]; // skip list long curTime = System.nanoTime(); for (int i = 0; i < timeseriesNum; i++) { skipListTable[i] = new SkipListTable(size); write(skipListTable[i]); } long finishTime = System.nanoTime(); System.out.println("SkipList write time: " + (finishTime - curTime) / 1_000_000 + " ms."); curTime = System.nanoTime(); for (int i = 0; i < timeseriesNum; i++) { query(skipListTable[i]); } finishTime = System.nanoTime(); System.out.println("SkipList query time: " + (finishTime - curTime) / 1_000_000 + " ms."); // array curTime = System.nanoTime(); for (int i = 0; i < timeseriesNum; i++) { arrayTable[i] = new ArrayTable(size); write(arrayTable[i]); } finishTime = System.nanoTime(); System.out.println("Array write time: " + (finishTime - curTime) / 1_000_000 + " ms."); curTime = System.nanoTime(); for (int i = 0; i < timeseriesNum; i++) { query(arrayTable[i]); } finishTime = System.nanoTime(); System.out.println("Array query time: " + (finishTime - curTime) / 1_000_000 + " ms."); System.out.println("finished"); } private static int query(MemTable table){ return table.query(); } public static void write(MemTable table){ for (int i = 0; i < size; i++) { if(random.nextDouble() < 0.2){ table.add(random.nextInt(100000) - 50000); } else{ table.add(i); } } } }
多线程:
package data_structure; import java.util.Random; import java.util.concurrent.atomic.AtomicLong; public class MultiMain { private static int timeseriesNum = 1000_00; private static int size = 100; static Random random = new Random(); static MemTable[] memtable = new MemTable[timeseriesNum]; private static int queryThreadNum = 1; static AtomicLong count = new AtomicLong(); public static void main(String[] args) { for (int i = 0; i < timeseriesNum; i++) { memtable[i] = new SkipListTable(size); } Thread[] queryThread = new Thread[queryThreadNum]; for (int i = 0; i < queryThreadNum; i++) { queryThread[i] = new Thread(MultiMain::queryThread); queryThread[i].start(); } writeThread(); } public static void writeThread(){ long curTime = System.nanoTime(); for (int i = 0; i < size; i++) { for (int j = 0; j < timeseriesNum; j++) { int cur = random.nextDouble() < 0.05 ? random.nextInt(100000) - 50000 : i; memtable[j].add(cur); } } long finishTime = System.nanoTime(); long cur = count.get(); System.out.println("Finish time: " + (finishTime - curTime) / 1_000_000 + " ms."); System.out.println("Read points: " + cur / 1000 / 1000); } private static void queryThread(){ while(true){ int loc = random.nextInt(timeseriesNum); count.addAndGet(memtable[loc].query()); } } }
Array实现:
package data_structure; import java.util.Arrays; public class ArrayTable implements MemTable { int[] res; int index; public ArrayTable(int size){ res = new int[size]; } @Override public void add(int i) { res[index] = i; index++; } @Override public int query() { int len = index; int[] query = new int[len]; System.arraycopy(res, 0, query,0, len); Arrays.sort(query); int sum = 0; for (int i : query) { sum += 1; } return sum; } }
SkipList实现:
package data_structure; import java.util.concurrent.ConcurrentSkipListSet; public class SkipListTable implements MemTable { ConcurrentSkipListSet<Integer> set; public SkipListTable(int size){ set = new ConcurrentSkipListSet<>(); } @Override public void add(int i) { set.add(i); } @Override public int query() { int sum = 0; for (int i : set) { sum += 1; } return sum; } }