...
实验目标
...
- 测试不同数据结构的内存占用情况
...
- 测试不同读写比下不同数据结构的读写性能
...
实验环境
...
- OS: mac OS
...
- memory:8G
...
- CPU:双核 2.7 GHz Intel Core i5
...
- java version: 1.8.074
...
# 内存占用
## 单序列一百万个点
Array: 4MB
SkipList:52MB (Integer:16MB, SkipListNode:36MB)
## 十万序列每序列十个点
SkipList:59.6 MB (SkipList: 58MB, ref: 1.6MB)
Array: 5.6MB (data: 4MB, ref: 1.6MB)
# 单线程写入+查询
## 单序列一千万个点
### 写入
Array:360ms
SkipList:8921ms
### 查询
Array:449ms
SkipList:76ms
## 十万序列每序列一百个点
### 写入
Array:429ms
SkipList:10129ms
### 查询
Array:449ms
SkipList:76ms
...
- 数据乱序比例:20%
实验数据结构
(1)数组实现(Array)
实现数据结构:int[]
写入:追加写入数组的最后位置
查询:先拷贝一份数组,做排序后做查询
(2)跳表实现(SkipList)
实现数据结构:ConcurrentSkipList<Integer>
写入:通过跳表插入到正确的有序位置(写入排序)
查询:直接查询跳表的引用
实验设置
(1)不同数据结构的内存占用
(2)不同数据结构写入及查询延迟
(3)不同数据结构读写混合负载下写入及查询吞吐量
不同数据结构内存占用
负载 / 数据结构 | Array | SkipList |
---|---|---|
单序列一百万个点 | 4 MB | 52 MB |
十万序列每序列十个点 | 5.6 MB | 59.6 MB |
不同数据结构写入及查询延迟
写入延迟
负载 / 数据结构 | Array | SkipList |
---|---|---|
单序列一百万个点 | 360 ms | 8921 ms |
一千序列每序列一万个点 | 428 ms | 9138 ms |
十万序列每序列一百个点 | 429 ms | 10129 ms |
查询延迟
负载 / 数据结构 | Array | SkipList |
---|---|---|
单序列一百万个点 | 449ms | 76ms |
一千序列每序列一万个点 | 393ms | 92ms |
十万序列每序列一百个点 | 172ms | 119ms |
不同数据结构读写混合负载下写入及查询吞吐量
测试方法
单线程写入,同时有 n 个线程查询,测试在有查询的负载下,写入线程最终完成写入的总时间
查询线程是一个无限循环,每次随机选择一个时间序列查询全部数据
写入吞吐
负载 / 查询线程数 | n = 1 (points / ms) | n = 3 (points / ms) | n = 5 (points / ms) | |||
---|---|---|---|---|---|---|
数据结构 | Array | SkipList | Array | SkipList | Array | SkipList |
单序列一千万个点 | 18.93 K | 0.81 K | 14.25 K | 0.70 K | 8.84 K | 0.61 K |
一千序列每序列一万个点 | 16.58 K | 0.36 K | 9.91 K | 0.25 K | 8.19 K | 0.19 K |
十万序列每序列一百个点 | 9.47 K | 0.97 K | 6.50 K | 0.86 K | 4.85 K | 0.72 K |
查询吞吐
负载 / 查询线程数 | n = 1 (points / ms) | n = 3 (points / ms) | n = 5 (points / ms) | |||
---|---|---|---|---|---|---|
数据结构 | Array | SkipList | Array | SkipList | Array | SkipList |
单序列一千万个点 | 28.40 K | 40.76 K | 40.01 K | 114.63 K | 42.44 K | 162.16 K |
一千序列每序列一万个点 | 27.31 K | 42.31 K | 66.73 K | 96.48 K | 89.31 K | 112.91 K |
十万序列每序列一百个点 | 32.20 K | 8.40 K | 60.43 K | 28.92 K | 82.36 K | 35.22 K |
实验对比图:
对实验数据中具有趋势性的数据表示为图表,如下四张图分别表示:
(1)内存占用:十万序列每序列十个点,跳表的内存占用为数组的10倍左右
(2)数组写入查询延迟 / 跳表写入查询延迟:跳表的写延迟大约为数组的20倍,由于内存拷贝及排序,内存中的点数越多,数组查询的性能越差,在1000万点时,数组的查询延迟约为跳表的5倍,100点时,数组的查询延迟约为跳表的1.5倍
(3)数组写入吞吐 / 跳表写入吞吐:随着每序列的点数增加,数组比跳表写入吞吐越来越高,从9倍升高到40倍
(4)数组查询吞吐 / 跳表查询吞吐:随着每序列的点数增加,数组的查询吞吐一开始比跳表高,超过5000点后低于跳表
结论
(1)内存占用:跳表的内存占用为 数组的10倍左右
(2)写延迟:跳表的写延迟大约为 数组的20倍
(3)读延迟:由于内存拷贝及排序,内存中的点数越多,数组查询的性能越差,在1000万点时,数组的查询延迟约为跳表的5倍,100点时,数组的查询延迟约为跳表的1.5倍
(4)读写混合负载写入吞吐:随着每序列的点数增加,数组比跳表写入吞吐越来越高,从9倍升高到40倍
(5)读写混合负载查询吞吐:随着每序列的点数增加,数组的查询吞吐一开始比跳表高,超过5000点后低于跳表
关键实验代码
单线程:
Code Block |
---|
package data_structure;
import java.util.Random;
public class Main {
private static int timeseriesNum = 1;
private static int size = 10000000;
static Random random = new Random();
public static void main(String[] args) throws InterruptedException {
multiWriteAndRead();
// for memory calculate
Thread.sleep(100000000);
}
private static void multiWriteAndRead() throws InterruptedException {
MemTable[] arrayTable = new MemTable[timeseriesNum];
MemTable[] skipListTable = new MemTable[timeseriesNum];
// skip list
long curTime = System.nanoTime();
for (int i = 0; i < timeseriesNum; i++) {
skipListTable[i] = new SkipListTable(size);
write(skipListTable[i]);
}
long finishTime = System.nanoTime();
System.out.println("SkipList write time: " + (finishTime - curTime) / 1_000_000 + " ms.");
curTime = System.nanoTime();
for (int i = 0; i < timeseriesNum; i++) {
query(skipListTable[i]);
}
finishTime = System.nanoTime();
System.out.println("SkipList query time: " + (finishTime - curTime) / 1_000_000 + " ms.");
// array
curTime = System.nanoTime();
for (int i = 0; i < timeseriesNum; i++) {
arrayTable[i] = new ArrayTable(size);
write(arrayTable[i]);
}
finishTime = System.nanoTime();
System.out.println("Array write time: " + (finishTime - curTime) / 1_000_000 + " ms.");
curTime = System.nanoTime();
for (int i = 0; i < timeseriesNum; i++) {
query(arrayTable[i]);
}
finishTime = System.nanoTime();
System.out.println("Array query time: " + (finishTime - curTime) / 1_000_000 + " ms.");
System.out.println("finished");
}
private static int query(MemTable table){
return table.query();
}
public static void write(MemTable table){
for (int i = 0; i < size; i++) {
if(random.nextDouble() < 0.2){
table.add(random.nextInt(100000) - 50000);
}
else{
table.add(i);
}
}
}
}
|
多线程:
Code Block | ||||
---|---|---|---|---|
| ||||
package data_structure;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
public class MultiMain {
private static int timeseriesNum = 1000_00;
private static int size = 100;
static Random random = new Random();
static MemTable[] memtable = new MemTable[timeseriesNum];
private static int queryThreadNum = 1;
static AtomicLong count = new AtomicLong();
public static void main(String[] args) {
for (int i = 0; i < timeseriesNum; i++) {
memtable[i] = new SkipListTable(size);
}
Thread[] queryThread = new Thread[queryThreadNum];
for (int i = 0; i < queryThreadNum; i++) {
queryThread[i] = new Thread(MultiMain::queryThread);
queryThread[i].start();
}
writeThread();
}
public static void writeThread(){
long curTime = System.nanoTime();
for (int i = 0; i < size; i++) {
for (int j = 0; j < timeseriesNum; j++) {
int cur = random.nextDouble() < 0.05 ? random.nextInt(100000) - 50000 : i;
memtable[j].add(cur);
}
}
long finishTime = System.nanoTime();
long cur = count.get();
System.out.println("Finish time: " + (finishTime - curTime) / 1_000_000 + " ms.");
System.out.println("Read points: " + cur / 1000 / 1000);
}
private static void queryThread(){
while(true){
int loc = random.nextInt(timeseriesNum);
count.addAndGet(memtable[loc].query());
}
}
}
|
Array实现:
Code Block | ||
---|---|---|
| ||
package data_structure;
import java.util.Arrays;
public class ArrayTable implements MemTable {
int[] res;
int index;
public ArrayTable(int size){
res = new int[size];
}
@Override
public void add(int i) {
res[index] = i;
index++;
}
@Override
public int query() {
int len = index;
int[] query = new int[len];
System.arraycopy(res, 0, query,0, len);
Arrays.sort(query);
int sum = 0;
for (int i : query) {
sum += 1;
}
return sum;
}
}
|
SkipList实现:
Code Block | ||
---|---|---|
| ||
package data_structure;
import java.util.concurrent.ConcurrentSkipListSet;
public class SkipListTable implements MemTable {
ConcurrentSkipListSet<Integer> set;
public SkipListTable(int size){
set = new ConcurrentSkipListSet<>();
}
@Override
public void add(int i) {
set.add(i);
}
@Override
public int query() {
int sum = 0;
for (int i : set) {
sum += 1;
}
return sum;
}
}
|
## 单序列一千万个点
### n = 1
Array:528ms
SkipList: 12413ms
### n = 3
Array:702ms
SkipList: 14202ms
### n = 5
Array:1131ms
SkipList: 16304ms
## 十万序列每序列一百个点
### n = 1
Array:1056ms
SkipList:10243ms
### n = 3
Array:1539ms
SkipList:11620ms
### n = 5
Array:2064ms
...