Apache Kylin : Analytical Data Warehouse for Big Data
Welcome to Kylin Wiki.
1. Background
Kylin monitors itself through the System Cube, which records query- and job-related metrics and is an effective aid for operating the system and diagnosing problems. Once the System Cube is enabled, a project named KYLIN_SYSTEM appears in Kylin's web UI. It contains 5 Cubes, each recording system monitoring data from a different perspective. The System Cube serves the Dashboard and phase two of the Cube Planner, and users can also run their own analyses against it to better operate and monitor Kylin.
2. The Hive Tables for System Cube
Every query or build operation a user performs in Kylin is recorded in Hive. There are 5 Hive tables in total, corresponding to the fact tables of the 5 System Cubes:
Hive Table Name | Description | System Cube Name | Notes |
---|---|---|---|
hive_metrics_query_qa | Collects query-related information | hive_metrics_query_qa | |
hive_metrics_query_cube_qa | Collects query-related information | hive_metrics_query_cube_qa | Related to Cube Planner |
hive_metrics_query_rpc_qa | Collects query-related information | hive_metrics_query_rpc_qa | |
hive_metrics_job_qa | Collects job-related information | hive_metrics_job_qa | |
hive_metrics_job_exception_qa | Collects job-related information | hive_metrics_job_exception_qa | |
The 5 configuration items related to these Hive tables are listed below:
- kylin.metrics.prefix: the system automatically creates the 5 tables above in a database named 'kylin' by default. You can customize the database by setting kylin.metrics.prefix=<name>; <name> is also the prefix of the System Cube names;
- kylin.metric.subject-suffix: customizes the suffix of the Hive table names; the default is 'qa', giving table names such as 'hive_metrics_query_qa';
- kylin.metrics.monitor-enabled: the master switch for recording metrics into Hive; controls all 5 tables above. The default is false, meaning do not record;
- kylin.metrics.reporter-query-enabled: whether to record metrics into the 3 query-related tables above; the default is false, meaning do not record;
- kylin.metrics.reporter-job-enabled: whether to record metrics into the 2 job-related tables above; the default is false, meaning do not record;
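Taken together, a minimal kylin.properties fragment for enabling metrics collection might look like the sketch below (using the default prefix; adjust the values to your deployment):

```properties
# Master switch: record query/job metrics into Hive (default: false)
kylin.metrics.monitor-enabled=true
# Record the 3 query-related tables (default: false)
kylin.metrics.reporter-query-enabled=true
# Record the 2 job-related tables (default: false)
kylin.metrics.reporter-job-enabled=true
# Optional: database for the 5 tables, also the System Cube name prefix (default: kylin)
kylin.metrics.prefix=kylin
```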
2.1 How to record query metrics into Hive
A Cube can have multiple Segments, and the data of each Segment may be stored on different RPC servers. When a user submits a query, it may hit multiple Cubes, scan multiple Segments under each Cube, and scan data stored on multiple RPC servers under each Segment. So, for a single query:
- for each Cube hit, one row is recorded in hive_metrics_query_qa;
- when the query hits a Cube, one row is recorded in hive_metrics_query_cube_qa for each Segment scanned under that Cube;
- when the query scans a Segment, one row is recorded in hive_metrics_query_rpc_qa for each RPC server whose data is scanned. (Tip: expand the Cube details and check Region Count under the Storage tab to see how many RPC target servers a Segment's data is spread across.)
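This fan-out can be sanity-checked in Hive. A hedged sketch, assuming the default database 'kylin' and table suffix 'qa': for any given day, the three counts below should normally satisfy cube hits ≤ segment scans ≤ RPC calls.

```sql
USE kylin;
-- kday_date is the partition column of all three tables
SELECT COUNT(*) AS cube_hits     FROM hive_metrics_query_qa      WHERE kday_date = '2020/9/24';
SELECT COUNT(*) AS segment_scans FROM hive_metrics_query_cube_qa WHERE kday_date = '2020/9/24';
SELECT COUNT(*) AS rpc_calls     FROM hive_metrics_query_rpc_qa  WHERE kday_date = '2020/9/24';
```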
2.2 How to record job metrics into Hive
- For hive_metrics_job_qa, one row is generated per successful job;
- For hive_metrics_job_exception_qa, one row is generated per failed job.
2.3 Some tips about recording metrics
There is some delay before metrics are inserted into the Hive tables: the system inserts them in one batch either "after a certain amount of time" or "once a fixed number of records has accumulated". By default the time interval is 10 minutes and the batch size is 10 records. If you want to verify quickly, there are two options:
- Modify the configuration items in $KYLIN_HOME/tomcat/webapps/kylin/WEB-INF/classes/kylinMetrics.xml: the item with "index=1" is the batch size (how many accumulated records force an insert into Hive), and the item with "index=2" is the time interval (how long before an insert into Hive is forced, in minutes);
- Restart Kylin, which immediately flushes all records waiting to be written.
You can then go into Hive and query the corresponding table to confirm the data has been inserted, for example:
```sql
USE kylin;
SELECT * FROM hive_metrics_query_cube_qa;
```
2.4 The Relationship Between Hive Tables And System Cube
After the System Cube is enabled, a project named KYLIN_SYSTEM appears in Kylin. It contains 5 Cubes; each Cube has a single fact table (no lookup tables), and each fact table corresponds to one of the Hive tables above.
For each table we describe:
- Column: the name of Hive table column
- Type: the type of this column
- Description: the description of this column
- Sample: Sample data in Hive
- D/M: whether this column is set as a Dimension or a Measure in the System Cube
- Measure Function: if it is set as a Measure, the aggregation function of the measure
- Notes: additional remarks
2.4.1 hive_metrics_query_qa
Column | Type | Description | Sample | D/M | Measure Function | Notes |
---|---|---|---|---|---|---|
query_hash_code | bigint | query unique id | 7708685990456150000 | M | COUNT_DISTINCT | each SQL statement maps to one unique query_hash_code; running the same SQL again does not generate a new one |
host | string | the host of server for query engine | cdh-client:10.1.3.91 | D | ||
kuser | string | user name | ADMIN | D | ||
project | string | project name | LEARN_KYLIN | D | ||
realization | string | cube name | kylin_sales_cube_SIMPLE | D | ||
realization_type | int | the storage type | 2 | D | ||
query_type | string | CACHE, OLAP, LOOKUP_TABLE, HIVE (users can query on different data sources) | OLAP, CACHE | D | ||
exception | string | It's for classifying different exception types (when doing query, exceptions may happen) | NULL, java.lang.NumberFormatException | D | ||
query_time_cost | bigint | the time cost for the whole query | 1392 | M | MIN/SUM/MAX/PERCENTILE_APPROX | |
calcite_count_return | bigint | the row count of the result Calcite returns | 3 | M | SUM/MAX | the row count Calcite returns to Kylin, call it n1 |
storage_count_return | bigint | the row count of the input to Calcite | 3 | M | SUM/MAX | the row count the storage layer returns to Calcite, call it n2 |
calcite_count_aggregate_filter | bigint | the row count of Calcite aggregates and filters | 0 | M | SUM/MAX | the row count filtered or aggregated inside Calcite, i.e. n2 - n1 |
ktimestamp | bigint | query begin time (timestamp) | 1600938970920 | |||
kyear_begin_date | string | query begin time (year) | 2020/1/1 | D | ||
kmonth_begin_date | string | query begin time (month) | 2020/9/1 | D | ||
kweek_begin_date | string | query begin time (week, begin with Sunday) | 2020/9/20 | D | ||
kday_time | string | query begin time (time) | 17:16:10 | D | ||
ktime_hour | int | query begin time (hour) | 17 | D | ||
ktime_minute | int | query begin time (minute) | 16 | D | ||
ktime_second | int | query begin time (second) | 10 | |||
kday_date | string | query begin time (day) | 2020/9/24 | | | Hive table partition column |
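As an example of the analyses this table supports, the hedged sketch below (assuming the default database 'kylin' and suffix 'qa') computes per-project latency statistics for queries that completed without exceptions:

```sql
SELECT project,
       kday_date,
       COUNT(*)             AS cube_hits,
       MIN(query_time_cost) AS min_ms,
       MAX(query_time_cost) AS max_ms,
       percentile_approx(CAST(query_time_cost AS DOUBLE), 0.95) AS p95_ms
FROM kylin.hive_metrics_query_qa
WHERE exception = 'NULL'   -- the string "NULL" marks queries without exceptions
GROUP BY project, kday_date;
```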
2.4.2 hive_metrics_query_cube_qa
Column | Type | Description | Sample | D/M | Measure Function | Notes |
---|---|---|---|---|---|---|
host | string | the host of server for query engine | cdh-client:10.1.3.91 | |||
project | string | project name | PEARVIDEOAPP | |||
cube_name | string | cube name | UserActionPhaseOneCube | D | ||
segment_name | string | segment name | 20201011000000_20201012000000 | D | ||
cuboid_source | bigint | source cuboid parsed based on query and Cube design | 12582912 | D | | the cuboid that best matches the query pattern; it may not have been built yet |
cuboid_target | bigint | target cuboid already precalculated and served for source cuboid | 13041664 | D | | the cuboid actually used by the query; post-aggregation may be needed to answer the query |
if_match | boolean | whether source cuboid and target cuboid are equal | FALSE | D | ||
filter_mask | bigint | 4194304 | D | |||
if_success | boolean | whether a query on this Cube is successful or not | TRUE | D | ||
weight_per_hit | double | the reciprocal of the number of Cubes hit by a single query | 1 | M | SUM | |
storage_call_count | bigint | the number of rpc calls for a query hit on this Cube | 1 | M | SUM/MAX | |
storage_call_time_sum | bigint | sum of time cost for the rpc calls of a query | 268 | M | SUM/MAX | |
storage_call_time_max | bigint | max of time cost among the rpc calls of a query | 268 | M | SUM/MAX | |
storage_count_skip | bigint | the sum of row count skipped for the related rpc calls | 0 | M | SUM/MAX | |
storage_count_scan | bigint | the sum of row count scanned for the related rpc calls | 929 | M | SUM/MAX | |
storage_count_return | bigint | the sum of row count returned for the related rpc calls | 45 | M | SUM/MAX | |
storage_count_aggregate_filter | bigint | the sum of row count aggregated and filtered for the related rpc calls, = STORAGE_COUNT_SCAN - STORAGE_COUNT_RETURN | 884 | M | SUM/MAX | |
storage_count_aggregate | bigint | the sum of row count aggregated for the related rpc calls | 36 | M | SUM/MAX | |
ktimestamp | bigint | query begin time (timestamp) | 1603462676906 | |||
kyear_begin_date | string | query begin time (year) | 2020/1/1 | D | ||
kmonth_begin_date | string | query begin time (month) | 2020/10/1 | D | ||
kweek_begin_date | string | query begin time (week, begin with Sunday) | 2020/10/18 | D | ||
kday_time | string | query begin time (time) | 22:17:56 | D | ||
ktime_hour | int | query begin time (hour) | 22 | D | ||
ktime_minute | int | query begin time (minute) | 17 | D | ||
ktime_second | int | query begin time (second) | 56 | |||
kday_date | string | query begin time (day) | 2020/10/23 | | | Hive table partition column |
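These measures are the raw material for Cube Planner analysis. The hedged sketch below (default database 'kylin' and suffix 'qa' assumed) estimates, per Cube, how often successful queries were answered by an exactly matching cuboid and how much post-aggregation the storage layer had to do; a low match ratio suggests room for cuboid optimization:

```sql
SELECT cube_name,
       AVG(CASE WHEN if_match THEN 1.0 ELSE 0.0 END) AS exact_match_ratio,
       SUM(storage_count_scan)             AS rows_scanned,
       SUM(storage_count_return)           AS rows_returned,
       SUM(storage_count_aggregate_filter) AS rows_aggregated_or_filtered
FROM kylin.hive_metrics_query_cube_qa
WHERE if_success = TRUE
GROUP BY cube_name;
```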
2.4.3 hive_metrics_query_rpc_qa
Column | Type | Description | Sample | D/M | Measure Function |
---|---|---|---|---|---|
host | string | the host of server for query engine | cdh-client:10.1.3.91 | D | |
project | string | project name | LEARN_KYLIN | D | |
realization | string | cube name | kylin_sales_cube_SIMPLE | D | |
rpc_server | string | the rpc related target server | cdh-worker-2 | D | |
exception | string | the exception of an RPC call; if there is no exception, "NULL" is used | NULL | D | |
call_time | bigint | the time cost of an RPC call | 60 | M | SUM/MAX/PERCENTILE_APPROX |
count_return | bigint | the row count actually return | 3 | M | SUM/MAX |
count_scan | bigint | the row count actually scanned | 3 | M | SUM/MAX |
count_skip | bigint | based on fuzzy filters or similar, a few rows may be skipped; this indicates the skipped row count | 0 | M | SUM/MAX |
count_aggregate_filter | bigint | the row count actually aggregated and filtered, = COUNT_SCAN - COUNT_RETURN | 0 | M | SUM/MAX |
count_aggregate | bigint | the row count actually aggregated | 0 | M | SUM/MAX |
ktimestamp | bigint | query begin time (timestamp) | 1600938970918 | ||
kyear_begin_date | string | query begin time (year) | 2020/1/1 | D | |
kmonth_begin_date | string | query begin time (month) | 2020/9/1 | D | |
kweek_begin_date | string | query begin time (week, begin with Sunday) | 2020/9/20 | D | |
kday_time | string | query begin time (time) | 17:16:10 | D | |
ktime_hour | int | query begin time (hour) | 17 | D | |
ktime_minute | int | query begin time (minute) | 16 | D | |
ktime_second | int | query begin time (second) | 10 | ||
kday_date | string | query begin time (day); Hive table partition column | 2020/9/24 | | |
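Because each row is one RPC call against a specific target server, this table makes it easy to spot slow or failing storage nodes. A hedged sketch, with the default database 'kylin' and suffix 'qa' assumed:

```sql
SELECT rpc_server,
       COUNT(*) AS rpc_calls,
       SUM(CASE WHEN exception <> 'NULL' THEN 1 ELSE 0 END) AS failed_calls,
       MAX(call_time) AS max_call_ms,
       percentile_approx(CAST(call_time AS DOUBLE), 0.99) AS p99_call_ms
FROM kylin.hive_metrics_query_rpc_qa
GROUP BY rpc_server
ORDER BY p99_call_ms DESC;
```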
2.4.4 hive_metrics_job_qa
Column | Type | Description | Sample | D/M | Measure Function |
---|---|---|---|---|---|
job_id | string | job id | 51b40173-1f6c-7e55-e0ca-fbc84d242ac0 | ||
host | string | the host of server for job engine | cdh-client:10.1.3.91 | ||
kuser | string | user name | ADMIN | D | |
project | string | project name | LEARN_KYLIN | D | |
cube_name | string | cube name | kylin_sales_cube_poi | D | |
job_type | string | job type: build, merge, optimize | BUILD | D | |
cubing_type | string | in Kylin there are two cubing algorithms: Layered & Fast (In-Memory) | NULL | D | |
duration | bigint | the duration from a job start to finish | 945001 | M | SUM/MAX/MIN/PERCENTILE_APPROX |
table_size | bigint | the size of data source in bytes | 227964845 | M | SUM/MAX/MIN |
cube_size | bigint | the size of created Cube segment in bytes | 35693596 | M | SUM/MAX/MIN |
per_bytes_time_cost | double | DURATION / TABLE_SIZE | 0.00414538 | M | SUM/MAX/MIN |
wait_resource_time | bigint | a job may include several MR (MapReduce) jobs; those MR jobs may wait due to a lack of Hadoop resources | 158146 | M | SUM/MAX/MIN |
step_duration_distinct_columns | bigint | 138586 | M | SUM/MAX | |
step_duration_dictionary | bigint | 5311 | M | SUM/MAX | |
step_duration_inmem_cubing | bigint | 89 | M | SUM/MAX | |
step_duration_hfile_convert | bigint | 75382 | M | SUM/MAX | |
ktimestamp | bigint | job begin time (timestamp) | 1600938458385 | | |
kyear_begin_date | string | job begin time (year) | 2020/1/1 | D | |
kmonth_begin_date | string | job begin time (month) | 2020/9/1 | D | |
kweek_begin_date | string | job begin time (week, begin with Sunday) | 2020/9/20 | D | |
kday_time | string | job begin time (time) | 17:07:38 | D | |
ktime_hour | int | job begin time (hour) | 17 | D | |
ktime_minute | int | job begin time (minute) | 7 | D | |
ktime_second | int | job begin time (second) | 38 | | |
kday_date | string | job begin time (day); Hive table partition column | 2020/9/24 | | |
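A hedged sketch of per-Cube build statistics derived from the columns above (default database 'kylin' and suffix 'qa' assumed; bytes_per_ms is only a rough throughput figure, since duration also includes non-scan steps):

```sql
SELECT cube_name,
       job_type,
       COUNT(*)                AS jobs,
       AVG(duration)           AS avg_duration_ms,
       AVG(wait_resource_time) AS avg_wait_ms,
       SUM(table_size) / SUM(duration) AS bytes_per_ms
FROM kylin.hive_metrics_job_qa
GROUP BY cube_name, job_type;
```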
2.4.5 hive_metrics_job_exception_qa
Column | Type | Description | Sample | D/M | Measure Function |
---|---|---|---|---|---|
job_id | string | job id | a333a36d-8e33-811f-9326-f04579cc2464 | ||
host | string | the host of server for job engine | cdh-client:10.1.3.91 | ||
kuser | string | user name | ADMIN | D | |
project | string | project name | LEARN_KYLIN | D | |
cube_name | string | cube name | kylin_sales_cube | D | |
job_type | string | job type: build, merge, optimize | BUILD | D | |
cubing_type | string | in Kylin there are two cubing algorithms: Layered & Fast (In-Memory) | LAYER | D | |
exception | string | exceptions may happen while a job is running; this classifies the different exception types | org.apache.kylin.job.exception.ExecuteException | D | |
ktimestamp | bigint | job begin time (timestamp) | 1600936844611 | | |
kyear_begin_date | string | job begin time (year) | 2020/1/1 | D | |
kmonth_begin_date | string | job begin time (month) | 2020/9/1 | D | |
kweek_begin_date | string | job begin time (week, begin with Sunday) | 2020/9/20 | D | |
kday_time | string | job begin time (time) | 16:40:44 | D | |
ktime_hour | int | job begin time (hour) | 16 | D | |
ktime_minute | int | job begin time (minute) | 40 | D | |
ktime_second | int | job begin time (second) | 44 | D | |
kday_date | string | job begin time (day); Hive table partition column | 2020/9/24 | | |
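Since each failed job produces exactly one row here, a simple breakdown of failures by exception type and Cube (default database 'kylin' and suffix 'qa' assumed) is:

```sql
SELECT exception,
       cube_name,
       COUNT(*) AS failed_jobs
FROM kylin.hive_metrics_job_exception_qa
GROUP BY exception, cube_name
ORDER BY failed_jobs DESC;
```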
3. How To Enable System Cube
For instructions on enabling the System Cube, please refer to the official documentation: http://kylin.apache.org/cn/docs/tutorial/setup_systemcube.html. Note that Kylin v2.x and Kylin v3.x enable the System Cube differently, so make sure to consult the documentation matching your version.
Below is a demo video of enabling the System Cube: