THIS IS A TEST INSTANCE. ALL YOUR CHANGES WILL BE LOST!!!!

Apache Kylin : Analytical Data Warehouse for Big Data

Page tree

Welcome to Kylin Wiki.

1. Background

Kylin 通过 System Cube 监控系统,System Cube 记录了查询和任务相关的指标,能够有效的帮助系统运维和发现问题。当用户开启 System Cube 后,就可以在 Kylin  Web 界面查看项目 KYLIN_SYSTEM,在这个项目下有 5  Cube,它们分别从不同的维度记录系统的监控数据。System Cube 服务于 DashBoard 和 Cube Planner 第二阶段,用户也可以通过 System Cube 进行更多场景的分析,以便更好的运维监控 Kylin。

2. The Hive Tables for System Cube

用户在 Kylin 中的每一个查询或构建操作,都会被记录在 Hive 表中,共有 5  Hive 表,它们分别对应了  System Cube 的事实表:

Hive Table Name

Description

System Cube Name


hive_metrics_query_qa

Collect query related information

hive_metrics_query_qa


hive_metrics_query_cube_qa

Collect query related information

hive_metrics_query_cube_qa

Related to Cube Planner

hive_metrics_query_rpc_qa

Collect query related information

hive_metrics_query_rpc_qa


hive_metrics_job_qa

Collect job related information

hive_metrics_job_qa


hive_metrics_job_exception_qa

Collect job related information

hive_metrics_job_exception_qa


以下列出与 Hive 表相关的 5 个配置项:

  • kylin.metrics.prefix:The system will automatically create the above 5 tables in database named 'kylin' by default. You can customize the database by modifying the configuration item kylin.metrics.prefix=<name>. <name> is also the prefix of the System Cube name;
  • kylin.metric.subject-suffix:You can customize the suffix of the hive tables, the default is 'qa', so the table name is 'hive_metrics_query_qa';
  • kylin.metrics.monitor-enabled:Whether to record metrics in Hive, control the above 5 tables, the default is false, which means not to record;
  • kylin.metrics.reporter-query-enabled:Whether to record metrics in Hive, control the above 3 tables about query, the default is false, which means not to record;
  • kylin.metrics.reporter-job-enabled:Whether to record metrics in Hive, control the above 2 tables about job, the default is false, which means not to record;

2.1 How to record query metrics into Hive

Cube 中可以有多个 Segments,每个 Segment 的数据可能存储在不同的 RPC 服务器中,当用户发送一条查询时,查询可能击中多个 Cube,扫描每个 Cube 下的多个 Segments,扫描每个 Segment 下面多个 RPC 服务器中存储的数据。那么对于发送的一条查询:

  • 每击中一个 Cube,在表 hive_metrics_query_qa 中记录一行数据;
  • 当查询击中 Cube,每扫描 Cube 下的一个 Segment,在表 hive_metrics_query_cube_qa 中记录一行数据;
  • 当查询需要扫描 Cube 下的一个 Segment,每扫描一个 RPC 服务器中的数据,在表 hive_metrics_query_rpc_qa 中记录一行数据。(提示:展开 Cube 详情,在 Storage 标签下查看 Region Count,可以看出一个 Segment 中的数据存储在多少个 RPC 目标服务器中)

2.2 How to record job metrics into Hive

  • 对于表 hive_metrics_job_qa,每个成功的任务生成一条数据;
  • 对于表 hive_metrics_job_exception_qa,每个失败的任务生成一条数据;

2.3 Some tips about recording metrics

相关指标插入 Hive 表中有一定的延迟,系统一般会在一定时间后将一个固定批量的数据一次性插入到 Hive 表中。默认的“一定的时间”是 10 分钟,“一个固定批量的数据”是 10 条。如果希望快速验证,可以采用以下两种方法:

  • 修改配置文件 $KYLIN_HOME/tomcat/webapps/kylin/WEB-INF/classes/kylinMetrics.xml 中的配置项,“index=1” 的配置项表示批量(累计多少条数据必然会插入 Hive),“index=2” 的配置项表示时间间隔(累计多长时间必然会插入 Hive,单位分钟);

  • 重启 Kylin 会立即记录所有需要被记录的数据;

接下来可以进入 Hive,查询相应的表以确认数据已插入 Hive,例如:

hive
use kylin;
select * from hive_metrics_query_cube_qa;

2.4 The Relationship Between Hive Tables And System Cube

开启 System Cube 后,Kylin 中会出现一个项目 KYLIN_SYSTEM,里面包含 5 个 Cube,每个 Cube 中包含一张事实表(无维度表),每个事实表对应一个 Hive Table。

我们将介绍以下信息:

  • Column: the name of Hive table column
  • Type: the type of this column
  • Description: the description of this column
  • Sample: Sample data in Hive
  • D/M: this column is set as Dimension or Measure in System Cube
  • 度量函数: If it is set as Measure, the function of the measure
  • 备注:其他补充信息

2.4.1 hive_metrics_query_qa

ColumnTypeDescriptionSampleDimension or Measure in System Cube度量函数备注
query_hash_code     bigintquery unique id7708685990456150000MCOUNT_DISTINCT

每个 SQL 对应一个唯一的 query_hash_code,当再次查询同样的 SQL 时,不会生成新的 query_hash_code

host                stringthe host of server for query enginecdh-client:10.1.3.91D 
kuser               stringuser nameADMIND 
project             stringproject nameLEARN_KYLIND 
realization         stringcube namekylin_sales_cube_SIMPLED 


realization_type    intthe storage type2D 
query_type          stringCACHE,OLAP,LOOKUP_TABLE,HIVE (users can query on different data sources)OLAP, CACHED 
exception           stringIt's for classifying different exception types (when doing query, exceptions may happen)NULL, java.lang.NumberFormatExceptionD 


query_time_cost     bigintthe time cost for the whole query1392MMIN/SUM/MAX/PERCENTILE_APPROX
calcite_count_returnbigintthe row count of the result Calcite returns3MSUM/MAXCalcite 返回给 Kylin 的数据行数,如 n1
storage_count_returnbigintthe row count of the input to Calcite3MSUM/MAX底层数据返回给 Calcite 的数据行数,如 n2
calcite_count_aggregate_filterbigintthe row count of Calcite aggregates and filters0MSUM/MAX

在 Calcite 中,被过滤或被聚合的数据行数,n2-n1

ktimestamp          bigintquery begin time (timestamp)1600938970920  
kyear_begin_date    stringquery begin time (year)2020/1/1D 
kmonth_begin_date   stringquery begin time (month)2020/9/1D 
kweek_begin_date    stringquery begin time (week, begin with Sunday)2020/9/20D 
kday_time           stringquery begin time (time)17:16:10D 
ktime_hour          intquery begin time (hour)17D 
ktime_minute        intquery begin time (minute)16D 
ktime_second        intquery begin time (second)10  
kday_date           stringquery begin time (day)2020/9/24Hive 表分区列 

2.4.2 hive_metrics_query_cube_qa

ColumnTypeDescriptionSampleD/M度量函数备注
host                stringthe host of server for query enginecdh-client:10.1.3.91  
project             stringproject namePEARVIDEOAPP  
cube_name           stringcube nameUserActionPhaseOneCubeD 
segment_name        stringsegment name20201011000000_20201012000000D 
cuboid_source       bigintsource cuboid parsed based on query and Cube design12582912D 与查询模式最匹配的 Cuboid,可能尚未构建
cuboid_target       biginttarget cuboid already precalculated and served for source cuboid13041664D 查询实际使用的Cuboid,可能需要后计算才能回答查询
if_match            booleanwhether source cuboid and target cuboid are equalFALSED 
filter_mask         bigint 4194304D 
if_success          booleanwhether a query on this Cube is successful or notTRUED 
weight_per_hit      double单条查询击中的 Cube 数的倒数1MSUM
storage_call_count  bigintthe number of rpc calls for a query hit on this Cube1MSUM/MAX
storage_call_time_sumbigintsum of time cost for the rpc calls of a query268MSUM/MAX
storage_call_time_maxbigintmax of time cost among the rpc calls of a quer268MSUM/MAX
storage_count_skip  bigintthe sum of row count skipped for the related rpc calls0MSUM/MAX
storage_count_scan  bigintthe sum of row count scanned for the related rpc calls929MSUM/MAX
storage_count_returnbigintthe sum of row count returned for the related rpc calls45MSUM/MAX
storage_count_aggregate_filterbigintthe sum of row count aggregated and filtered for the related rpc calls,= STORAGE_COUNT_SCAN - STORAGE_COUNT_RETURN884MSUM/MAX
storage_count_aggregatebigintthe sum of row count aggregated for the related rpc calls36MSUM/MAX
ktimestamp          bigintquery begin time (timestamp)1603462676906  
kyear_begin_date    stringquery begin time (year)2020/1/1D 
kmonth_begin_date   stringquery begin time (month)2020/10/1D 
kweek_begin_date    stringquery begin time (week, begin with sumday)2020/10/18D 
kday_time           stringquery begin time (time)22:17:56D 
ktime_hour          intquery begin time (hour)22D 
ktime_minute        intquery begin time (minute)17D 
ktime_second        intquery begin time (second)56  
kday_date           stringquery begin time (day)2020/10/23Hive 表分区列 

2.4.3 hive_metrics_query_rpc_qa

ColumnTypeDescriptionSampleD/M度量函数
host                stringthe host of server for query enginecdh-client:10.1.3.91D 
project             stringproject nameLEARN_KYLIND 
realization         stringcube namekylin_sales_cube_SIMPLED 
rpc_server          stringthe rpc related target servercdh-worker-2D 
exception           stringthe exception of a rpc call. If no exception, "NULL" is usedNULLD 
call_time           bigintthe time cost of a rpc all60MSUM/MAX/PERCENTILE_APPROX
count_return        bigintthe row count actually return3MSUM/MAX
count_scan          bigintthe row count actually scanned3MSUM/MAX
count_skip          bigintbased on fuzzy filters or else,a few rows will be skiped. This indicates the skipped row count0MSUM/MAX
count_aggregate_filterbigintthe row count actually aggregated and filtered,= COUNT_SCAN - COUNT_RETURN0MSUM/MAX
count_aggregate     bigintthe row count actually aggregated0MSUM/MAX
ktimestamp          bigintquery begin time (timestamp)1600938970918  
kyear_begin_date    stringquery begin time (year)2020/1/1D 
kmonth_begin_date   stringquery begin time (month)2020/9/1D 
kweek_begin_date    stringquery begin time (week, begin with sumday)2020/9/20D 
kday_time           stringquery begin time (time)17:16:10D 
ktime_hour          intquery begin time (hour)17D 
ktime_minute        intquery begin time (minute)16D 
ktime_second        intquery begin time (second)10  
kday_date           stringquery begin time (day)2020/9/24Hive 表分区列 

2.2.4 hive_metrics_job_qa

ColumnTypeDescriptionSampleD/M度量函数
job_id              stringjob id51b40173-1f6c-7e55-e0ca-fbc84d242ac0  
host                stringthe host of server for job enginecdh-client:10.1.3.91  
kuser               stringuser name ADMIND 
project             stringproject nameLEARN_KYLIND 
cube_name           stringcube namekylin_sales_cube_poiD 
job_type            stringjob type: build, merge, optimizeBUILDD 
cubing_type         stringin kylin,there are two cubing algorithms,Layered & Fast(InMemory)NULLD 
duration            bigintthe duration from a job start to finish945001MSUM/MAX/MIN/PERCENTILE_APPROX
table_size          bigintthe size of data source in bytes227964845MSUM/MAX/MIN
cube_size           bigintthe size of created Cube segment in bytes35693596MSUM/MAX/MIN
per_bytes_time_cost doubleDURATION / TABLE_SIZE0.00414538MSUM/MAX/MIN
wait_resource_time  biginta job may includes serveral MR(map reduce) jobs. Those MR jobs may wait because of lack of Hadoop resources.158146MSUM/MAX/MIN
step_duration_distinct_columnsbigint
138586MSUM/MAX 
step_duration_dictionarybigint 5311MSUM/MAX 
step_duration_inmem_cubingbigint 89MSUM/MAX 
step_duration_hfile_convertbigint 75382MSUM/MAX 
ktimestamp          bigintquery begin time (timestamp)1600938458385  
kyear_begin_date    stringquery begin time (year)2020/1/1D 
kmonth_begin_date   stringquery begin time (month)2020/9/1D 
kweek_begin_date    stringquery begin time (week, begin with sumday)2020/9/20D 
kday_time           stringquery begin time (time)17:07:38D 
ktime_hour          intquery begin time (hour)17D 
ktime_minute        intquery begin time (minute)7D 
ktime_second        intquery begin time (second)38  
kday_date           stringquery begin time (day)2020/9/24Hive 表分区列 

2.2.5 hive_metrics_job_exception_qa

ColumnTypeDescriptionSampleD/M度量函数
job_id              stringjob ida333a36d-8e33-811f-9326-f04579cc2464  
host                stringthe host of server for job enginecdh-client:10.1.3.91  
kuser               stringuser nameADMIND 
project             stringproject nameLEARN_KYLIND 
cube_name           stringcube namekylin_sales_cubeD 
job_type            stringjob type: build, merge, optimizeBUILDD 
cubing_type         stringin kylin,there are two cubing algorithms,Layered & Fast(InMemory)LAYERD 
exception           stringwhen running a job,exceptions may happen. It's for classifying different exception typesorg.apache.kylin.job.exception.ExecuteExceptionD 
ktimestamp          bigintquery begin time (timestamp)1600936844611  
kyear_begin_date    stringquery begin time (year)2020/1/1D 
kmonth_begin_date   stringquery begin time (month)2020/9/1D 
kweek_begin_date    stringquery begin time (week, begin with sumday)2020/9/20D 
kday_time           stringquery begin time (time)16:40:44D 
ktime_hour          intquery begin time (hour)16D 
ktime_minute        intquery begin time (minute)40D 
ktime_second        intquery begin time (second)44D 
kday_date           stringquery begin time (day)2020/9/24Hive 表分区列 

3. How To Enable System Cube

System Cube 的开启方法请参考官方文档 http://kylin.apache.org/cn/docs/tutorial/setup_systemcube.html,请注意 Kylin v2.x 和 Kylin v3.x 版本开启 System Cube 的方法不同,请参考正确的文档版本。

下面是一个开启 System Cube 的演示视频:

system cube.mp4


  • No labels