Apache Kylin : Analytical Data Warehouse for Big Data
...
    [
      {
        "sink": "hive",
        "storage_type": 2,
        "cube_desc_override_properties": {
          "kylin.cube.algorithm": "INMEM",
          "kylin.cube.max-building-segments": "1"
        }
      },
      {
        "sink": "kafka",
        "storage_type": 3,
        "cube_desc_override_properties": {
          "kylin.cube.algorithm": "INMEM",
          "kylin.stream.cube.window": 28800,
          "kylin.stream.cube.duration": 3600,
          "kylin.stream.segment.retention.policy": "fullBuild",
          "kylin.cube.max-building-segments": "20"
        },
        "table_properties": {
          "bootstrap.servers": "{YOUR_SERVERS_LIST}"
        }
      }
    ]
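The sink configuration above still contains the `{YOUR_SERVERS_LIST}` placeholder, which has to be replaced with a real broker list before the file is used. The following is a minimal sketch, not part of Kylin, that builds the same two-sink structure in Python and writes it to `sink.json`. The function names `build_sink_config` and `write_sink_config` are hypothetical helpers introduced for illustration, and the broker names are only examples borrowed from the Spring configuration on this page.

```python
import json


def build_sink_config(bootstrap_servers):
    """Return the hive + kafka sink list shown on this page.

    bootstrap_servers: list of "host:port" strings for the Kafka brokers.
    (Hypothetical helper for illustration; not a Kylin API.)
    """
    if not bootstrap_servers:
        raise ValueError("at least one Kafka broker is required")
    for server in bootstrap_servers:
        host, sep, port = server.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {server!r}")

    return [
        {
            "sink": "hive",
            "storage_type": 2,
            "cube_desc_override_properties": {
                "kylin.cube.algorithm": "INMEM",
                "kylin.cube.max-building-segments": "1",
            },
        },
        {
            "sink": "kafka",
            "storage_type": 3,
            "cube_desc_override_properties": {
                "kylin.cube.algorithm": "INMEM",
                "kylin.stream.cube.window": 28800,
                "kylin.stream.cube.duration": 3600,
                "kylin.stream.segment.retention.policy": "fullBuild",
                "kylin.cube.max-building-segments": "20",
            },
            # Replaces the {YOUR_SERVERS_LIST} placeholder.
            "table_properties": {
                "bootstrap.servers": ",".join(bootstrap_servers),
            },
        },
    ]


def write_sink_config(path, bootstrap_servers):
    """Serialize the sink list to `path` as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_sink_config(bootstrap_servers), handle, indent=2)
        handle.write("\n")


if __name__ == "__main__":
    # Example broker names only; substitute your own cluster.
    write_sink_config("sink.json", ["cdh-master:9092", "cdh-worker-1:9092"])
```

Generating the file this way also guarantees that it is valid JSON and that every broker entry has a `host:port` shape before the file is passed to the metadata generator.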
3. Generate the System Cube (Real-time OLAP) metadata

    ./bin/kylin.sh org.apache.kylin.tool.metrics.systemcube.SCCreator -inputConfig sink.json -output system-cube
...
5. Import the Cube metadata

    sh bin/metastore.sh restore system-cube/
...
    <beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns="http://www.springframework.org/schema/beans"
           xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.1.xsd">

        <description>Kylin Metrics Related Configuration (System Cube)</description>

        <!-- A Reservoir that does not stage metrics messages at all and emits them immediately. May be useful for debugging. -->
        <bean id="instantReservoir" class="org.apache.kylin.metrics.lib.impl.InstantReservoir"/>

        <bean id="kafkaSink" class="org.apache.kylin.metrics.lib.impl.kafka.KafkaSink"/>

        <bean id="initMetricsManager" class="org.springframework.beans.factory.config.MethodInvokingFactoryBean">
            <property name="targetClass" value="org.apache.kylin.metrics.MetricsManager"/>
            <property name="targetMethod" value="initMetricsManager"/>
            <property name="arguments">
                <list>
                    <!-- Sink of the System Cube. -->
                    <ref bean="kafkaSink"/>
                    <!-- Bind properties for each ActiveReservoirReporter. -->
                    <map key-type="org.apache.kylin.metrics.lib.ActiveReservoir" value-type="java.util.List">
                        <!-- Each ActiveReservoir can have multiple ReservoirReporters. -->
                        <entry key-ref="instantReservoir">
                            <list>
                                <bean class="org.apache.kylin.common.util.Pair">
                                    <!-- Implementation of ReservoirReporter. -->
                                    <property name="first" value="org.apache.kylin.metrics.lib.impl.kafka.KafkaReservoirReporter"/>
                                    <!-- Properties for this specific ReservoirReporter. -->
                                    <property name="second">
                                        <props>
                                            <prop key="bootstrap.servers">cdh-master:9092,cdh-worker-1:9092,cdh-worker-2:9092</prop>
                                        </props>
                                    </property>
                                </bean>
                            </list>
                        </entry>
                    </map>
                </list>
            </property>
        </bean>
    </beans>
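The `bootstrap.servers` value in the reporter properties above presumably needs to point at the same Kafka cluster as the `bootstrap.servers` entry in the sink configuration. As a rough sketch of how that could be checked before starting Kylin, the snippet below extracts every `bootstrap.servers` property from a Spring bean XML string using only the Python standard library. The function name `reporter_bootstrap_servers` and the trimmed `SAMPLE` document are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the <beans> root element.
BEANS_NS = "{http://www.springframework.org/schema/beans}"

# Trimmed-down stand-in for the configuration shown on this page.
SAMPLE = """
<beans xmlns="http://www.springframework.org/schema/beans">
  <bean class="org.apache.kylin.common.util.Pair">
    <property name="second">
      <props>
        <prop key="bootstrap.servers">cdh-master:9092,cdh-worker-1:9092,cdh-worker-2:9092</prop>
      </props>
    </property>
  </bean>
</beans>
"""


def reporter_bootstrap_servers(xml_text):
    """Return every bootstrap.servers value found in the bean XML.

    (Hypothetical helper for illustration; not a Kylin API.)
    """
    root = ET.fromstring(xml_text.strip())
    return [
        (prop.text or "").strip()
        for prop in root.iter(BEANS_NS + "prop")
        if prop.get("key") == "bootstrap.servers"
    ]


if __name__ == "__main__":
    for value in reporter_bootstrap_servers(SAMPLE):
        # Print one broker per line for easy comparison with sink.json.
        for broker in value.split(","):
            print(broker)
```

Comparing the returned string with the `bootstrap.servers` entry in `sink.json` is a cheap way to catch a mismatch between the two files.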
7. Start Kylin and enable the System Cube
...
IV. Comparison
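By setting the Sink to Kafka and using Real-time OLAP to ingest and build the new System Cube, we achieve the following:
- The latency from when a Metrics message is produced until it is visible to the Kylin query system drops from hours to seconds. This makes the data shown on the Dashboard far more current, which helps detect and diagnose problems promptly.
- The previous step of scheduling periodic Cube build jobs is no longer needed; Real-time OLAP handles ingestion and schedules the build jobs automatically.
V. References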
http://kylin.apache.org/docs/tutorial/setup_systemcube.html
...