Apache Kylin : Analytical Data Warehouse for Big Data
Source code
    git clone https://github.com/apache/kylin.git -b kylin-on-parquet-v2
    cd kylin
    # Compile
    mvn clean install -DskipTests
Environment on the dev machine
Install Maven
The latest Maven release can be found at http://maven.apache.org/download.cgi. Create a symbolic link so that mvn can be run from anywhere.
    cd ~
    wget http://xenia.sote.hu/ftp/mirrors/www.apache.org/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz
    tar -xzvf apache-maven-3.2.5-bin.tar.gz
    ln -s /root/apache-maven-3.2.5/bin/mvn /usr/bin/mvn
Install Spark
Manually install the Spark binary in a local folder such as /usr/local/spark. Kylin supports the community version of Spark. You can download Spark 2.4.6 from the official Apache Spark website.
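The manual install described above could be scripted roughly as follows. This is a minimal sketch: the archive.apache.org mirror, the Hadoop 2.7 build flavour, the /tmp/spark.tgz scratch path, and both function names are assumptions for illustration, not something this guide prescribes.

```shell
# Sketch only: mirror, build flavour, scratch path and function names are assumptions.

# Print the download URL for a given Spark version and Hadoop build flavour.
spark_tarball_url() {
    version="$1"
    hadoop="$2"
    echo "https://archive.apache.org/dist/spark/spark-${version}/spark-${version}-bin-hadoop${hadoop}.tgz"
}

# Download Spark and unpack it into the target directory (e.g. /usr/local/spark).
install_spark() {
    version="$1"
    hadoop="$2"
    target="$3"
    mkdir -p "$target" || return 1
    wget -O /tmp/spark.tgz "$(spark_tarball_url "$version" "$hadoop")" || return 1
    # --strip-components=1 drops the top-level spark-<version>-bin-... folder
    tar -xzf /tmp/spark.tgz -C "$target" --strip-components=1
}
```

Calling `install_spark 2.4.6 2.7 /usr/local/spark` would then place the binaries directly under /usr/local/spark. If your setup locates Spark through an environment variable such as SPARK_HOME, point it at that folder afterwards.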
How to Debug
There are two modes for debugging the source code: debug with local metadata (recommended) and debug with a Hadoop sandbox.
Configuration
Debug with local metadata
- Edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties
...
The VM option "-Dspark.local=true" is for the query engine.
Debug with Hadoop sandbox
The local configuration must be modified to point to your Hadoop sandbox (or CLI) machine.
...
An alternative to the host replacement is updating your hosts file to resolve sandbox and sandbox.hortonworks.com to the IP of your sandbox machine.
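The hosts-file alternative can be sketched as a small helper. The function name is hypothetical, and the IP address used in the usage note is a placeholder for your own sandbox address.

```shell
# Sketch only: the function name and default file argument are illustrative.

# Append "<ip> sandbox sandbox.hortonworks.com" to a hosts file unless an
# entry for sandbox.hortonworks.com is already present.
add_sandbox_host() {
    ip="$1"
    hosts_file="${2:-/etc/hosts}"
    if grep -q 'sandbox\.hortonworks\.com' "$hosts_file" 2>/dev/null; then
        echo "sandbox entry already present in $hosts_file"
    else
        printf '%s sandbox sandbox.hortonworks.com\n' "$ip" >> "$hosts_file"
    fi
}
```

For example, `add_sandbox_host 192.168.1.100 ./hosts.test` writes the entry to a scratch file so you can inspect it first. Writing to /etc/hosts itself normally requires root privileges.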
Launch Kylin Web Server
Copy server/src/main/webapp/WEB-INF to webapp/app/WEB-INF
...
Check Kylin Web at http://localhost:7070/kylin (user: ADMIN, password: KYLIN)
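To confirm from a terminal that the web server is answering before you open a browser, a check along these lines may help. It is only a sketch: the function name is hypothetical, and it tests reachability of the URL given above, not the login itself.

```shell
# Sketch only: the function name is illustrative; the default URL is the one given above.

# Print the HTTP status code returned by the Kylin web UI.
# curl reports "000" when the server cannot be reached at all.
kylin_web_status() {
    url="${1:-http://localhost:7070/kylin}"
    # -s silences progress output, -L follows redirects,
    # -o /dev/null discards the body, -w prints only the status code
    curl -s -L -o /dev/null -w '%{http_code}' "$url"
}
```

Running `kylin_web_status` with no argument checks the default local URL.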
Setup IDE code formatter
If you are writing code for Kylin, make sure that it follows the expected format.
For Eclipse users, just format the code before committing the code.
For IntelliJ IDEA users, a few more steps are needed:
- Install “Eclipse Code Formatter” and use “org.eclipse.jdt.core.prefs” and “org.eclipse.jdt.ui.prefs” in core-common/.settings to configure “Eclipse Java Formatter config file” and “Import order”.
- Go to Preferences => Code Style => Java, set “Scheme” to Default, and set both “Class count to use import with ‘*’” and “Names count to use static import with ‘*’” to 99.
- Disable IntelliJ IDEA’s “Optimize imports on the fly”.
- Format the code before committing it.
Setup IDE license header template
Each source file should include the following Apache License header:

    Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements. See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership. The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
The checkstyle plugin also checks the header rule during packaging. The license file is located at dev-support/checkstyle-apache-header.txt. To make this easier, add the header as a Copyright Profile in your IDE and set it as the default for the Kylin project.
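For a quick local check before packaging, something like the following could list Java files that lack the header. It is a rough sketch and not a substitute for the checkstyle rule: the function names are hypothetical, and the 20-line window is an arbitrary assumption about where the header sits.

```shell
# Sketch only: a quick local check, not a substitute for the checkstyle rule.

# Succeed if the first 20 lines of a file mention the ASF license grant.
has_apache_header() {
    head -n 20 "$1" | grep -q 'Licensed to the Apache Software Foundation (ASF)'
}

# Print every .java file under a directory (default: current) that lacks the header.
list_missing_headers() {
    find "${1:-.}" -type f -name '*.java' | while IFS= read -r file; do
        has_apache_header "$file" || echo "$file"
    done
}
```

Running `list_missing_headers core-common` from the source root would print any offending files in that module. No output means every Java file there passed this simple check.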
How to Package and Deploy
    cd ${KYLIN_SOURCE_CODE}
    # For HDP2.x
    ./build/script/package.sh
    # For CDH5.7
    ./build/script/package.sh -P cdh5.7
    # When the build finishes, the package will be available in the directory ${KYLIN_SOURCE_CODE}/dist/
    # If running on HDP, you need to uncomment the following properties in kylin.properties
    kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
    kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
    kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
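The HDP step can be done by hand in an editor. As a sketch, it could also be scripted as below, assuming the three properties appear in kylin.properties commented out with a leading "#". The function name is hypothetical.

```shell
# Sketch only: assumes the three hdp.version properties are commented out with "#".

# Uncomment the driver, yarn.am and executor extraJavaOptions lines in place.
uncomment_hdp_props() {
    props_file="$1"
    # Write to a temporary file first so the edit works with both GNU and BSD sed.
    sed -E 's/^#[[:space:]]*(kylin\.engine\.spark-conf\.spark\.(driver|yarn\.am|executor)\.extraJavaOptions=-Dhdp\.version=)/\1/' \
        "$props_file" > "$props_file.tmp" && mv "$props_file.tmp" "$props_file"
}
```

Pass the path of the kylin.properties file you intend to deploy, for example `uncomment_hdp_props /path/to/kylin.properties`. Other comment lines are left untouched.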
...