THIS IS A TEST INSTANCE. ALL YOUR CHANGES WILL BE LOST!!!!

Apache Kylin : Analytical Data Warehouse for Big Data

Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. Make sure the hadoop version ( HDP or CDH ) of build cluster and query cluster is supported by Kylin.
  2. The hadoop version of build cluster and query cluster must be the same.
  3. Check commands like hdfs and hive are all working properly and can access cluster resources.
  4. If the two clusters have enabled the HDFS NameNode HA, please make sure their HDFS Nameservice names are different. If they are the same, please change one of them to avoid conflict.
  5. Please make sure the network latency between the two clusters is low enough, as there will be a large number of data moved back and forth during model build process.

How to

...

  1. Install Kylin 4.0 with the installation guide in a node which contains Hadoop client configuration of query cluster.
  2. Create a directory called build_hadoop_conf in the $KYLIN_HOME and then copy the hadoop configuration files of the build cluster into this directory ( Note: make sure copy the real configuration files, not the symbolic links ).
  3. Set the value of the configuration 'kylin.engine.submit-hadoop-conf-dir' in the '$KYLIN_HOME/conf/kylin.properties' to the directory created in the step 2.
  4. Copy the hive-site.xml from the build cluster to query cluster with the right directory (the hive configuration on , for example: /etc/hive/conf).
  5. The value of the configuration 'kylin.engine.spark-conf.spark.yarn.queue' in the '$KYLIN_HOME/conf/kylin.properties' should be configured as the queue of the Yarn on build cluster.

...