Setup Development Env

In this tutorial, you will build a Griffin dev environment and walk through the whole Griffin data quality process, as listed below:

  • explore data assets,
  • create measures,
  • schedule measures,
  • execute measures in compute clusters and emit metrics,
  • navigate metrics in the dashboard.

Dev dependencies

Java

Java 8 is preferred, but Java 7 also works.

Maven

Prerequisite version is 3.2.5

Scala

Prerequisite version is 2.10

Angular

The project uses Angular 1.5.8
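
Before building, it can help to compare the locally installed tools against the versions listed above. The following is a minimal sketch that assumes the Maven version is meant as a minimum, which this page does not state; the helper name version_ge is hypothetical, and it relies on `sort -V` as provided by GNU coreutils.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort), as found in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report the local Maven version if mvn is on the PATH.
if command -v mvn >/dev/null 2>&1; then
  # The first line of `mvn -version` looks like "Apache Maven 3.2.5 (...)".
  mvn_version=$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')
  if version_ge "$mvn_version" 3.2.5; then
    echo "Maven $mvn_version: ok"
  else
    echo "Maven $mvn_version: older than 3.2.5"
  fi
else
  echo "mvn not found on PATH"
fi
```

The same helper can be pointed at other tools, for example by feeding it the version string printed by `java -version`.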

Env dependencies

Hadoop

Prerequisite version is 2.6.0

Hive

Prerequisite version is 1.2.1

Spark

Prerequisite version is 1.6.x

MySQL

Prerequisite version is 5.0

Elasticsearch

Prerequisite version is 5.x.x

Make sure you can access your Elasticsearch instance over HTTP.
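
One way to confirm HTTP access is a quick probe with curl. This is only a sketch: the function names es_url and check_es are hypothetical, the port defaults to 9200 as used elsewhere on this page, and it assumes the instance does not require authentication.

```shell
# es_url HOST [PORT]: print the base HTTP URL of an Elasticsearch node.
es_url() {
  printf 'http://%s:%s\n' "$1" "${2:-9200}"
}

# check_es HOST [PORT]: succeed if the node answers an HTTP GET
# with a non-error status within 5 seconds.
check_es() {
  curl -s -f -o /dev/null --max-time 5 "$(es_url "$@")"
}

# Example (replace <ES-IP> with your own host):
#   check_es <ES-IP> && echo "Elasticsearch is reachable"
```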

 

Livy

Griffin submits jobs to Spark through Livy (http://livy.io/quickstart.html).

#Livy has a bug (https://issues.cloudera.org/browse/LIVY-94), so these three jars must be added to the Spark classpath
datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.10.jar
datanucleus-rdbms-3.2.9.jar

 

Setup Dev Env

Git clone

git clone https://github.com/apache/incubator-griffin.git

Project layout

There are three modules in griffin

measure : core algorithms for calculating metrics along different measure dimensions.

#app
org.apache.griffin.measure.batch.Application

 

service : web service for data assets, measure metadata, and job schedulers.

#spring boot app
org.apache.griffin.core.GriffinWebApplication

 

ui : front end 

Update several files to reflect your dev env

create a griffin working directory in hdfs
hdfs dfs -mkdir -p <griffin working dir>
initialize the Quartz tables with service/src/main/resources/Init_quartz.sql
mysql -u username -p quartz < service/src/main/resources/Init_quartz.sql

 

update service/src/main/resources/application.properties
spring.datasource.url = jdbc:mysql://<MYSQL-IP>:3306/quartz?autoReconnect=true&useSSL=false
spring.datasource.username = <user name>
spring.datasource.password = <password>

hive.metastore.uris = thrift://<HIVE-IP>:9083
hive.metastore.dbname = <hive database name>    # default is "default"
update measure/src/main/resources/env.json to point at your Elasticsearch instance, then copy env.json to the griffin working directory in hdfs.
/* Update this to match your Elasticsearch instance */
"api": "http://<ES-IP>:9200/griffin/accuracy"
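
If you prefer to script the env.json edit, the host in the "api" URL can be rewritten with sed. This is a sketch under assumptions: the function name set_es_host is hypothetical, and it only relies on env.json containing an "api" entry shaped like the line above; the rest of the file is left untouched.

```shell
# set_es_host FILE HOST: replace the host part of the "api" URL in FILE with HOST.
set_es_host() {
  file=$1
  host=$2
  tmp=$(mktemp) || return 1
  # Keep everything up to "http://", swap the host, and leave the port and path alone.
  sed 's#\("api"[[:space:]]*:[[:space:]]*"http://\)[^:/"]*#\1'"$host"'#' "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example, followed by the copy to HDFS:
#   set_es_host measure/src/main/resources/env.json 10.1.2.3
#   hdfs dfs -put measure/src/main/resources/env.json <griffin working dir>
```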

update service/src/main/resources/sparkJob.properties file
sparkJob.file = hdfs://<griffin working directory>/griffin-measure.jar
sparkJob.args_1 = hdfs://<griffin working directory>/env.json
sparkJob.jars_1 = hdfs://<pathTo>/datanucleus-api-jdo-3.2.6.jar
sparkJob.jars_2 = hdfs://<pathTo>/datanucleus-core-3.2.10.jar
sparkJob.jars_3 = hdfs://<pathTo>/datanucleus-rdbms-3.2.9.jar
sparkJob.uri = http://<LIVY-IP>:8998/batches
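
For orientation, the keys above line up with the fields of a Livy batch request (file, className, args, jars). The sketch below builds such a request by hand. It is an assumption about how the values fit together, not a copy of what the Griffin service sends; the function name livy_batch_payload and the example paths are hypothetical, and the class name is the measure app listed under Project layout.

```shell
# livy_batch_payload WORKDIR JARDIR: print a JSON body for Livy's POST /batches.
#   WORKDIR = HDFS griffin working directory, JARDIR = HDFS directory holding the datanucleus jars.
livy_batch_payload() {
  workdir=$1
  jardir=$2
  printf '{"file":"%s/griffin-measure.jar","className":"%s","args":["%s/env.json"],"jars":["%s/datanucleus-api-jdo-3.2.6.jar","%s/datanucleus-core-3.2.10.jar","%s/datanucleus-rdbms-3.2.9.jar"]}\n' \
    "$workdir" "org.apache.griffin.measure.batch.Application" "$workdir" \
    "$jardir" "$jardir" "$jardir"
}

# Example: submit the batch by hand, then list batches to see its state.
#   curl -s -X POST -H 'Content-Type: application/json' \
#        -d "$(livy_batch_payload hdfs:///griffin hdfs:///livy)" \
#        "http://<LIVY-IP>:8998/batches"
#   curl -s "http://<LIVY-IP>:8998/batches"
```

Submitting once by hand like this is a quick way to check that Livy can see the jar and env.json before involving the scheduler.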

 

update ui/js/services/service.js
#make sure Elasticsearch is reachable over HTTP
ES_SERVER = "http://<ES-IP>:9200"

Build

 

cd incubator-griffin
mvn clean install -DskipTests
#copy the measure jar to the griffin working dir in hdfs
cp measure/target/measure-0.1.3-incubating-SNAPSHOT.jar measure/target/griffin-measure.jar
hdfs dfs -put measure/target/griffin-measure.jar <griffin working dir>

Run

#Find the versioned service jar in the service/target folder.
java -jar service/target/service.xxx.jar
#open in your browser
http://<YOUR-IP>:8080
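
The service can take a little while to start, so a small polling loop is handy before opening the browser. This is a sketch, not part of Griffin: the function name wait_for_http and its default retry values are hypothetical, and port 8080 is taken from the URL above.

```shell
# wait_for_http URL [TRIES] [DELAY]: poll URL until any HTTP response arrives
# (regardless of status code) or TRIES attempts have been used.
wait_for_http() {
  url=$1
  tries=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -s -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example:
#   wait_for_http "http://<YOUR-IP>:8080" && echo "service is up"
```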

 

License Header File

Each source file should include the following Apache License header

Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
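
To spot files that are missing the header, a simple search over the source tree is usually enough. The sketch below is one possible check, not a tool the project ships: the function name missing_license is hypothetical, and it only looks at .java and .scala files and only for the first line of the header.

```shell
# missing_license DIR: print .java and .scala files under DIR that lack the ASF header line.
missing_license() {
  # grep -L lists files with no match. Its exit status differs between grep
  # versions, so only the printed list is meaningful here.
  find "$1" -type f \( -name '*.java' -o -name '*.scala' \) \
    -exec grep -L 'Licensed to the Apache Software Foundation (ASF)' {} + || true
}

# Example, run from the repository root:
#   missing_license measure
#   missing_license service
```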

 

 

 

 

 
