In this tutorial, you will download and compile REEF.

Requirements

With these requirements met, the instructions below should work regardless of OS choice and command line interpreter. We have had success compiling REEF on:

On Windows, you might find this tutorial helpful in setting up PowerShell with Maven, GitHub and Java. You will still have to install the Protocol Buffers Compiler, though.

Cloning the repository

Committers

git clone https://git-wip-us.apache.org/repos/asf/incubator-reef.git

Users

git clone git://git.apache.org/incubator-reef.git

Compiling the code

REEF is built using Maven. Hence, a simple 

mvn clean install

should suffice. Note that the default build includes quite a few integration tests, so you might be better off using

mvn -TC1 -DskipTests clean install

This runs one thread per core (`-TC1`) and skips the tests (`-DskipTests`).

 

...

### Check out the code from GitHub
REEF is under active development at the moment. Hence, it is best to clone the repositories for [REEF](https://github.com/Microsoft-CISL/REEF), [Wake](https://github.com/Microsoft-CISL/Wake) and [Tang](https://github.com/Microsoft-CISL/TANG) and compile the latest version of the `master` branch.

### Download a release
Alternatively, you can download the latest releases of these from GitHub. REEF, Wake and Tang have coordinated releases. For example, to compile REEF 0.1, you also need Wake and Tang in version 0.1.

## Compiling REEF
As mentioned above, REEF depends on Wake and Tang, and Wake itself depends on Tang. Hence, we start the compilation chain with Tang. Tang, Wake and REEF are standard Maven projects, so there should be few, if any, surprises if you are already familiar with Maven.

### Compile and install Tang
```powershell
> cd Tang
> mvn clean install
```
This will produce a fair amount of log output, which ideally should end with:
```
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] TANG-Project ...................................... SUCCESS [0.574s]
[INFO] Tang Test Jar A ................................... SUCCESS [3.224s]
[INFO] Tang Test Jar B ................................... SUCCESS [0.415s]
[INFO] Tang Test Jar AB .................................. SUCCESS [0.431s]
[INFO] Tang Test Jar B conflict A ........................ SUCCESS [0.305s]
[INFO] Tang .............................................. SUCCESS [9.174s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
```
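If you capture the build output in a file, the success marker shown above can be checked from a script. The following is a minimal sketch for a POSIX shell (for example Git Bash on Windows); the function name `build_succeeded` and the log file name `build.log` are assumptions made for illustration, not part of REEF or Maven.

```shell
# Return success if the given Maven log contains the "BUILD SUCCESS" line.
# build_succeeded is a hypothetical helper, not something Maven provides.
build_succeeded() {
    log_file="$1"
    # Match the marker at the start of a line, as in the log excerpt above.
    grep -q '^\[INFO\] BUILD SUCCESS' "$log_file"
}

# Example usage (commented out so that nothing is built by accident):
#   mvn clean install > build.log 2>&1
#   build_succeeded build.log && echo "Tang is installed"
```

Maven also exits with a non-zero status when a build fails, so checking the exit status of `mvn` directly is usually the simpler option; scanning the log is mainly useful when only the log is available.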

### Compile and install Wake
Now on to Wake, which is also compiled and installed into your local repository (`$HOME/.m2/repository`) via Maven:
```powershell
> cd Wake
> mvn clean install
```
Again we expect this build to succeed:
```
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Wake-Project ...................................... SUCCESS [0.386s]
[INFO] Wake .............................................. SUCCESS [7.877s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
```

### Compile and install REEF
With the requirements met, we can now compile and install REEF. Again, this is done in standard Maven fashion via:
```powershell
> cd REEF
> mvn clean install
```
Success of this build phase is indicated by:
```
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] REEF .............................................. SUCCESS [0.481s]
[INFO] REEF Annotations .................................. SUCCESS [2.029s]
[INFO] REEF Checkpoint ................................... SUCCESS [1.935s]
[INFO] REEF Utilities .................................... SUCCESS [2.039s]
[INFO] REEF Common ....................................... SUCCESS [8.066s]
[INFO] REEF Runtime Local ................................ SUCCESS [1.420s]
[INFO] REEF Runtime for YARN ............................. SUCCESS [13.936s]
[INFO] REEF IO ........................................... SUCCESS [5.944s]
[INFO] REEF Examples ..................................... SUCCESS [2.333s]
[INFO] REEF Tests ........................................ SUCCESS [2.481s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
```

**Note:** You will see many exception printouts during the compilation of REEF. Those are not, in fact, problems with the build: REEF guarantees that exceptions thrown on remote machines get serialized and shipped back to the Driver. We have extensive unit tests for that feature that produce the confusing printouts.
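The three builds above have to run in dependency order: Tang, then Wake, then REEF. As a rough sketch, they can be chained in a POSIX shell so that the chain stops at the first failure. It assumes the three repositories are cloned side by side as `Tang`, `Wake` and `REEF` under the current directory; the function name `build_chain` and the `MVN` override variable are hypothetical.

```shell
# Command used to invoke Maven. Override it to pass extra flags,
# e.g. MVN="mvn -DskipTests". (MVN is an assumption of this sketch.)
MVN="${MVN:-mvn}"

# Build Tang, Wake and REEF in dependency order; stop at the first failure.
build_chain() {
    for project in Tang Wake REEF; do
        echo "Building $project"
        # The subshell keeps the caller's working directory unchanged.
        # $MVN is deliberately unquoted so that extra flags are split into words.
        ( cd "$project" && $MVN clean install ) || {
            echo "Build failed in $project" >&2
            return 1
        }
    done
}

# Example usage, run from the directory that contains the three checkouts:
#   build_chain
```

Running each build in a subshell means a failed `cd` or a failed build leaves you in the directory you started from.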

...

The module REEF Examples in the folder `reef-examples` contains several simple programs built on REEF to help you get started with development. As always, the simplest of those is our "Hello World": Hello REEF. Upon launch, it grabs a single Evaluator and submits a single Task to it. That Task, fittingly, prints 'Hello REEF!' to stdout. To launch it:

```powershell
> cd reef-examples
> mvn -PHelloREEF
```
This invokes the profile `HelloREEF` in the Maven build, which launches HelloREEF on the local runtime of REEF. During the run, you will see something similar to this output:

```
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building REEF Examples 0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.2.1:exec (default-cli) @ reef-examples ---
[...]
Powered by
___________ ______ ______ _______
/ ______ / / ___/ / ___/ / ____/
/ _____/ / /__ / /__ / /___
/ /\ \ / ___/ / ___/ / ____/
/ / \ \ / /__ / /__ / /
/__/ \__\ /_____/ /_____/ /__/ version 0.1.0

From Microsoft CISL
[...]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 7.743s
[INFO] Finished at: Mon Nov 18 13:20:05 PST 2013
[INFO] Final Memory: 9M/183M
[INFO] ------------------------------------------------------------------------
```

#### Where's the output?
The local runtime simulates a cluster of machines: it executes each Evaluator in a separate process on your machine. Hence, the Evaluator that printed "Hello, REEF" is not executed in the same process as the program you launched above. So, how do you get to the output of the Evaluator? The local runtime creates one folder per job it executes in a configurable root folder. In our builds, these folders are generated in the `target` folder of the Maven build:

...

is available on both Linux and Windows and supports developing applications in the Java or C# programming languages.

REEF Git Repositories

Building on Linux

Building on Windows

Compiling

Java build instructions

C# build instructions

 

 

The job folder names consist of the job's name (here, `HelloREEF`) and the time stamp of its submission (here, `1375482338468`). If you submit the same job multiple times, you will get multiple folders here.
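Because the folder name is just the job name and the time stamp joined by a hyphen, a script can split it again with plain parameter expansion. This is a small sketch for a POSIX shell; the function name `split_job_folder` is hypothetical, and it assumes the time stamp is always the part after the last hyphen.

```shell
# Split a job folder name such as HelloREEF-1375482338468 into its parts.
split_job_folder() {
    folder="$1"
    job_name="${folder%-*}"     # everything before the last hyphen
    submitted="${folder##*-}"   # everything after the last hyphen
    printf '%s %s\n' "$job_name" "$submitted"
}

# Example usage:
#   split_job_folder HelloREEF-1375482338468   # prints: HelloREEF 1375482338468
```

Splitting at the last hyphen keeps job names that themselves contain hyphens intact.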

Let's move on:
```powershell
> cd HelloREEF-1375482338468
> dir
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 8/2/2013 3:25 PM driver
d---- 8/2/2013 3:25 PM Node-1-1375482339266
```

Inside the job's folder, you will find one folder for the job's Driver (named `driver`) and one per Evaluator. Each Evaluator folder's name consists of the virtual node simulated by the local runtime (here, `Node-1`) followed by the time stamp of when this Evaluator was allocated on that node (here, `1375482339266`). As the HelloREEF example program only allocated one Evaluator, we only see one of these folders here. Let's peek inside:

```powershell
> cd Node-1-1375482339266
> dir *.txt
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 2013-11-18 13:20 3 PID.txt
-a--- 2013-11-18 13:20 18911 STDERR.txt
-a--- 2013-11-18 13:20 3044 STDOUT.txt
```

`STDERR.txt` contains the output on stderr of this Evaluator, which mostly consists of logs helpful in debugging. `STDOUT.txt` contains the output on stdout. And, sure enough, this is where you find the "Hello, REEF!" message.
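With more than one Evaluator, opening each folder by hand gets tedious. The sketch below, for a POSIX shell, prints the `STDOUT.txt` of every Evaluator in a job folder. It assumes that Evaluator folders always start with `Node-`, as in the listing above; the function name `show_evaluator_stdout` is hypothetical.

```shell
# Print the STDOUT.txt of every Evaluator folder inside a job folder.
# The driver folder is skipped because only Node-* folders are matched.
show_evaluator_stdout() {
    job_dir="$1"
    for out_file in "$job_dir"/Node-*/STDOUT.txt; do
        # If nothing matches, the pattern stays literal; skip it in that case.
        [ -f "$out_file" ] || continue
        echo "== $out_file"
        cat "$out_file"
    done
}

# Example usage, run from the folder that contains the job folders:
#   show_evaluator_stdout HelloREEF-1375482338468
```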