Helium - Brings Zeppelin to data analytics application platform

Motivation

Zeppelin provides a pluggable Interpreter architecture, which lets it support a wide variety of backend systems.
Each interpreter abstracts an underlying computing framework (e.g. SparkInterpreter abstracts a Spark cluster) behind its own interface (e.g. SparkInterpreter provides Scala, SQL, and Python as its interface).

Zeppelin also has a powerful feature called the Angular Display system, which lets users create their own front-end interface that interacts with an interpreter.
And the Dependency loader makes it possible to load libraries from a remote repository.

Putting this all together, I can imagine an application platform on top of Apache Zeppelin.
So I propose a framework, Helium, that turns Zeppelin into a data analytics application platform by

- Leveraging computing resources provided by Interpreters
- Generalizing the dependency loader
- Providing an SDK on top of the Angular Display system
- Adding a package repository

What is Helium Application?

Helium Application = View + Algorithm + Access to Resource

View

Anything you want to display inside a Zeppelin notebook.
It can be any standard HTML, CSS, and JavaScript.
Your view and algorithm can interact.

Algorithm

The code you want to run, which can be any code that runs on the JVM.

Resource

Provided by an interpreter or by another Helium Application.

Every interpreter automatically provides the result of its last run.
Additionally, interpreters can provide their own resources (e.g. SparkContext).
Any user code in a Helium Application can also provide any resource it wants.

A resource can be any Java object.
So it can be data, it can be an abstraction of computing (e.g. SparkContext), it can be anything.
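
To make the idea of a resource concrete, here is a minimal sketch of a named pool of arbitrary Java objects. The class name ResourcePoolSketch, its methods, and the resource names "lastResult" and "rowLimit" are assumptions made for illustration only; they are not taken from the PoC code, and the real pool may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the Zeppelin API: a pool that maps names to any Java object.
class ResourcePoolSketch {
    private final Map<String, Object> resources = new HashMap<>();

    // Register any Java object (data, a computing context, ...) under a name.
    void put(String name, Object resource) {
        resources.put(name, resource);
    }

    // Return the resource if it exists and has the requested type, otherwise null.
    <T> T get(String name, Class<T> type) {
        Object value = resources.get(name);
        return type.isInstance(value) ? type.cast(value) : null;
    }

    // True if some resource is registered under this name.
    boolean contains(String name) {
        return resources.containsKey(name);
    }

    public static void main(String[] args) {
        ResourcePoolSketch pool = new ResourcePoolSketch();
        // An interpreter could publish the result of its last run as plain data ...
        pool.put("lastResult", "word\tcount\nspark\t3");
        // ... and any other object, here just a number (names are hypothetical).
        pool.put("rowLimit", Integer.valueOf(1000));

        System.out.println(pool.get("lastResult", String.class));
        System.out.println(pool.get("rowLimit", Integer.class));
    }
}
```

The typed lookup is only one possible design; it keeps callers from casting by hand when a resource of an unexpected type sits under a known name.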

How Helium Application runs

An application is packaged into a jar and published to a Maven repository.
It also adds a spec file to the package registry.

Then, depending on the resources available in the resource pool, Zeppelin automatically suggests the applications the user can run.
When the user selects an application, it is downloaded and run in the interpreter process where the resource exists.
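
The suggestion step could be sketched roughly as follows: an application is suggested when every resource it requires is present in the pool. The class SuggestionSketch, the AppEntry type, and the resource names used here are assumptions for illustration; the PoC may match resources differently (for example by type rather than by name).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of how applications could be matched against available resources.
class SuggestionSketch {
    // Assumed registry entry: an application name and the resource names it requires.
    static class AppEntry {
        final String name;
        final Set<String> requiredResources;

        AppEntry(String name, String... requiredResources) {
            this.name = name;
            this.requiredResources = new HashSet<>(Arrays.asList(requiredResources));
        }
    }

    // Return the names of applications whose required resources are all available.
    static List<String> suggest(List<AppEntry> registry, Set<String> available) {
        List<String> runnable = new ArrayList<>();
        for (AppEntry app : registry) {
            if (available.containsAll(app.requiredResources)) {
                runnable.add(app.name);
            }
        }
        return runnable;
    }

    public static void main(String[] args) {
        List<AppEntry> registry = Arrays.asList(
            new AppEntry("wordcloud", "lastResult"),
            new AppEntry("sparkmon", "sparkContext"));
        // Only the last paragraph result is in the pool, so only wordcloud qualifies.
        Set<String> available = new HashSet<>(Arrays.asList("lastResult"));
        System.out.println(suggest(registry, available)); // prints [wordcloud]
    }
}
```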

SDK

A user application extends the org.apache.zeppelin.helium.Application class in the SDK.
The SDK provides a development mode, so you can run an application inside Zeppelin without deploying it.
In development mode, an application automatically refreshes its view whenever its HTML/CSS/JavaScript resources change, without a restart.
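
As a rough illustration of the "View + Algorithm + Access to Resource" shape, here is a self-contained sketch. The base class SketchApplication is a hypothetical stand-in defined only for this example; it is not org.apache.zeppelin.helium.Application, whose actual method signatures are not described on this page. The resource name "lastResult" and the tab-separated table layout are also assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the SDK base class. NOT the real
// org.apache.zeppelin.helium.Application; its signatures may differ.
abstract class SketchApplication {
    // Consume resources (algorithm) and return HTML to display (view).
    abstract String run(Map<String, Object> resources);
}

// Example application: counts the data rows of a tab-separated table result.
class RowCountApp extends SketchApplication {
    @Override
    String run(Map<String, Object> resources) {
        Object last = resources.get("lastResult"); // hypothetical resource name
        if (!(last instanceof String) || ((String) last).isEmpty()) {
            return "<div>No table result available</div>";
        }
        // Algorithm: the first line is the header, the remaining lines are rows.
        int rows = ((String) last).split("\n").length - 1;
        // View: plain HTML that the notebook could render.
        return "<div>Rows: " + rows + "</div>";
    }

    public static void main(String[] args) {
        Map<String, Object> resources = new HashMap<>();
        resources.put("lastResult", "word\tcount\nspark\t3\nzeppelin\t5");
        System.out.println(new RowCountApp().run(resources)); // prints <div>Rows: 2</div>
    }
}
```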

Package Repository and spec file

A Helium Application is packaged into a jar, and therefore it can be distributed through a Maven repository.
The Package Repository is actually a collection of spec files. Each spec file provides the following information:

- Name of Application
- Artifact name in maven repository
- Resources this application requires
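
The three fields above could be modeled as in the sketch below. The spec file format is not defined in this proposal, so the key=value layout, the key names (name, artifact, resources), and the Maven coordinate "com.example:zeppelin-wordcloud:0.1.0" are all assumptions chosen only to show the idea; the parsing itself uses the standard java.util.Properties class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Hypothetical model of a spec file; the real format is not specified here.
class SpecFileSketch {
    final String name;                    // name of the application
    final String artifact;                // artifact name in the Maven repository
    final List<String> requiredResources; // resources the application requires

    SpecFileSketch(String name, String artifact, List<String> requiredResources) {
        this.name = name;
        this.artifact = artifact;
        this.requiredResources = requiredResources;
    }

    // Parse a spec written in the assumed key=value layout.
    static SpecFileSketch parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String resources = props.getProperty("resources", "").trim();
        List<String> required = resources.isEmpty()
            ? Collections.<String>emptyList()
            : Arrays.asList(resources.split("\\s*,\\s*"));
        return new SpecFileSketch(
            props.getProperty("name"), props.getProperty("artifact"), required);
    }

    public static void main(String[] args) {
        SpecFileSketch spec = parse(
            "name=wordcloud\n"
            + "artifact=com.example:zeppelin-wordcloud:0.1.0\n"
            + "resources=lastResult\n");
        System.out.println(spec.name + " -> " + spec.artifact + " needs " + spec.requiredResources);
    }
}
```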

The package repository is going to be maintained as a separate git repository with its own homepage (like spark-packages.org for Spark packages), so any user can add their spec file without PMC review.
A bot will automatically merge pull requests for spec files into the master branch.

I propose the following repository:
https://github.com/zeppelin-project/helium-packages

Implementation.

There is a proof-of-concept implementation.
https://github.com/Leemoonsoo/incubator-zeppelin/tree/helium

Application examples

I have created some example applications based on the PoC implementation.

Git commit data - datasource
https://github.com/Leemoonsoo/zeppelin-gitcommitdata

Wordcloud - visualize the paragraph's table result
https://github.com/Leemoonsoo/zeppelin-wordcloud

SparkMon - application that accesses Spark
https://github.com/Leemoonsoo/zeppelin-sparkmon

Video
https://www.youtube.com/watch?v=8Wdc70e6QVI&feature=youtu.be
