Dual streaming and batch engine

Description: Natively support both blocking and pipelined modes of execution for both batch (DataSet) and stream (DataStream) programs. Batch (DataSet) programs will be able to use a combination of blocking and pipelining. Stream (DataStream) programs will use pipelining. Interactive programs (programs that bring results back to the client) will use blocking. Note that batch/streaming is an API-level notion, whereas blocking/pipelining is a runtime engine concept. The two combine as follows:

                      Batch API (DataSet)    Streaming API (DataStream)
Blocking execution    yes                    no
Pipelined execution   yes                    yes
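
To make the blocking/pipelined distinction concrete, here is a minimal sketch in plain Python. It does not use any Flink API; the names `source`, `consume`, `run_pipelined` and `run_blocking` are hypothetical and exist only for illustration. A generator stands in for a pipelined exchange (each record is consumed as soon as it is produced), and a fully built list stands in for a blocking exchange (the whole intermediate result is materialized before the consumer starts).

```python
from typing import Iterable, Iterator, List, Tuple


def source(n: int, log: List[str]) -> Iterator[int]:
    """Produce n records, logging each one at the moment it is emitted."""
    for i in range(n):
        log.append(f"produce {i}")
        yield i


def consume(records: Iterable[int], log: List[str]) -> int:
    """Sum the records, logging each one at the moment it is consumed."""
    total = 0
    for r in records:
        log.append(f"consume {r}")
        total += r
    return total


def run_pipelined(n: int) -> Tuple[int, List[str]]:
    """Pipelined: the consumer pulls records one by one from the producer."""
    log: List[str] = []
    total = consume(source(n, log), log)
    return total, log


def run_blocking(n: int) -> Tuple[int, List[str]]:
    """Blocking: the full intermediate result is materialized first."""
    log: List[str] = []
    materialized = list(source(n, log))
    total = consume(materialized, log)
    return total, log
```

Both variants compute the same result; only the order of events in the log differs. In the pipelined run the "produce" and "consume" entries alternate, while in the blocking run all "produce" entries come before the first "consume".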


Associated JIRA:

Expected: Q1 2015

Fine-grained fault tolerance for batch programs

Description: Currently, recovery upon failure backtracks all the way to the data sources. This will add an option to checkpoint intermediate DataSets and recover from those checkpoints instead.
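
The difference between the two recovery strategies can be sketched in a toy model, in plain Python and without any Flink API. Everything here is an assumption made for illustration: the function `run_with_recovery`, the representation of a program as a list of stages, and the single injected failure. It only shows how much work is repeated in each case, not how the feature would be implemented.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Stage = Callable[[List[int]], List[int]]


def run_with_recovery(
    source_data: List[int],
    stages: List[Stage],
    checkpoint_after: Set[int],
    fail_once_at: Optional[int],
) -> Tuple[List[int], List[int]]:
    """Run stages in order, failing once just before stage `fail_once_at`.

    Returns the final data and the list of stage indices that were
    executed, including re-executions after the failure.
    """
    checkpoints: Dict[int, List[int]] = {}  # stage index -> saved output
    executed: List[int] = []
    failed = False
    data = list(source_data)
    i = 0
    while i < len(stages):
        if i == fail_once_at and not failed:
            failed = True
            # Backtrack to the latest checkpoint before stage i,
            # or to the data source if no checkpoint exists.
            restart = max((c for c in checkpoints if c < i), default=-1)
            data = list(checkpoints[restart]) if restart >= 0 else list(source_data)
            i = restart + 1
            continue
        data = stages[i](data)
        executed.append(i)
        if i in checkpoint_after:
            checkpoints[i] = list(data)
        i += 1
    return data, executed


# Four illustrative stages applied to every record.
STAGES: List[Stage] = [
    lambda xs: [x + 1 for x in xs],
    lambda xs: [x * 2 for x in xs],
    lambda xs: [x - 3 for x in xs],
    lambda xs: [x * x for x in xs],
]
```

Calling `run_with_recovery([1, 2, 3], STAGES, set(), 3)` replays stages 0 to 2 from the source after the failure, whereas passing `{1}` as `checkpoint_after` replays only stage 2. The final result is the same in both cases.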

Associated JIRA:

Expected: Q2 2015

Interactive programs

Description: Programs that are executed partially in the cluster and partially in the client. They consist of many small programs submitted by the driver program, with driver-side logic in between.
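
As a rough illustration of this pattern, the plain-Python sketch below has a driver submit a sequence of small counting jobs and decide, between jobs, what to submit next. It does not use any Flink API: `submit_job` is a hypothetical stand-in that simply runs the job locally and returns its result to the caller, and `find_threshold` is an invented example of driver-side logic (a binary search over a threshold).

```python
from typing import Callable, List, Tuple


def submit_job(data: List[int], job: Callable[[List[int]], int]) -> int:
    """Hypothetical stand-in for cluster submission.

    Runs the job locally and returns its result to the client.
    """
    return job(data)


def find_threshold(data: List[int], k: int, lo: int, hi: int) -> Tuple[int, int]:
    """Find the smallest integer t in [lo, hi] such that at most k
    records are greater than t.

    Assumes that at most k records are greater than hi.
    Returns the threshold and the number of jobs submitted.
    """
    jobs = 0
    while lo < hi:
        mid = (lo + hi) // 2
        # One small program per iteration: count records above mid.
        count = submit_job(data, lambda d, t=mid: sum(1 for x in d if x > t))
        jobs += 1
        # Driver-side logic between jobs: narrow the search interval.
        if count <= k:
            hi = mid
        else:
            lo = mid + 1
    return lo, jobs
```

Each loop iteration corresponds to one small program whose result is brought back to the client, which is why such programs depend on results being fully available before the driver continues.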

Associated JIRA:

Expected: Q1 2015

Interactive Scala shell

Description: Be able to run interactive Flink programs from a Scala shell.

Associated JIRA:

Expected: Q2/Q3 2015

Machine Learning library

Description: Create common code infrastructure (data types) and popular algorithms.

Associated JIRA:

Expected: Initial version with k-means, ALS, optimization in Q1 2015

Machine Learning library

Description: Create common code infrastructure (data types) and popular algorithms.

Associated JIRA:

Expected: Initial version with k-means, ALS, logistic regression in Q1 2015

Integrate with Mahout linear algebra DSL

Description: Make Flink a backend for the Mahout DSL.

Associated JIRA:

Expected: Q2 2015

Graph processing library

Description: Create a library of common graph operations on a distributed Graph data type. The library currently lives in this GitHub repository: https://github.com/project-flink/flink-graph

Associated JIRA:

Expected: Q1 2015

Logical Query Integration

Description: Enable SQL-style queries that use a Row data type with a logical schema. 

Associated JIRA:

Expected: Q2 2015

SQL on Flink

Description: Enable some variant of SQL (likely HiveQL) to run on top of Flink, both in embedded/mixed mode and by submitting queries from a client. 

Associated JIRA:

Expected: Q3/Q4 2015

Integrate with Tachyon

Description: 

Associated JIRA:

Expected: 


Integrate with Zeppelin

Description: 

Associated JIRA:

Expected: 

Integrate with Tez

Description: 

Associated JIRA:

Expected: 

Integrate with Samoa

Description: 

Associated JIRA:

Expected: 

Semantic annotations for optimization

Description: 

Associated JIRA:

Expected: 

Improved statistics for the optimizer

Description: 

Associated JIRA:

Expected: 

Use off-heap memory

Description: 

Associated JIRA:

Expected: 

Dynamic memory allocation

Description: 

Associated JIRA:

Expected: 



 
