


1. Status

 

Current State: [UNDER DISCUSSION]

Discussion Thread: [...]

JIRA: ZEPPELIN-2019


2. Motivation

Apache Zeppelin provides valuable table features such as built-in visualizations, pivoting, and CSV download. However, these features are limited by table size: they currently run in the browser, and the number of rows is capped (configurable, 1000 rows by default). Moving these computations from the browser to the backend is therefore a starting point for handling large data and for improving pivoting, filtering, full CSV download, pagination, and other functionality.

Furthermore, tables currently cannot be shared across interpreter processes. For example, a table produced by the JDBC interpreter is not accessible from the SparkSQL or Python interpreters. The idea here is to extend the Zeppelin ResourcePool to share Table resources across interpreters. This would also allow a single, central Table menu for accessing and viewing information about registered Table resources.

Thus the critical question is: “How can Zeppelin support large data handling and share tables across interpreters?” Several already-resolved issues offer clues to solving this problem.


Based on this prior work, this proposal aims to build a mechanism for handling table resources in the backend and to design an API for the resource pool. This will enable Zeppelin to:

  • register the table result as a shared resource

  • list all available (registered) tables

  • preview tables, including their meta information (e.g. columns, types, ...)

  • download registered tables as CSV and other formats

  • pivot and filter in the backend to transform larger data

  • cross join tables from different interpreters (e.g. the Spark interpreter uses a table result generated by the JDBC interpreter)
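
To make the goals above more concrete, here is a minimal sketch of what a backend table registry could look like. It is an illustration only: `TableResource`, `TablePool`, and all of their methods are hypothetical names invented for this example, not the Zeppelin ResourcePool API, and the data is held in memory purely for simplicity.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class TableResource:
    """Hypothetical table resource: a name, typed columns, and rows."""
    name: str
    columns: list  # list of (column_name, type_name) pairs
    rows: list     # list of tuples, one value per column


class TablePool:
    """Hypothetical stand-in for a shared pool of table resources."""

    def __init__(self):
        self._tables = {}

    def register(self, table):
        # Register a table result as a shared resource, keyed by name.
        self._tables[table.name] = table

    def list_tables(self):
        # List the names of all registered tables.
        return sorted(self._tables)

    def preview(self, name, limit=10):
        # Return column metadata plus the first `limit` rows.
        table = self._tables[name]
        return {"columns": table.columns, "rows": table.rows[:limit]}

    def filter(self, name, column, predicate):
        # Filter on the backend side instead of in the browser.
        table = self._tables[name]
        idx = [c for c, _ in table.columns].index(column)
        return [row for row in table.rows if predicate(row[idx])]

    def to_csv(self, name):
        # Serialize the full table (not just a preview) as CSV text.
        table = self._tables[name]
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow([c for c, _ in table.columns])
        writer.writerows(table.rows)
        return buf.getvalue()


# Example: a result produced by one interpreter is registered once,
# then listed, filtered, and exported by any other consumer of the pool.
pool = TablePool()
pool.register(TableResource(
    name="jdbc.orders",
    columns=[("id", "int"), ("amount", "double")],
    rows=[(1, 9.5), (2, 120.0), (3, 42.0)],
))
large_orders = pool.filter("jdbc.orders", "amount", lambda v: v > 40)
csv_text = pool.to_csv("jdbc.orders")
```

In a real design the rows would likely stay inside the owning interpreter process and be accessed remotely rather than copied into a central structure; the sketch only shows the shape of the operations listed above.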

 

For further future work, please refer to section 6, Potential Future Work.



 

 

 
