...

Apache Zeppelin provides valuable table features such as built-in visualizations, pivoting, and CSV download. However, these features are limited by table size: they currently run in the browser, and the number of rows is capped (configurable, 1000 rows by default). Moving these computations from the browser to the backend is therefore a starting point for handling large data and for improving pivoting, filtering, full CSV download, pagination, and other functionality.
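To make the backend idea concrete, here is a minimal sketch of a table object that filters, pivots, and paginates on the server, so that only the requested page of rows would be sent to the browser. It is written in Python purely for illustration. The class name `TableData` echoes the roadmap item, but the methods, their signatures, and the sum-based pivot are assumptions, not Zeppelin's actual API.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


class TableData:
    """Hypothetical backend-side table (illustrative only, not Zeppelin's API)."""

    def __init__(self, columns: Sequence[str], rows: Sequence[Sequence[object]]):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]

    def filter(self, column: str, predicate: Callable[[object], bool]) -> "TableData":
        """Return a new table with the rows whose `column` value satisfies `predicate`."""
        idx = self.columns.index(column)
        return TableData(self.columns, [r for r in self.rows if predicate(r[idx])])

    def pivot(self, key: str, group: str, value: str) -> "TableData":
        """Sum `value` per (key, group) pair; emit one output column per group."""
        k, g, v = (self.columns.index(c) for c in (key, group, value))
        groups = sorted({r[g] for r in self.rows})
        sums: Dict[object, Dict[object, float]] = defaultdict(lambda: defaultdict(float))
        for r in self.rows:
            sums[r[k]][r[g]] += r[v]
        # Missing (key, group) combinations default to 0.0.
        out = [[kv] + [sums[kv][grp] for grp in groups] for kv in sorted(sums)]
        return TableData([key] + [str(x) for x in groups], out)

    def page(self, offset: int, limit: int) -> List[List[object]]:
        """Return at most `limit` rows starting at `offset` (server-side pagination)."""
        return self.rows[offset:offset + limit]


table = TableData(
    ["city", "year", "sales"],
    [["A", 2020, 10], ["A", 2021, 5], ["B", 2020, 7], ["A", 2020, 1]],
)
pivoted = table.pivot("city", "year", "sales")
large_sales = table.filter("sales", lambda s: s >= 7)
second_page = table.page(offset=1, limit=2)
```

In this sketch every operation returns a new table, so a visualization could chain a filter and a pivot on the backend and then request only one page of the result.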

Furthermore, tables currently cannot be shared across interpreter processes. For example, a table from the JDBC interpreter is not accessible from the SparkSQL or Python interpreters. The idea is to extend the existing Zeppelin resource pool to share Table resources across interpreters. This would also allow a single central Table menu for accessing and viewing information about registered Table resources.
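The sharing idea can be sketched as a registry keyed by table name, where one interpreter registers a table and another looks it up. This is a single-process Python illustration under stated assumptions: the class `ResourcePool` and the methods `put`, `get`, and `list_tables` are hypothetical names, and since real interpreters run in separate processes, an actual pool would need remote access, which this sketch ignores.

```python
import threading
from typing import Dict, List, Optional, Tuple


class ResourcePool:
    """Hypothetical shared registry of table resources (illustrative only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # Maps table name -> (owning interpreter, table object).
        self._resources: Dict[str, Tuple[str, object]] = {}

    def put(self, name: str, owner: str, table: object) -> None:
        """Register (or replace) a table under `name`, recording which interpreter owns it."""
        with self._lock:
            self._resources[name] = (owner, table)

    def get(self, name: str) -> Optional[object]:
        """Return the table registered under `name`, or None if it is not registered."""
        with self._lock:
            entry = self._resources.get(name)
        return entry[1] if entry is not None else None

    def list_tables(self) -> List[Tuple[str, str]]:
        """Return (name, owner) pairs, e.g. to populate a central Table menu."""
        with self._lock:
            return sorted((name, owner) for name, (owner, _) in self._resources.items())


pool = ResourcePool()
jdbc_result = {"columns": ["id", "name"], "rows": [[1, "alice"], [2, "bob"]]}
pool.put("users", owner="jdbc", table=jdbc_result)  # registered by a JDBC paragraph
shared = pool.get("users")                          # read from, say, a Python paragraph
menu = pool.list_tables()
```

The `list_tables` method stands in for what a central Table page could query to show registered resources and their owners.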

...

The issues discussed above can be implemented in the following order of priority:

  • ZEPPELIN-TBD: Adding pivot, filter methods to TableData

  • ZEPPELIN-TBD: ResourceRegistry

  • ZEPPELIN-TBD: Rest API for resource pool

  • ZEPPELIN-TBD: UI for Table page

  • ZEPPELIN-TBD: Apply pivot, filter methods for built-in visualizations

  • ZEPPELIN-TBD: SparkTableData, SparkSQLTableData, JDBCTableData, etc.

  • ZEPPELIN-2029: ACL for ResourcePool

  • ZEPPELIN-2022: Zeppelin resource pool as a Spark Data Source

...