...
- Metrics
- Nexmark
- Gradle
Works In Progress
Portability Framework
The primary Beam vision: Any SDK on any runner. This is a cross-cutting effort across Java, Python, and Go, and every Beam runner.
Apache Spark 2.0 Runner
- Feature branch: runners-spark2
- Contact: Jean-Baptiste Onofré
JStorm Runner
- Docs
- Feature branch: jstorm-runner
- JIRA: runner-jstorm / BEAM-1899
- Contact: Pei He
MapReduce Runner
- Feature branch: mr-runner
- JIRA: runner-mapreduce / BEAM-165
- Contact: Pei He
Tez Runner
- Feature branch: tez-runner
- JIRA: runner-tez / BEAM-2709
Go SDK
- Contact: Henning Rohde
Python 3 Support
Work is in progress to add Python 3 support to Beam. The current goal is to make the Beam codebase compatible with both Python 2.7 and Python 3.4.
Contributions are welcome! If you are interested in helping, select an unassigned issue on the Kanban board and assign it to yourself. Comment on the issue if you cannot assign it yourself. When submitting a new PR, please tag @RobbeSneyders, @aaltay, and @tvalentyn.
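As a rough illustration of what "compatible with both Python 2.7 and Python 3.4" means in practice, here is a minimal sketch of version-agnostic code. It is not taken from the Beam codebase; the helper names `to_text` and `split_evenly` are hypothetical and exist only for this example.

```python
import sys

# True on Python 3.x, False on Python 2.7.
PY3 = sys.version_info[0] >= 3

# The text type is `str` on Python 3 and `unicode` on Python 2.
# The name `unicode` is only evaluated on Python 2, so this line is safe on both.
text_type = str if PY3 else unicode  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return value as a text string on either interpreter (hypothetical helper)."""
    if isinstance(value, bytes):
        # `bytes` is an alias of `str` on Python 2 and a distinct type on Python 3.
        return value.decode(encoding)
    return text_type(value)


def split_evenly(total, parts):
    """Integer division spelled with // so the result is the same on 2.7 and 3.x."""
    return total // parts
```

The two recurring themes in this kind of porting work are the bytes/text split and the changed meaning of `/` for integers, which is why the sketch handles both explicitly rather than relying on interpreter defaults.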
Next Java LTS version support (Java 11 / 18.9)
Work to support the next LTS release of Java is in progress. For more details about the scope and info on the various tasks please see the JIRA ticket.
- JIRA: BEAM-2530
- Contact: Ismaël Mejía
IO Performance Testing
We are also working on writing Performance Tests for IOs and developing a Performance Testing Framework for them. Contributions are welcome in the following areas:
- developing more IO Performance Tests (IOITs)
- providing the necessary Kubernetes infrastructure (e.g. for databases or filesystems to be used in tests)
- running Performance Tests on runners other than Dataflow and Direct
- improving the existing Performance Testing Framework and its documentation
See the documentation and the initial proposal (for file-based tests).
If you’re willing to help in this area, tag the following people in PRs: @chamikaramj, @DariuszAniszewski, @lgajowy, @szewi, @kkucharc
Euphoria Java 8 DSL
An easy-to-use Java 8 DSL for the Beam Java SDK. It provides a high-level abstraction of Beam transformations that is easy to both read and write, and it can complement existing Beam pipelines (convertible back and forth). For a glimpse of the API, see the WordCount example.
- Feature branch: dsl-euphoria
- JIRA: dsl-euphoria / BEAM-3900
- Contact: David Moravek
Improving the contributor experience
Making it easier to write code, run tests, and release. Current work includes investigating Docker for Jenkins builds, automating the release process, and improving the reliability of tests.
Ideas and help welcome! Contact: Alan Myrvold, Mark Liu, Yifan Zou
Beam SQL
Beam SQL has many areas open to contribution: support for new operators, new connectors, performance measurement and improvement, more complete specification and testing, etc.
- JIRA: dsl-sql
- Contact: Kenneth Knowles
Add benchmarks to continuous integration
Run Nexmark benchmark queries after each commit on the Spark, Flink, and Direct runners, and export response times to performance dashboards.
- JIRA: nexmark-perfkit
- Contact: Etienne Chauchot
Extract metrics in a runner agnostic way
Metrics are pushed by the runners to configurable sinks (an HTTP REST sink is available). This is already enabled in the Flink and Spark runners. Work is in progress for Dataflow.
- JIRA: runner-agnostic-metrics
- Contact: Etienne Chauchot
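To make the idea of pushing metrics to an HTTP REST sink concrete, here is a minimal Python 3 sketch. It is not Beam's sink implementation: the function names, the JSON layout (`counters` and `gauges` keys), and the endpoint URL in the usage note are all assumptions made for illustration.

```python
import json
import urllib.request


def build_payload(counters, gauges):
    """Serialize metric snapshots to a JSON string (the layout is an assumption)."""
    return json.dumps({"counters": counters, "gauges": gauges}, sort_keys=True)


def push_metrics(url, payload, timeout=5.0):
    """POST the JSON payload to an HTTP endpoint and return the status code."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A caller would build the payload once per reporting period and then call something like `push_metrics("http://localhost:8080/metrics", payload)`, where the URL is a placeholder. Keeping serialization separate from transport is what lets the same metric snapshot go to different sinks regardless of which runner produced it.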