Looking for something to contribute to? A great place to start is the user-facing Roadmap. But there's a lot going on that isn't necessarily listed there. On this page you can find efforts and points of contact for lots of work happening in Beam where you could help out.

Portability Framework

The primary Beam vision: Any SDK on any runner. This is a cross-cutting effort across Java, Python, and Go, and every Beam runner.

JStorm Runner

MapReduce Runner

Tez Runner

Go SDK

Python 3 Support

Work is in progress to add Python 3 support to Beam. The current goal is to make the Beam codebase compatible with both Python 2.7 and Python 3.4.

Contributions are welcome! If you are interested in helping, select an unassigned issue on the Kanban board and assign it to yourself. Comment on the issue if you cannot assign it yourself. When submitting a new PR, please tag @RobbeSneyders, @aaltay, and @tvalentyn.

Next Java LTS version support (Java 11 / 18.9)

Work to support the next LTS release of Java is in progress. For more details about the scope and the individual tasks, see the JIRA ticket.

IO Performance Testing

We are writing Performance Tests for IOs and developing a Performance Testing Framework for them. Contributions are welcome in the following areas:

  • developing more IO Performance Tests (IOITs)
  • providing the necessary Kubernetes infrastructure (e.g. for databases or filesystems to be used in tests)
  • running Performance Tests on runners other than Dataflow and Direct
  • improving the existing Performance Testing Framework and its documentation

See the documentation and the initial proposal (for file-based tests).

If you’re willing to help in this area, tag the following people in PRs: @chamikaramj, @DariuszAniszewski, @lgajowy, @szewi, @kkucharc

Euphoria Java 8 DSL

An easy-to-use Java 8 DSL for the Beam Java SDK. It provides a high-level abstraction of Beam transformations that is easy to both read and write. It can be used as a complement to existing Beam pipelines (the two are convertible back and forth). For a glimpse of the API, see the WordCount example.

Improving the contributor experience

Making it easier to write code, run tests, and release. Current work includes investigating the use of Docker for Jenkins builds, automating the release process, and improving the reliability of tests.

Ideas and help welcome! Contact: Alan Myrvold, Mark Liu, Yifan Zou

Beam SQL

Beam SQL has many areas open to contribution: support for new operators, new connectors, performance measurement and improvement, a more complete specification and test coverage, etc.

Structured Streaming Spark Runner

Create a new runner from scratch, based on the Spark Structured Streaming framework.
