Flink currently follows a mono-repository approach. Splitting the repository would divide the build time problem into smaller ones. Beyond build time, this approach has some additional benefits and problems.

Benefits:

  • Split build time issues
  • Unstable (or worse, permanently failing) tests no longer affect the entire project, only the repository they live in (the probability of such tests grows with the size of the project)
  • Easier to track pull requests per repository


Problems:

  • Git history
    • fork off new repositories
    • rewrite history
  • Shared Maven dependencies / plugin configurations
    • Idea: set up a new parent pom
  • Building a common documentation out of many repositories
  • End-to-end tests
    • How to share / split tooling
    • In general, tooling will be spread across different repos
  • Releases / Versioning / "internal" dependencies
    • a) Single release across all repositories
    • b) Synced releases
    • c) Separate releases
  • Requires a stable API for downstream projects to rely on (an advantage once we have it)
  • Higher development friction when debugging Flink problems in downstream projects
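
The "new parent pom" idea from the list above could look roughly like the sketch below. This is only an illustration under stated assumptions: the coordinates (org.example.flink, flink-shared-parent) are hypothetical, and the managed dependency, plugin, and version numbers are placeholders rather than Flink's actual configuration.

```xml
<!-- Hypothetical shared parent pom. All coordinates and versions are placeholders. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.example.flink</groupId>
  <artifactId>flink-shared-parent</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <!-- Dependency versions are pinned once here and inherited by every repository. -->
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.36</version>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <!-- Plugin versions and configuration are shared the same way. -->
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>3.2.5</version>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>
```

Each split repository would then name this artifact in the <parent> element of its root pom, so dependency and plugin versions stay aligned without being copied. The parent pom would itself have to be released and versioned, which ties into the release question above.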

Approaches:

  • Split into "flink-main" and "flink-connectors"
    • Saves ~1 hour of build time
  • One repository, multiple Maven projects

