#1: 10 Aug 2020 - Planning & Scoping
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Tomek Urbaszek | Polidea |
James Timmins | Astronomer |
Jarek Potiuk | Polidea |
Tobiasz Kędzierski | Polidea |
Anita Fronczak | |
Goran Obradovic | |
Mateusz Henc | |
Rafal Biegacz | |
Daniel Imberman | Astronomer |
Vikram Koka | Astronomer |
QP Hou | Scribd |
Ephraim Anierobi | - |
Felix Uellendall | Digitas Pixelpark |
Shekhar Singh | Gojek |
Kamil Breguła | Polidea |
Key Decisions
- The tentative date for Airflow 2.0 Beta: 1st Week of October 2020 (can be revised based on the progress in the upcoming weeks)
- It was unanimously agreed that the following functional items should be part of Airflow 2.0, and that Airflow 2.0 can be delayed by a few weeks if these items aren't complete:
- Airflow REST API
- Functional DAGs
- Most of the proposal is already merged. The only pending piece is "@dag" decorator
- Production-ready Docker Image
- Already in good shape. Would be good to add docker-compose files and docs about using them in Quick Start guide.
- Production-ready Helm chart (with KEDA)
- Providers Packages
- A separate call to discuss some open questions on implications of not having providers directory from Airflow 2.0
- Some of the issues that were mentioned were Dependency handling when new providers packages are released and its compatibility.
- Scheduler HA
- No Open-source PRs have been opened yet but significant progress has been made on this by Ash.
- DAG Versioning will not be a part of Airflow 2.0 (and would be deferred for now) since the Scope has increased significantly after the proposal of changing the execution behaviour using possibly DAG Fetcher / DAG Manifest.
- Airflow will strictly follow Semantic Versioning from Airflow 2.0. Backwards compatibility will be preserved in Airflow 2.0 wherever possible, based on the Approach described here.
- Following non-functional items were discussed:
- Docs:
- Docs can be better organized for Airflow 2.0. We will have a separate call to discuss the exact details on what needs to be changed/re-organized.
- Add missing documentation wherever needed. Create Github issues so that new contributors can take on that workload.
- Schedule Interval / Execution at Start of Schedule or End of the Schedule
- Deferring this to Post 2.0 as we don't have a clear agreement on this. But we all agree that adding a config to decide the edge is not a good solution as it would cause more confusion when this config is changed.
- SubDAGs:
- We would like to merge all the improvements to SubDAGs to make them a first-class citizen and to implement AIP-34. Based on the current discussion, the consensus is to make it a UI-only feature by introducing the concept of TaskGroup. Mailing List Thread Link
#2: 24 Aug 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Tomek Urbaszek | Polidea |
James Timmins | Astronomer |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Vikram Koka | Astronomer |
QP Hou | Scribd |
Xiaodong DENG | - |
Greg Neiheisel | Astronomer |
Ry Walker | Astronomer |
Kamil Olszewski | Polidea |
Key Decisions
- Smart Sensors – in 2.0 or 2.1
- AIP-17 | PR: https://github.com/apache/airflow/pull/5499
- We have not come to a conclusion yet on whether this should be included in 2.0 or not. The majority is towards adding it in 2.0 (as it supports Airflow 2.0's Scalability story) and marking it as experimental.
- There were some questions raised around supporting this new feature. So we decided that everyone would take a look at the PR itself and we will spend a few minutes in the next meeting to decide whether it is 2.0 or not
- Simplification of KubernetesExecutor / KubernetesPodOperator
- PR: https://github.com/apache/airflow/pull/10393
- This will be part of Airflow 2.0
- Airflow Upgrade Check (airflow upgrade-check) command
- WIP PR: https://github.com/apache/airflow/pull/9467 | Design Doc: https://docs.google.com/document/d/17tB9KZrH871q3AEafqR_i2I7Nrn-OT7le_P49G65VzM/edit#heading=h.vv80w6y621gv
- Scope:
- Users' bash scripts won't be included, but anything in core Airflow would be covered
- DAG Definitions:
- Changes in Path for contrib to Providers packages
- DAG Interfaces: changes in arguments of a DAG / BaseOperator
- Configurations:
- Option to auto-replace deprecated configs with new options
- Run-time Core items:
- Changes like "Connection type can't be null". The upgrade-check should at least show a warning if it can't provide an option to detect the type.
- CLI refactor is out-of-scope
- Automatic refactoring is out-of-scope, as it is too difficult to cover all the cases in users' bash scripts.
- This will be covered by docs or by showing warnings via the upgrade-check command
- Experimental API to New API refactor is out-of-scope (will be covered by Migration docs)
- We agreed that the airflow upgrade-check command needs to be available in the last release before Airflow 2.0 (1.10.x or 1.11.x)
- Potential problems with time-consuming DB migrations were also discussed. If we identify such a migration (for example, one involving the TaskInstance table), it should be noted separately in Updating.md to warn users.
- DEV Calls Feedback
- We agreed on having Weekly calls from 7 September onwards
- Calls will start with a 5-minute review of the progress towards 2.0 since the last call
- Process
- A 2.0.0-test branch will be created on
- Changelog:
- The current way of Changelog is OK. We don't need further categorization like Webserver, Scheduler etc.
- Separate Changelog would be created for Providers Packages
- We need to figure a way to tag/label PRs & Issues with correct categories. Some options that were discussed were:
- Adding labels on the PRs & Issues via Bot
- A field in PR template for PR authors to add, the bot would then read the field which would be used to label the PR
- Add rules, for example that committers need to add appropriate labels to a PR before merging it. We could have a scheduled Github Actions workflow that fails if it finds PRs without labels.
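The run-time item in the upgrade-check scope above ("Connection type can't be null") could be expressed as a small rule that only reports warnings. The sketch below is a minimal illustration in plain Python. The `Connection` dataclass, `check_conn_type_not_null`, and `run_checks` are hypothetical names invented for this example, not the interface of the actual upgrade-check command.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Connection:
    """Hypothetical stand-in for a stored connection row (not Airflow's model)."""

    conn_id: str
    conn_type: Optional[str]


def check_conn_type_not_null(connections: List[Connection]) -> List[str]:
    """Return one warning per connection whose type is missing or empty."""
    return [
        f"Connection '{c.conn_id}' has no conn_type; set one before upgrading."
        for c in connections
        if not c.conn_type
    ]


def run_checks(connections: List[Connection]) -> int:
    """Print every warning and return how many were found."""
    warnings = check_conn_type_not_null(connections)
    for message in warnings:
        print(f"WARNING: {message}")
    return len(warnings)
```

A rule shaped like this only warns and leaves the fix to the user, which matches the decision that automatic refactoring is out of scope.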
#3: 7 Sep 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Ash Berlin-Taylor | Astronomer |
Tomek Urbaszek | Polidea |
Kevin Yang | Airbnb |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Vikram Koka | Astronomer |
Xiaodong DENG | - |
Ry Walker | Astronomer |
Shekhar Singh | Gojek |
Rafal Biegacz | |
Anita Fronczak | |
Key Decisions
- Smart Sensors
- Will be included in 2.0 as an early-access feature with a clear note that this feature might potentially change in future Airflow version with breaking changes.
- Airbnb team would be happy to help on the support side answering questions related to Smart Sensor.
- Add docs around different execution modes for Sensor: Poke mode, Reschedule mode vs Smart Sensor
- Providers Packages
- We had a consensus on releasing providers packages separately mainly because of the following reasons:
- Separate cadence for providers compared to Airflow, so bugs in operators/hooks can be fixed a lot faster.
- Enterprises generally would not like to upgrade the “core” (Scheduler) as a small bug can break the deployment and affect all the DAGs
- Breaking library changes (new version of a library) can be fixed with a new version of Backport/Providers
- Upgrades of backport providers are not “that” destructive i.e. even if you upgrade to a newer version and find a bug, you could go back to the previous version without causing any issues at all.
- Open questions / Action Items:
- How would users figure out “breaking changes” with CalVer versioning (they are very clear with SemVer)?
- Use plugin Mechanism to:
- Register Connections from an external provider to allow custom field or hide existing form fields.
- Register Operator Extra links for operators in providers so that a change is not required in Airflow
- Backport providers will only be supported/released for three months after 2.0.0 is released
- Timeline to Airflow 2.0
- Only critical fixes (fixes to bugs that take down production systems) will be backported to 1.10.x core for six months after Airflow 2.0 is released.
Date | Milestone |
---|---|
Week of 7 Sep 2020 | Create the 2.0.0-test branch |
While the scope is fluid, we would be rebasing this test branch from master. After we completely freeze the scope, we would only cherry-pick commits from Airflow Master to the v2-0-test branch if they are “in-scope”. Normal development would continue on the Master branch, i.e. PRs would be created against Airflow Master. | |
Week of 28 Sep 2020 | Cut Functionally complete 2.0 alpha release |
Week of 12 Oct 2020 | Cut first 2.0 beta release |
Beta snapshots would be published to the Airflow Community to test and create issues to make sure Airflow is functioning and backwards compatible. | |
Week of 19 Oct 2020 | Cut bridge release based on 1.10.x - jump-off point to 2.0. Probably 1.10.13 or 1.10.14 containing upgrade check scripts for 2.0 |
Week of 26 Oct 2020 | Cut second 2.0 beta release |
Week of 9 Nov 2020 | Cut third 2.0 beta release |
Week of 23 Nov 2020 | Cut first 2.0 release candidate (2.0.0rc1) |
#4: 14 Sep 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Ash Berlin-Taylor | Astronomer |
Vikram Koka | Astronomer |
Tomek Urbaszek | Polidea |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
James Timmins | Astronomer |
Xiaodong DENG | - |
Kamil Breguła | Polidea |
Ry Walker | Astronomer |
Rafal Biegacz | |
Anita Fronczak | |
QP Hou | Scribd |
Key Decisions
- Updates
- The Airflow v2-0-test branch has already been cut and is currently rebased manually on top of Master. We don't run CI on it yet because the branch is in sync with Master. As soon as there is a PR or commit that we don't want in 2.0, we will diverge the v2-0-test branch from Master and start running tests against it.
- The upgrade-check PR was merged; we now need to define more rules to add more checks.
- API
- Progress:
- Project Board: https://github.com/apache/airflow/projects/1
- The issues labelled with "Enhancement" are not a requirement for 2.0
- Endpoints:
- Task Instance Endpoint is WIP, all the other endpoints have been implemented.
- Permissions Model:
- On-going discussion on the PR but close to completion.
- The next piece of work to be done is migrating existing Views to use resource-based permissions. (Github issue). This is mainly for standardizing the permissions model across API and UI.
- Improvements to SubDags / Concept of TaskGroup
- AIP-34 | PR introduced the concepts of TaskGroup and will be included in Airflow 2.0.
- The PR implements TaskGroups for Graph View, the Tree View will be implemented in follow-up PRs.
- Follow-up items from the discussion:
- Discuss on mailing list whether we should deprecate SubDags in favour of TaskGroup in 2.0 or wait until Airflow 2.1 or 2.2
- Add docs around when to use TaskGroup vs SubDag and potentially listing PROs and CONS.
- Scheduler HA (AIP-15)
- A Draft PR has been created to enable code reviews and to allow the members of the community to start testing it with various setups.
- To get the most benefit from Scheduler HA on MySQL, users will need MySQL 8, because MySQL 5.7 does not support the SKIP LOCKED feature. MySQL 5.7 will still continue to work, with at least the same performance as now, or better.
- Astronomer has done performance testing with different Scenarios and will publish benchmarks over the coming weeks. Google Composer Team + Polidea said that they would be happy to carry out various tests for Scheduler HA as well.
- There were some concerns raised around LOCKING Timeout periods and the usage of DAG Serialization. More testing in the upcoming weeks should help mitigate any concerns and help fix the bugs if discovered.
- Docs:
- Explicitly mention that the HA Scheduler reads some of the properties from the serialized_dag table. Users can turn DAG Serialization on or off in the Webserver, but the Scheduler will continue using it.
- Do we recommend 2 schedulers for Production deployments?
- X Schedulers vs single Scheduler. Use case when one would be better than the other.
- Some kind of bell curve showing the point at which adding Schedulers stops improving performance and may even degrade it. This is intended to give guidance on how many schedulers to run for an expected load, since the decision could depend on multiple factors.
- Follow up items:
- Create a mailing list thread to discuss "Removing Pickling from Airflow 2.0". Currently, pickled DAGs are only supported by CeleryExecutor, and we have a flag on airflow scheduler (--do-pickle) and "--ship-dag" on the airflow tasks run command. If we want to remove pickling, Airflow 2.0 is the right time; otherwise we shouldn't do it until 3.0.
- Helm Chart
- We will continue focusing on getting Airflow 2.0 out so the first official release of Helm Chart might need to wait.
- The issue with Helm Chart sources was fixed and there are no blockers currently if we were to release it at some point in the near future.
- Enhancements (but not blockers) are:
- Better Test Coverage with integration tests
- Docs pointing to the chart on the Airflow Website or the docsite
- The artifacts for the Helm chart would be published at https://downloads.apache.org/airflow/
- There is still an open question around Helm Chart Versioning Policy i.e. do we want to tie-in Airflow Versions with Helm Chart? Or do we just start from 1.0.0? This needs to be decided before the release of the Helm Chart.
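The MySQL 8 requirement in the Scheduler HA notes above comes from row-level `SELECT ... FOR UPDATE SKIP LOCKED`, which lets one scheduler skip rows that another has already locked instead of waiting on them. The sketch below is only an in-memory analogy of that behaviour using Python locks. `Row`, `claim_rows_skip_locked`, and `release_rows` are hypothetical names for illustration and are not taken from the Scheduler HA PR.

```python
import threading
from typing import List


class Row:
    """A hypothetical schedulable row guarded by its own lock."""

    def __init__(self, row_id: int) -> None:
        self.row_id = row_id
        self.lock = threading.Lock()


def claim_rows_skip_locked(rows: List[Row], limit: int) -> List[Row]:
    """Lock up to `limit` rows, skipping any row that is already held."""
    claimed: List[Row] = []
    for row in rows:
        if len(claimed) == limit:
            break
        # Non-blocking acquire: a held row is skipped rather than waited on.
        if row.lock.acquire(blocking=False):
            claimed.append(row)
    return claimed


def release_rows(rows: List[Row]) -> None:
    """Release the locks, as a commit or rollback would."""
    for row in rows:
        row.lock.release()


# Two "schedulers" claiming work from the same table never overlap.
table = [Row(i) for i in range(5)]
scheduler_a = claim_rows_skip_locked(table, limit=2)
scheduler_b = claim_rows_skip_locked(table, limit=2)
```

Without the skip behaviour, the second claimant would block on the first row until the first transaction finished. That is why the notes expect MySQL 5.7 to keep working but to benefit less from running multiple schedulers.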
#5: 21 Sep 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Ash Berlin-Taylor | Astronomer |
Vikram Koka | Astronomer |
Tomek Urbaszek | Polidea |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Shekhar Singh | Gojek |
Ry Walker | Astronomer |
Rafal Biegacz | |
Key Decisions
- API
- Progress:
- Project Board: https://github.com/apache/airflow/projects/1
- The issues labelled with "Enhancement" are not a requirement for 2.0
- Endpoints:
- Task Instance Endpoint is WIP, all the other endpoints have been implemented.
- Permissions Model:
- PR has been merged.
- The next piece of work to be done is migrating existing Views to use resource-based permissions. (Github issue). This is mainly for standardizing the permissions model across API and UI.
- Providers
- Vote on AIP-8 took place on the mailing list.
- There is an ongoing discussion on the same thread about SemVer vs CalVer for the Providers package.
- The people on the call were leaning towards SemVer, to make a clear distinction about a breaking release. This will potentially increase the work for release managers, but some automation around releasing (similar to backport providers) and around generating the changelog for the providers would make the effort less painful.
- Version per Provider: Each Providers package would be versioned separately, i.e. we might release "google-providers" 3.1 and "amazon-providers" 3.7 at the same time, but the versioning of a particular provider will be independent of other providers.
- DEV
- Would be good to have a release policy on when we can deprecate a feature, our release cadence. A good example is https://docs.djangoproject.com/en/3.1/internals/release-process/#release-cadence
- SubDag Deprecation
- There is a mailing list thread on whether or not we want to deprecate SubDags in favor of TaskGroups. The majority on the call agreed that we should not deprecate SubDags yet, and should wait until people have used TaskGroups and they have feature parity with SubDags.
- However, we should clearly recommend using TaskGroups compared to SubDags in our docs and state limitations of the SubDags.
- Helm Chart Release
- Deferred until 2.0 is out
- Will be available to use from the source code of Airflow on Github but the first official release of the Helm chart will only happen after Airflow 2.0
- Docs
- Mailing list thread to get some feedback has been created and cross-posted across Slack and Twitter. Once we have enough feedback, Kaxil will create Github issues for them so that anyone willing to help on it can start working on it.
- A separate section for Upgrading to 2.0 would be ideal, can be a duplicate of Updating.md but with a better structure and more organized.
- UI Changes
- Github Issue: https://github.com/apache/airflow/issues/10953
- There are some proposals from Ryan for the UI changes, for which he has created some PRs (links below) and is in the process of creating a few more.
- Task Instance Modal UX Enhancements · Issue #10944 · apache/airflow
- Replace JS package toggle w/ pure CSS solution #11035
- Task Instance header/navigation pattern UX cleanup – Suggestions / VOTE needed here if anyone has strong opinions
- Scheduler HA
- Reminder: A draft PR for Scheduler HA is available for review. It would be good to get some more feedback from the wider community with their own DEV setup if possible.
- Process
- Any new PRs would continue to be merged until we complete the items for 2.0 and release alphas.
- NOTE: The Timeline shown on the Planning page will be revisited every week on the Dev Call and updated if needed based on the progress towards the major features of Airflow 2.0
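The SemVer preference recorded under Providers above rests on one property: a breaking release is visible from the version number alone, because only a MAJOR bump signals it. A minimal sketch of that check with independent per-provider versions follows. The provider names and version numbers are made up for illustration, and `parse_version` and `is_breaking_upgrade` are hypothetical helpers.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking_upgrade(current: str, candidate: str) -> bool:
    """Under SemVer, only a MAJOR bump signals a breaking change."""
    return parse_version(candidate)[0] > parse_version(current)[0]


# Each provider is versioned independently of the others (illustrative data).
installed = {"google": "3.1.0", "amazon": "3.7.2"}
available = {"google": "4.0.0", "amazon": "3.8.0"}

breaking = [
    name
    for name in installed
    if is_breaking_upgrade(installed[name], available[name])
]
```

With CalVer, the same comparison would say nothing about compatibility, which is the open question raised in the earlier call.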
#6: 28 Sep 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Ash Berlin-Taylor | Astronomer |
Vikram Koka | Astronomer |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Ry Walker | Astronomer |
QP Hou | Scribd |
James Timmins | Astronomer |
Xiaodong DENG | |
Prakshal Jain | |
Key Decisions
- Scheduler HA
- Locking discussion: Document how Scheduler HA would work with a HA database configuration, such as an Active / Passive database configuration with multiple read replicas, but a single writer.
- Add MySQL 8 to our CI pipeline https://github.com/apache/airflow/issues/11164
- Progress update: Ash said that the unit tests should be green by tomorrow. One thing left is change_state_for_to_without_dagrun, which is currently called every time (expensive) and will change to be called every 30 seconds. After that, the PR is feature complete and will no longer be "in draft". The first functionally complete build will need to move to next week instead of the current week, to give time for all unit tests to pass and for reviews.
- Benchmarks: Will be run based on this branch, since the current benchmarks are based on an initial draft branch before rebase with master.
- API
- Task Instance Endpoint: Only thing open. Kaxil to speak with Kamil to get status.
- Existing permissions map to UI: WIP PRs. Feedback requested by James on this PR https://github.com/apache/airflow/pull/11158/
- Migrations: Is there a different migration process needed? Can be run as a standard alembic migration.
- Clients: QP said that the Go client based on the REST API is already complete and he is using it as part of his airflow-terraform module.
- UI Improvements
- New HomePage: Ryan has updated the UX (look) and shared on slack. Split Actions and Buttons, using Google Material Fonts. Feedback from meeting was very positive. PR to be final tomorrow.
- Splitting Providers Package
- Separate provider packages from core: Need some help to get this wrapped up. https://github.com/apache/airflow/issues/11163
- Build optimization: This would really help speed-up builds by only running tests for changed providers https://github.com/apache/airflow/issues/10507
- SemVer: Kaxil to send email to dev list confirming the decision in the meeting about using SemVer (lazy-consensus)
- Backport Providers in next 1.10.x release
- EmailOperator: Location has changed (moved to core)
- CNCF.Kubernetes providers: Need to be checked if these can be released. Insert github issue link created by Kamil for core operators
- Timeline update:
- Functionally complete build: Will be on the week of Oct 5th, instead of this week as described above in the Scheduler HA notes.
#7: 5 Oct 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Ash Berlin-Taylor | Astronomer |
Vikram Koka | Astronomer |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Ry Walker | Astronomer |
James Timmins | Astronomer |
Tomasz Urbaszek | Polidea |
Sumit Maheshwari | |
Rafal Biegacz | |
Key Decisions
- Scheduler HA
- Progress
- MySQL 8 tests have been added
- Adoption of orphaned tasks has been added to Kubernetes and to the new CeleryKubernetesExecutor (PR)
- All tests but one are passing
- Instructions to run the PR with Multiple Schedulers using docker-compose have been added (link to the comment)
- Received some reviews on Monday -- Should be addressed by today/tomorrow
- All Big/Critical Concerns have been addressed (escape hatch for locks, MySQL 8 tests, Moving callbacks to DagFileProcessor)
- Room to improve code readability which isn't a blocker and would be addressed in follow-up PR
- Ash is in talks with Kevin to test his giant DAG and verify Scheduler performance, to address one of Kevin Yang's concerns
- There would be a separate PR for retrying transactions to avoid deadlocks as suggested by MySQL guide on handling deadlocks
- API
- Kamil and Omair are hoping to complete work on the Task Instance Endpoint PR by the end of this week
- James is working to map existing permissions to the UI, hoping to complete by end of the week too. Link to PRs
- Backport Providers
- Backport packages 2020.10.5 will be released without cncf.kubernetes. We will create a separate release for cncf.kubernetes that will address the issues Kamil found in RCs. Link to VOTE thread.
- Airflow 1.10.13 Release
- Will be released as per the planned timeline (Airflow 2.0 - Planning [Archived]) i.e. Last Week of October.
- Airflow 2.0 Alpha
- Revisit on Thursday based on the progress with Scheduler HA & API whether we would be able to release first alpha this Friday.
- Others:
- Kaxil to create Github Issues for the feedback received on the Documentation Improvement Thread and create threads asking for VOTES for previously discussed items like removing pickles.
- Vikram will write up and share what we mean by alpha and beta releases, for everyone to review.
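The Scheduler HA notes above mention a separate PR for retrying transactions to avoid deadlocks. The MySQL guide on handling deadlocks advises being prepared to re-issue a transaction that fails this way. A minimal sketch of that pattern as a decorator follows. `DeadlockError` and `retry_on_deadlock` are hypothetical names standing in for whatever exception the database driver raises, and this is not the code from that PR.

```python
import functools
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock exception."""


def retry_on_deadlock(max_attempts: int = 3, delay: float = 0.0):
    """Re-run a transaction function when it fails with DeadlockError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @functools.wraps(func)
        def wrapper(*args, **kwargs) -> T:
            attempt = 1
            while True:
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_attempts:
                        raise  # give up after the last attempt
                    time.sleep(delay * attempt)  # simple linear back-off
                    attempt += 1

        return wrapper

    return decorator
```

The wrapped function must contain the whole transaction, so that a retry starts again from a clean state rather than repeating only the failed statement.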
#8: 12 Oct 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Ash Berlin-Taylor | Astronomer |
Vikram Koka | Astronomer |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Ry Walker | Astronomer |
James Timmins | Astronomer |
Tomasz Urbaszek | Polidea |
Sumit Maheshwari | |
QP Hou | |
Key Decisions
- Scheduler HA
- PR Merged
- Github Project for tracking Alpha & Beta
- Vikram to send an email for Github Issues triage with other folks
- Blockers for releasing Alpha-1
- Renaming Functional DAGs
- Docs:
- Clearly state "Functional DAGs" (decorator and not setting dependencies) only work with PythonOperator
- Adding docs around using “Task.output” (i.e. tasks not using PythonOperator)
- API
- Task Instance Endpoint
- Draft/WIP PR: https://github.com/apache/airflow/pull/9597
- Should be done by the end of this week
- Map existing permissions to the UI (Flask View Permissions).
- WIP PRs: https://github.com/apache/airflow/pulls/jhtimmins
- Should be done by the end of this week
#9: 19 Oct 2020
Attendees
Name | Company |
---|---|
Vikram Koka | Astronomer |
Ash Berlin-Taylor | Astronomer |
James Timmins | Astronomer |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Kamil Breguła | Polidea |
Rafal | |
Kevin | Airbnb |
XD | |
Brett Koonce | |
Ping Zhang | Airbnb |
Cong Zhu | Airbnb |
Yingbao | Airbnb |
Ry Walker | Astronomer |
QP Hou | Scribd |
Discussion and Key Decisions
- Providers
- Need to split each release into a separate PyPI package
- Possibly not needed for alpha2, possibly for beta
- Also need to split documentation for 2.0
- Need to figure out updating and maintaining versions across providers
- Not needed immediately, but needed for weeks to come
- Provider versioning strategy to be documented especially for contributors and committers for the future - should be automated in the future
- Alpha build / testing status
- Great to see significant engagement from the community in 2.0 testing
- Vikram reported that alpha testing going well, but still in progress
- Alpha2 build: Targeting a second alpha build by the end of this week
- Beta1 build pushed to next week to allow more time testing with alpha builds
- Google cloud composer team - will start testing with beta1
- API
- From the API permissions standpoint, all of the changes are made and merged.
- Kamil made updates to the task instance endpoint.
- API is feature complete
- Scheduler HA
- Jarek reported that they saw a deadlock once with MySQL 5.7 and added logging to see if it shows up again. Not a blocker since it is possible to disable locking on MySQL 5.7, but wanted to know if anyone else sees this.
- Ping and Kevin are testing Scheduler HA with a very large DAG with 21,000 tasks to benchmark in the Airbnb environment. Ash, Kevin, Ping to discuss offline.
- Smart Sensors
- Yingbao commented that it has been in use at Airbnb for almost a year and is very stable.
- Ry suggested that a demo should be created for 2.0 production release to explain the capability
- Upgrade script to 2.0 / doc
- Daniel working on upgrade doc “Upgrading from 1.10.x to 2.0”
- We need to check what else we can add to the upgrade check scripts to make it easier to migrate to 2.0
- Database migration tests - need to be validated as well, before 2.0 RC, including the performance of the migration to 2.0.
- Kubernetes Executor
- Not much to report there.
- Already talking about the CeleryKubernetesExecutor. Need to figure out if any additional testing is needed.
- UI / UX improvements
- Very positive feedback on the UX and the auto-refresh button
- Ryan Hamilton is also making accessibility-related changes to the UI.
#10: 26 Oct 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Vikram Koka | Astronomer |
James Timmins | Astronomer |
Ry Walker | Astronomer |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
Rafal | |
XD Deng | |
QP Hou | Scribd |
Tomasz Urbaszek | Polidea |
Cindy Rogers | |
Discussion and Key Decisions
- Alpha2
- Released on 26 Oct along with the providers too
- Providers
- Vote in process for Backport Providers 2020.10.29 - VOTE THREAD
- Pending items:
- Extra link registration of Providers to core Airflow
- Docs
- Split documentation from Core Airflow Docs
- Installation of Providers
- FAQ Section
- Upgrading apache-airflow and providers package
- Why do Providers have separate versioning than apache-airflow core?
- Why use extras to install providers? And which version of a provider is installed when installing providers via extras?
- KubernetesExecutor Update:
- Draft PR to allow passing pod_template_file for overriding Pod Specification: PR Link
- Airflow 2.0 Updating Guide
- https://github.com/apache/airflow/blob/master/UPGRADING_TO_2.0.md contains the information required to upgrade to 2.0 from 1.10.x. This is a WIP doc and the idea is that this place should be a single place for Users to look at when they want to Upgrade to 2.0 so it should contain all the details in a digestible manner. If anyone has any suggestions on this document please create a Github issue or bring it to the mailing list.
- ToDo:
- Add a dedicated step to install the latest backport-providers before Upgrade.
- Details around how long 1.10.x & backport providers will be supported should be added to Upgrading doc – Github Issue Link
- 2.0 Beta1
- We will wait until Friday to see how we are progressing i.e. if all the bugs found in alpha2 are fixed or not: https://github.com/apache/airflow/milestone/10.
- If we are not done by Friday, we will move the beta1 release to next week
- Airflow 1.10.13
- Milestone: https://github.com/apache/airflow/milestone/7
- Let's triage the Milestone and start the release process for 1.10.13
#11: 2 Nov 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Vikram Koka | Astronomer |
James Timmins | Astronomer |
Ry Walker | Astronomer |
Jarek Potiuk | Polidea |
Daniel Imberman | Astronomer |
XD Deng | |
QP Hou | Scribd |
Tomasz Urbaszek | Polidea |
Discussion and Key Decisions
- 2.0 Beta1
- Milestone: https://github.com/apache/airflow/milestone/10.
- We will cut out beta1 at some point this week as all the outstanding items in the milestone should be done in a day or two.
- While announcing the beta we will ask users to check out the documentation too, so that we can identify missing gaps.
- We need to specifically add instructions (on both Slack and email) asking users who find missing notes to at least create Github Issues describing what is missing, and ideally to create PRs to fix it.
- Python wheel, tar.gz and Dockerfile for beta1 will be published to facilitate easier testing
- PRs to enable Black and Pyupgrade across the code-base will be rebased and merged before publishing beta1. This might cause conflicts in some open PRs. We will recommend that users squash all commits in their PRs so that conflicts are easy to resolve, as it should just be a matter of running the pre-commit hooks for black and pyupgrade.
- Separating Upgrade checks from 1.10.13
- upgrade-check tool will now be released separately from Airflow 1.10.13. The main reason for deciding to do so is that we will be able to add more upgrade checks even after 1.10.13 is released.
- Github Issue: https://github.com/apache/airflow/issues/11112 . Please add any suggestions or ask any question over there. If needed, we can do a separate call otherwise async discussion will continue on that Github issue.
- Airflow 1.10.13
- Milestone: https://github.com/apache/airflow/milestone/7
- Triage and Release process has started for 1.10.13. We aim to release it in the next 1-3 weeks.
- Releasing it in the next two weeks will allow people to test the upgrade from 1.10.13 to Airflow 2.0.0beta2, which is critical.
#12: 9 Nov 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Vikram Koka | Astronomer |
Ry Walker | Astronomer |
Jarek Potiuk | Polidea |
QP Hou | Scribd |
Discussion and Key Decisions
- 2.0 Beta1
- Beta: https://github.com/apache/airflow/milestone/10
- Airflow 2.0.0b1 was cut on Monday, 9 Nov 2020
- 2.0 Beta2
- Airflow 2.0.0b2 was cut on Tuesday, 10 Nov 2020 after a bug was discovered that prevented testing of beta1
- Testing: To everyone who wants to help with Airflow 2.0.0 beta testing, the following are the main areas we need more help from the community:
- Kubernetes Executor and KubernetesPodOperator:
- CeleryKubernetesExecutor
- KEDA AutoScaling + CeleryKubernetesExecutor
- 2.0 Beta3
- Main Purpose
- Test upgrade from 1.10.12 / 1.10.13 to 2.0.0b3
- Regression from beta2
- Providers discovery from beta2
- Separating Upgrade Checks from v1-10-test / 1.10.13
- A reminder to committers and anyone interested to add comments on https://github.com/apache/airflow/issues/11112 to decide whether we will keep the source code for the upgrade-check plugin inside the Airflow repo or in a separate repo
- Airflow 1.10.13
- Milestone: https://github.com/apache/airflow/milestone/7
- Should be released in the next 2-3 weeks.
- New Time for Dev calls
- The regular Monday dev calls will now happen an hour earlier i.e. 5:30 PM GMT every Monday
#13: 16 Nov 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Vikram Koka | Astronomer |
Daniel Imberman | Astronomer |
Ry Walker | Astronomer |
Leah Cole | |
James Timmins | Astronomer |
Jarek Potiuk | Polidea |
XD Deng | |
Discussion and Key Decisions
- Make KubernetesPodOperator Backwards-compatible
- PR: https://github.com/apache/airflow/pull/12384
- Release cncf.kubernetes backport-provider package with that PR in to make it backwards compatible with 1.10.x
- Docs
- Add Docs explaining the different components of Airflow, especially for users migrating from 1.10.x who are not following community news:
- Providers
- Backport-providers
- Airflow Core
- Helm Chart
- Docker Image
- Existing docs for Providers are at https://airflow.readthedocs.io/en/latest/provider-packages.html
- Upgrade Checks
- GitHub Issue: https://github.com/apache/airflow/issues/11112
- The package will be released separately from 1.10.13.
- However, there is an ongoing discussion about whether the sources should remain in the v1-10-test branch or not.
- A call to discuss this will happen this Thursday, 19 Nov, at 3-4 PM GMT. Anyone interested in joining can use this Zoom link
- We will also set a date when this separate package should be released and when the alpha/beta will be out to test with Airflow 1.10.x
- After the decision, this information will be available on the planning page.
- Providers Packages
- A PR for adding new Connections for a Provider using a provider.yaml file is out. Each provider folder now contains a YAML file that is used for documentation; the PR that introduced it is https://github.com/apache/airflow/pull/12304
- This will be extended to discover Extra links too.
- For now, this should be sufficient, but in the future we can take a look at Pytest's plugin mechanism, as suggested by Ash.
- 2.0 Beta3
- 2.0.0beta3 will be cut by the end of this week
- Regular betas will be cut every week until we cut RCs
Airflow 1.10.13
- Milestone: https://github.com/apache/airflow/milestone/7
- Jarek has cherry-picked more than 300 commits to v1-10-test that sync Breeze and Helm Chart changes with master, update dependencies, and fix tests.
- Kaxil is working on further fixes and cherry-picks, and we are targeting RCs this Friday
#14: 23 Nov 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Vikram Koka | Astronomer |
Daniel Imberman | Astronomer |
Ry Walker | Astronomer |
James Timmins | Astronomer |
Jarek Potiuk | Polidea |
Discussion and Key Decisions
- 1.10.13 to be released this week
- First version of upgrade-check to be released this week
- Composer team + Polidea will test 2.0.0beta3 next week on MySQL 5.7 + MySQL 8
- We don’t need a new beta for testing providers; Committers / PMC will test them themselves (Provider Packages discovery + Connections / Extra Links)
- Jarek is working on "Registering Extra link & Connections for Providers"
#15: 30 Nov 2020
Attendees
Name | Company |
---|---|
Vikram Koka | Astronomer |
Daniel Imberman | Astronomer |
Ry Walker | Astronomer |
Jarek Potiuk | Polidea |
Tomasz Urbaszek | Polidea |
XD Deng | |
Kosteev | |
Mary | |
Bin | |
Rafal | |
Discussion and Key Decisions
- Bridge release
- Airflow 1.10.14 is to be built this week, with the vote started so that it can be released next week.
- Airflow 1.10.13 was released last week, but a critical bug was discovered after the release.
- Upgrade Checks
- First version of upgrade-check released last week.
- Additional checks and enhancements to support custom checks are being added.
- Airflow DB upgrade
- The topic of Airflow Database migration was brought up and discussed.
- We agreed that it would be ideal to recommend this approach for upgrading from 1.10.14 to 2.0, but we need to add a FAQ around how long this database upgrade would take and what possible steps could be taken to reduce this time.
- For example, by running a script or a DAG to reduce the amount of task history before the database upgrade.
- A collection of FAQs around this would be important to have as part of the 2.0 release.
- Vikram to add issues to capture this.
- Airflow 2.0 Release Candidate:
- Overall feedback on Airflow 2.0 core is positive, but polishing still needed.
- Targeting RC1 build creation to start in the middle of next week.
- Command line PR not yet merged, but would be good to have in RC1.
- Documentation work is ongoing, would be good to clean up before RC1 release.
- Milestone Issue Cleanup
- Issues currently tagged for 2.0 RC1 should be cleaned up.
- Those which are not core, such as Providers or upgrade scripts, should be tagged for those milestones rather than the 2.0 core RC1 milestone.
- All of us to do this cleanup this week and review any open items next Monday.
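The idea raised under "Airflow DB upgrade", trimming old task history before running the database upgrade, could look roughly like the sketch below. It is only an illustration under stated assumptions: the `task_history` table, its columns, and the `prune_history` helper are hypothetical and do not reflect Airflow's real schema or any script agreed in the call. It uses an in-memory SQLite database so it can be run as-is.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune_history(conn, table, date_column, keep_days, now=None):
    """Delete rows older than `keep_days` days and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=keep_days)).isoformat()
    # Table and column names cannot be bound as parameters, so they must come
    # from trusted code (never from user input).
    cur = conn.execute(f"DELETE FROM {table} WHERE {date_column} < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# Demo on a made-up, simplified table (NOT Airflow's actual schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_history (task_id TEXT, execution_date TEXT)")
now = datetime(2020, 11, 30, tzinfo=timezone.utc)
rows = [(f"t{d}", (now - timedelta(days=d)).isoformat()) for d in (1, 10, 100, 400)]
conn.executemany("INSERT INTO task_history VALUES (?, ?)", rows)

# Keep the last 30 days of history and drop everything older.
removed = prune_history(conn, "task_history", "execution_date", keep_days=30, now=now)
remaining = conn.execute("SELECT COUNT(*) FROM task_history").fetchone()[0]
print(f"removed={removed} remaining={remaining}")
```

On a real metadata database, a backup before any such cleanup would be prudent, and the retention window would depend on how much history needs to stay visible after the upgrade.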
#16: 7 Dec 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Vikram Koka | Astronomer |
Daniel Imberman | Astronomer |
James Timmins | Astronomer |
Ash Berlin-Taylor | Astronomer |
Ry Walker | Astronomer |
Jarek Potiuk | Polidea |
Tomasz Urbaszek | Polidea |
Xiaodong DENG | |
John Jackson | Amazon |
QP Hou | Scribd |
Rafal | |
Discussion and Key Decisions
- Airflow 2.0 Release Candidate:
- Targeting 2.0.0rc1 Wednesday - 9 Dec.
- Upgrade Checks
- Targeting 1.1.0rc1 Friday - 11 Dec.
- Providers 1.0.0rc1
- Targeting with 2.0.0rc1 on Wednesday - 9 Dec.
- Airflow 1.10.14
- 1.10.14rc4 has been cut and submitted for a vote. If the vote passes, 1.10.14 will be released on Thursday, 10 Dec
- Deprecating Features
- Follow strict SEMVER.
- Features won’t be removed until a major version even if they are deprecated
- Document a policy about that in CONTRIBUTING.rst or somewhere more appropriate
- Ash to write a proposal about having a major release (e.g. a 3.0 release) that only removes deprecated features
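The deprecation approach discussed above (deprecated features keep working and are only removed in a major version) is commonly implemented with warnings. The following is a minimal sketch of that pattern, not Airflow's actual mechanism: the `deprecated` decorator, the `old_helper` function, and the version numbers are made up for illustration.

```python
import functools
import warnings


def deprecated(since, removed_in):
    """Mark a callable as deprecated while keeping it fully functional."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated since {since} "
                f"and will be removed in {removed_in}.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


# Hypothetical feature: deprecated in a minor release, removed only in a major one.
@deprecated(since="2.1", removed_in="3.0")
def old_helper(x):
    return x * 2


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = old_helper(21)

print(result, caught[0].message)
```

The caller still gets a working result throughout the 2.x series and sees a warning that names the major version in which the feature disappears.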
#17: 14 Dec 2020
Attendees
Name | Company |
---|---|
Kaxil Naik | Astronomer |
Vikram Koka | Astronomer |
Daniel Imberman | Astronomer |
Ash Berlin-Taylor | Astronomer |
Jarek Potiuk | Polidea |
John Jackson | Amazon |
Rafal | |
Discussion and Key Decisions
- Airflow 2.0
- If the vote succeeds, Airflow 2.0 will be released on Thursday, 17 December
- Provider 1.0.0
- Apache Airflow Providers 1.0.0 has been released: https://pypi.org/search/?q=%22apache-airflow-providers%22&o=
- Airflow upgrade-check
- Airflow upgrade-check 1.1.0 has been released: https://pypi.org/project/apache-airflow-upgrade-check/1.1.0/
- Snowflake Issues
- Snowflake monkey-patches urllib, which causes issues, for example when using S3 logging in 2.0: "'SSLSocket' has no attribute Connection"
- GH links here:
- https://github.com/snowflakedb/snowflake-connector-python/issues/324
- https://github.com/apache/airflow/issues/12881
- Escalate and Push Snowflake to fix it and release a new version as early as they can
- Dev calls after 2.0
- Fortnightly meeting (once every two weeks) each Wednesday at 7:30 PM GMT after 2.0 is released, starting from 6 Jan 2021.
- Check https://www.timeanddate.com/worldclock/fixedtime.html?msg=8&iso=20210106T1930 to find the time in your Timezone
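As an alternative to the link, the fortnightly schedule can be computed locally. This is a small sketch using only the Python standard library. The `call_times` helper is hypothetical, and it takes a fixed UTC offset (here +05:30 as an example), which is an assumption that ignores daylight-saving changes.

```python
from datetime import datetime, timedelta, timezone

# First post-2.0 dev call: Wednesday, 6 Jan 2021, 7:30 PM GMT.
FIRST_CALL = datetime(2021, 1, 6, 19, 30, tzinfo=timezone.utc)


def call_times(count, local_offset):
    """Return the first `count` fortnightly call times at a fixed UTC offset."""
    tz = timezone(local_offset)
    return [
        (FIRST_CALL + timedelta(weeks=2 * i)).astimezone(tz) for i in range(count)
    ]


# Example: the first three calls as seen from a UTC+05:30 location.
local_calls = call_times(3, timedelta(hours=5, minutes=30))
for when in local_calls:
    print(when.strftime("%a %d %b %Y %H:%M %z"))
```

For locations with daylight-saving time, a named time zone (for example via the `zoneinfo` module in Python 3.9+) would be more accurate than a fixed offset.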