Migrating Airflow 2 to 3

We have released support for Airflow 3 and will support both Airflow 2 and Airflow 3 until 31 December 2025. This means that you have 6 months to migrate your Airflow 2 environments to Airflow 3. From 15 August 2025 onwards, the default Airflow version for new environments will be Airflow 3. From 1 October 2025 onwards, it will no longer be possible to create new Airflow 2 environments, but existing environments will keep working. Any Airflow 2 environments that have not been migrated to Airflow 3 by 31 December 2025 will be updated automatically.

This guide describes how to migrate your Airflow 2 environments to Airflow 3. For more information about the (breaking) changes in Airflow 3, see the Airflow 3 migration guide. This should give you some background on the changes that you need to make in your DAGs.

Breaking changes for your DAGs in Airflow 3

  • In Airflow 3, the @provide_session decorator is no longer supported in Jinja templates. If you require this functionality, reach out to the Conveyor team for further assistance.
  • We no longer support the slack and opsgenie providers in Airflow 3 due to breaking changes. We suggest either using the Conveyor built-in Airflow alerts or creating your own utility that uses the Python requests library to send a message to the Opsgenie or Slack webhook.
  • Datahub does not support Airflow 3 yet, so you cannot migrate DAGs that depend on Datahub in Airflow 3. The Datahub maintainers are working on a solution, which is tracked in GitHub.
  • In Airflow 3 we also removed support for all datafy references from the DAGs. This means you can no longer import DatafySparkSubmitOperatorV2 or DatafyContainerOperatorV2, and the datafy references in Jinja templates (e.g. macros.datafy.env(), macros.datafy.image('<some-image>'), ...) were removed as well. Replace all these references with their Conveyor equivalents, e.g. macros.conveyor.env(), macros.conveyor.image('<some-image>'), as shown in the sketch below. If you do not know the exact replacement, take a look at the Datafy migration documentation.
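
As an illustration (a minimal sketch; the task name, module and the --env argument are placeholders), replacing a datafy macro with its Conveyor equivalent could look like this:

from conveyor.operators import ConveyorContainerOperatorV2

# Before: datafy macro in the Jinja-templated arguments (no longer supported in Airflow 3)
ConveyorContainerOperatorV2(
    task_id="my_task",
    command=["python", "-m", "my_application"],
    arguments=["--env", "{{ macros.datafy.env() }}"],
)

# After: the Conveyor equivalent
ConveyorContainerOperatorV2(
    task_id="my_task",
    command=["python", "-m", "my_application"],
    arguments=["--env", "{{ macros.conveyor.env() }}"],
)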

Migration steps

  1. Check and update your DAGs to make them compatible with Airflow 3.
    • You can do this by running conveyor project upgrade-dags for your project to check which changes are needed.
    • Apply part of the changes automatically by running conveyor project upgrade-dags --fix. Continue by fixing the remaining issues manually until upgrade-dags reports no issues.
    • Some issues cannot be detected by looking at the Python syntax alone. You can detect them by running conveyor project validate-dags --airflow-version 3, which lets Airflow parse the DAGs.
    • If there are no conflicts, you can safely deploy these changes to your Airflow 2 environments. The Conveyor team has made sure that your Airflow 3 DAGs can run on both Airflow 2 and Airflow 3 environments.
    • If you encounter issues, please reach out to the Conveyor team for further assistance.
  2. If your project depends on the slack or opsgenie provider of Airflow, note that we do not support these for Airflow 3 due to breaking changes. We suggest that you either:
    • Use the Conveyor built-in Airflow alerts, which send emails to the respective users.
    • Create your own utility that uses the Python requests library to send a message to the Opsgenie or Slack webhook (see the sketch after this list). You can store the required token in the Airflow connections or variables for your environment.
    • Roll this change out on your Airflow 2 environment first, so that both your Airflow 2 and Airflow 3 environments remain compatible.
  3. If you inject Python functions into Airflow Jinja templates that use @provide_session to query the Airflow database, this will no longer work on Airflow 3. Please reach out to the Conveyor team for further assistance.
  4. As part of the migration, we removed the v1 version of the Conveyor operators. If you still import ConveyorContainerOperator or ConveyorSparkSubmitOperator, you will now get an import error. These imports can safely be removed, as the v1 versions stopped working 2 years ago and should all have been migrated to their respective v2 versions already.
  5. As part of the migration, we also removed the datafy references from the DAGs. This means you need to replace the imports or Jinja templates that reference datafy with their Conveyor equivalents. For more information, see the previous section.
  6. Create a new Airflow 3 environment in Conveyor and deploy the projects to the new environment.
    • You can do this by running conveyor environment create --name <your-name> --airflow-version 3.
    • Make sure that your Airflow 3 environment has a different name than your Airflow 2 environment.
    • Deploy your projects to the new environment and run the DAGs to make sure everything works as expected. This validates that there are no issues in Jinja templates, task parameters, etc., as these are only evaluated when running the task.
    • If you have IAM identities bound to a specific environment name, make sure to update them so that they work for both the Airflow 2 and Airflow 3 environments.
  7. For critical environments, we highly recommend setting the --airflow-instance-lifecycle to onDemand. In Airflow 3, the api-server becomes a critical component, as all workers connect to it instead of going directly to the database. By setting the lifecycle to onDemand, you make sure it does not get spot-interrupted, and we make sure there are 2 api-server replicas.
  8. For critical environments (e.g. acc, prod), choose a time to switch over to the Airflow 3 environment. You should take the following steps:
    • In preparation, change the start_date of the DAG for the Airflow 3 environment to your chosen switchover time; otherwise your DAG may rerun jobs from the original start_date when catchup=True (see the sketch after this list).
    • On the chosen date, disable the respective DAGs in the Airflow 2 environment and enable them in the Airflow 3 environment.
  9. Repeat the previous steps for all projects that run on an Airflow 2 environment.
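
As an illustration for step 2, below is a minimal sketch of such a notification utility. It is an example under assumptions: the callback name, the environment variable and the message format are illustrative and not a Conveyor-provided API.

import os

import requests


def notify_slack_on_failure(context):
    # Hypothetical on_failure_callback that posts a message to a Slack incoming webhook.
    # Here the webhook URL is read from an environment variable; as mentioned above,
    # you could also store it in the Airflow connections or variables for your environment.
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    task_instance = context["task_instance"]
    message = (
        f"Task {task_instance.task_id} of DAG {task_instance.dag_id} failed "
        f"for run {context['run_id']}."
    )
    requests.post(webhook_url, json={"text": message}, timeout=10)

You could then attach this callback to your tasks, for example via default_args={"on_failure_callback": notify_slack_on_failure} in the DAG definition, which works on both Airflow 2 and Airflow 3.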
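
For step 8, the sketch below shows how the start_date can be moved to the switchover time. The DAG id, schedule and dates are purely illustrative.

import pendulum
from airflow import DAG

# Hypothetical DAG that originally had start_date=pendulum.datetime(2023, 1, 1, tz="UTC").
# Moving the start_date to the chosen switchover time prevents the Airflow 3 environment
# from rerunning all historical runs when catchup=True.
with DAG(
    dag_id="my_dag",
    start_date=pendulum.datetime(2025, 11, 1, tz="UTC"),
    schedule="@daily",
    catchup=True,
):
    ...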

Reasons for this approach

We have chosen this approach for the following reasons:

  • Incremental migration: You can migrate your projects one by one, without having to change everything at once.
  • Minimal risk: You have both Airflow 2 and Airflow 3 environments running in parallel, so you can test your projects in the new environment before switching over. If you encounter issues, you can fall back to your Airflow 2 environment while reaching out to the Conveyor team for further troubleshooting.

The main drawback of this approach is that you do not have the history of your Airflow 2 environment in the Airflow 3 environment. However, you can always use the backfill functionality in the Airflow 3 UI to backfill runs from before the start_date of your DAG. You can find the backfill functionality by using the Trigger button in the Airflow UI and choosing the backfill option in that modal.

Alternative/unsupported approach

The alternative approach is to update your Airflow 2 environment to Airflow 3. The drawbacks of this approach are:

  • There are many breaking changes in Airflow 3, so if you forget to update the DAGs of a single project, it might impact the whole environment.
  • There are many schema migrations in Airflow 3, so updating large environments might take a long time and can cause issues.

We do not recommend this approach due to these risks, and currently don't offer this as an option.

Pendulum update

As part of the upgrade to Airflow 3, we are also updating the Pendulum datetime library to version 3. Airflow uses Pendulum to represent all datetimes in the application.

The breaking change between Pendulum 2 and 3 that matters here is the string representation of its DateTime object. In Pendulum 2, this was the ISO 8601 format (YYYY-MM-DDTHH:MM:SS.ffffff). To align the behavior of Pendulum with Python's standard datetime module, this was changed to YYYY-MM-DD HH:MM:SS.ffffff (the T separator is now a space).
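
A quick way to see the difference is to print a Pendulum DateTime in both major versions. This is a small sketch using an arbitrary UTC timestamp:

import pendulum

dt = pendulum.datetime(2011, 11, 4, 0, 5, 23, tz="UTC")

# Pendulum 2: str() returned the ISO 8601 representation with a 'T' separator,
# e.g. '2011-11-04T00:05:23+00:00'.
# Pendulum 3: str() matches the standard datetime module and uses a space,
# e.g. '2011-11-04 00:05:23+00:00'.
print(str(dt))

# In both versions, isoformat() keeps the 'T' separator.
print(dt.isoformat())  # '2011-11-04T00:05:23+00:00'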

The impact for you as an Airflow user is that when using a DateTime macro, the rendered value will be different after the upgrade. Examples are the {{ data_interval_start }}, {{ data_interval_end }} and {{ logical_date }} macros.

If you are using these values as arguments for your job, you have two options to handle this change in behavior.

1. Update the macro formatting

info

This option is compatible with both Airflow 2 and 3, hence Conveyor recommends this implementation.

You can change the macro invocations to explicitly use the ISO 8601 format, e.g. {{ logical_date.isoformat() }}. By making this change, you effectively restore the behavior of Airflow 2 with Pendulum 2.

To shorten the macro invocation, Airflow also provides a number of Jinja filters to format values. {{ logical_date.isoformat() }} can also be written as {{ logical_date | ts }}. If you only want to extract the date part, you can use {{ logical_date | ds }}.

Example

Suppose you are using the {{ logical_date }} as input to your task, you can modify your task in the following way.

from conveyor.operators import ConveyorContainerOperatorV2

# Airflow 2 example
ConveyorContainerOperatorV2(
    task_id="my_task",
    command=["python", "-m", "my_application"],
    arguments=["--timestamp", "{{ logical_date }}"],
)

# Updated for Airflow 3
ConveyorContainerOperatorV2(
    task_id="my_task",
    command=["python", "-m", "my_application"],
    arguments=["--timestamp", "{{ logical_date | ts }}"],
)

2. Update the datetime parsing

info

This option can be made to work with both Airflow 2 and 3, provided that you use Python 3.11 or later to run your application. For non-Python applications, double-check whether your parser can handle both formats, or implement the changes of option 1 to ensure compatibility.

You can update the part of your application logic that parses the passed string back to a datetime. If you update the expected format to reflect the new separator (space instead of T), the generated datetime can again be parsed correctly.

Example

If you are using Python to parse your timestamps (using the datetime library), you can update the parsing logic according to these examples.

DAG definition
from conveyor.operators import ConveyorContainerOperatorV2

# Running on Airflow 3, but not explicitly formatting the macro value
ConveyorContainerOperatorV2(
    task_id="my_task",
    command=["python", "-m", "my_application"],
    arguments=["--timestamp", "{{ logical_date }}"],
)
Application code
from datetime import datetime

airflow2_value = '2011-11-04T00:05:23'
airflow3_value = '2011-11-04 00:05:23'

# The ISO 8601 format can be parsed directly.
datetime.fromisoformat(airflow2_value)

# For the format with a space separator, you can use a custom parser.
datetime.strptime(airflow3_value, '%Y-%m-%d %H:%M:%S')

# If you are using Python 3.11 or greater, you can simply depend on fromisoformat.
# This function was extended in Python 3.11 to allow more formats.
datetime.fromisoformat(airflow3_value)

As shown in this example, the easiest way to ensure a smooth migration is to use datetime.fromisoformat on Python 3.11+. More information on this function can be found in the Python documentation.