Release notes
1.18.22 (18-12-2024)
bugfixes
- [General]: Revert an API component update that caused certain requests to fail
- [UI]: Gracefully handle dynamic load errors after a new release
1.18.21 (17-12-2024)
features
- [UI]: Remove the old-style project pages
- [Airflow]: Upgrade Kubernetes provider plugin to 10.0.1
bugfixes
- [UI]: The collapsible logs for manual and stand-alone tasks now show the logs for the correct tasks. Before this fix, the logs of the latest scheduled task were shown.
1.18.20 (11-12-2024)
bugfixes
- [Airflow]: Revert the upgrade of the Kubernetes provider plugin to 9.0.1; we now use 8.3.4 again. In specific circumstances the upgrade could result in tasks getting stuck in queued.
1.18.19 (10-12-2024)
features
- [SDK]: We now support build-args in the ProjectBuilder, available as of version 0.0.6
- [Airflow]: Upgrade the Kubernetes provider plugin to 9.0.1
- [UI]: Allow searching and filtering the list of IDE instances
- [AWS]: Upgrade EBS CSI driver from v1.36.0 to v1.37.0
bugfixes
- [IDE]: Ensure that the IDE configuration matches the region where it is running. The issue only occurs for tenants running clusters in multiple AWS regions.
- [IDE]: Fix an issue where subsequent snapshotting of IDEs might result in data loss.
- [IDE]: Improve the IDE resilience when the code-server process is stuck/crashed.
- [Costs]: Fix an issue where IDE project builds were assigned only a project ID and no project name, which caused small issues with cost allocation
1.18.18 (05-12-2024)
bugfixes
- [IDE]: Pin containerd to version 1.7.23-1 in IDE images so that building Docker containers works again.
1.18.17 (02-12-2024)
features
- [AWS]: Upgrade EKS to version 1.30
- [Azure]: Upgrade AKS to version 1.30
- [AWS]: Upgrade the following components:
- EKS addon kube-proxy upgrade to v1.30.3-eksbuild.5
- EKS addon CoreDNS upgrade to v1.11.3-eksbuild.1
- EKS addon Amazon VPC CNI upgrade to v1.18.5-eksbuild.1
bugfixes
- [SDK]: The application name is now correctly propagated to the UI for Spark application runs
- [CLI]: Provide better feedback when the containerd image store is used. Usage of this storage driver is not recommended.
1.18.16 (25-11-2024)
features
- [AWS]: Update the OS for IDE nodes from ubuntu 20.04 to 22.04
1.18.15 (21-11-2024)
bugfixes
- [Airflow]: The scheduler process will no longer fail for environments that don't yet have any deployments.
1.18.14 (20-11-2024)
features
- [Airflow]: Airflow has been upgraded to 2.10.3
- [Python SDK]: The SDK now supports setting environment variables, including secret mounting
- [Azure]: Upgrade Cilium to 1.15.10
bugfixes
- [Airflow]: Fixed an issue where in very rare circumstances a successful task could result in failure during cleanup
- [UI]: We no longer attempt to fetch metrics for streaming applications running shorter than 2 minutes
1.18.13 (30-10-2024)
features
- [Azure]: Upgrade Cilium to 1.14.16
- [Python SDK]: The SDK now supports running Spark applications as well as regular containers
bugfixes
- [Airflow]: We rolled back the changes to the DAG location for Airflow executors in the previous release.
This change was part of a couple of changes to decrease the load on Kubernetes when creating Airflow executor pods.
This had several unintended consequences:
- Some people assumed the location of DAG files was in the /opt/airflow/dags folder. We created updated documentation on how to properly load files without making that assumption.
- The Airflow ExternalTaskSensor would fail when check_existence=True is enabled.
1.18.12 (29-10-2024)
features
- [Spark]: Updated base images for Spark are now available:
- public.ecr.aws/dataminded/spark-k8s-glue:v3.3.4-hadoop-3.3.5-v1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.4.3-hadoop-3.3.6-v1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.5.3-hadoop-3.3.6-v1
- [AWS]: Upgrade the following components:
- Upgrade Karpenter to 1.0.6
- Upgrade the EBS CSI driver to v1.36.0
- [IDE]: Update code-server to version 4.93.1
bugfixes
- [IDE]: Fix an issue when using IDEs for a project that has an underscore.
1.18.11 (21-10-2024)
features
- [Python SDK]: First version of our Python SDK to simplify running tasks outside of Airflow. For more information, please refer to our how-to guide.
bugfixes
- [Azure]: Correctly use default role when accessing secrets from an Azure KeyVault
- [Notebooks]: Use Java 17 by default instead of Java 11
- [CLI]: The previous release unintentionally changed the yaml generated by conveyor project generate-config. This regression has been corrected.
1.18.10 (14-10-2024)
bugfixes
- [Airflow]: Fixed a bug where the spark submit logs could not be fetched in Airflow when the spark submit process failed.
- [Airflow]: Provide better error messages when mounting secrets fails due to an incorrect IAM role.
- [SSO]: Resolve a visual bug where one user logging in would remove all other users from a team created through SSO group mapping. This bug only affected the team listing; all access rights were still correctly applied.
- [CLI]: Add extra validation that the project id and project name match before building and return meaningful errors.
- [CLI]: Fix an issue with Conveyor run when passing Azure keyvault secrets as environment variables.
- [UI]: Fix an issue where deleting a user fails due to a mismatch between the username in Conveyor and the identity provider.
1.18.9 (02-10-2024)
features
- [Airflow]: Update Airflow to 2.10.2
- [Airflow]: The Airflow on-demand mode is made more robust. It now runs 2 web instances, that prefer to be run on different nodes. When nodes are consolidated due to scheduling optimisations, only one web instance at a time will be removed. This change should eliminate downtime for the Airflow UI under normal circumstances.
- [AWS]: Updated minimal role template to version 1.3.0. This new version allows Conveyor to retag AWS resources with new tags when requested by the customer. Resource tags can only be changed if the resource has a name starting with Conveyor (or the old name Datafy), or has the tag Conveyor=True, which is set during resource creation.
- [Costs]: The costs page now shows both the current and previous period, for easier identification of changes.
- [Docs]: Added a new how-to for configuring a default identity.
- [Docs]: Updated azure identity docs to include user managed identities
bugfixes
- [Azure]: Fixed an issue with keeping warm IDE nodes, that would result in IDE being slow in startup
- [General]: Fix issues with Azure price changes impacting cost calculation
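The retagging condition described for the minimal role template 1.3.0 can be read as a simple predicate. The Python sketch below only illustrates that rule as worded in these notes; the function name, the case-sensitive prefix match, and the representation of tags as a string mapping are assumptions, not the actual IAM policy or Conveyor code.

```python
from typing import Mapping

# Name prefixes mentioned in the release note (current and legacy product name).
ALLOWED_NAME_PREFIXES = ("Conveyor", "Datafy")


def can_retag(resource_name: str, tags: Mapping[str, str]) -> bool:
    """Return True when a resource matches the retagging rule from the note.

    Hypothetical helper: a resource qualifies if its name starts with an
    allowed prefix, or if it carries the tag Conveyor=True.
    """
    if resource_name.startswith(ALLOWED_NAME_PREFIXES):
        return True
    return tags.get("Conveyor") == "True"
```

For example, `can_retag("my-bucket", {"Conveyor": "True"})` qualifies through the tag, while a resource without a matching name or tag does not.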
1.18.8 (27-09-2024)
bugfixes
- [Airflow]: Backport a bugfix to Airflow dag args validation that was added in Airflow 2.10.x
1.18.7 (26-09-2024)
features
- [Airflow]: Upgrade Airflow to 2.10.1, this changes the dark mode from our custom theme to the official Airflow dark theme
- [dbt]: Release dbt 1.8.6 image
- [UI]: Add a configuration option under settings to prevent the creation of projects and environments through the UI
- [UI]: Add the ability to change the name of an IDE
- [AWS]: Upgrade Karpenter to 1.0.2
- [IDE]: Upgrade code-server to 4.92.2
1.18.6 (17-09-2024)
features
- [Airflow]: Update PgBouncer to 1.23.1
- [CLI]: Extend the list of instance types that can be selected when creating a notebook
- [Notebooks]: Creation of notebooks using Python 3.7 and 3.8 is deprecated. Python versions 3.11 and 3.12 are now supported instead.
- [AWS]: Upgrade the following EKS components:
- EKS addon Amazon VPC CNI upgrade to v1.18.2-eksbuild.2
- EKS addon kube-proxy upgrade to v1.29.7-eksbuild.2
- EKS addon CoreDNS upgrade to v1.11.1-eksbuild.11
bugfixes
- [IDE]: Make IDE snapshotting more robust, also added up to 3 retries when the snapshotting process fails
- [UI]: Fix redirect in onboarding flow after logging in
1.18.5 (09-09-2024)
bugfixes
- [Airflow]: Fix a possible crash when opening the Web UI for the first time.
- [CLI]: The CLI will no longer present the error that Docker is not running for operations that do not require it.
1.18.4 (04-09-2024)
features
- [UI]: The cost charts have been updated to offer more accurate estimates, and allocate those costs to different categories.
- [Spark]: We've released new images for Spark 3.5.2, this also updates the delta lake (3.2.0) and iceberg (1.6.1) libraries:
public.ecr.aws/dataminded/spark-k8s-glue:v3.5.2-hadoop-3.3.6-v1
public.ecr.aws/dataminded/spark-k8s-glue:v3.5.2-2.13-hadoop-3.3.6-v1
- [UI]: Add custom link icon for api endpoints
bugfixes
- [UI]: Navigating from Airflow to the Conveyor pipeline detail will now match the task name exactly
1.18.3 (28-08-2024)
features
- [AWS]: Update Karpenter to 1.0.1
- [Airflow]: Update Kubernetes provider to 8.3.4
- [UI]: Support the Amazon CloudWatch icon as an option for project links
- [UI]: The new projects page will automatically load task logs when navigated to from Airflow
bugfixes
- [UI]: Improve the settings page so that users can be removed from projects with a long name
- [UI]: IDEs will automatically open again after their build is done
- [UI]: The redirect-after-login route redirects properly again
1.18.2 (22-08-2024)
features
- [AWS]: Updates to components:
- VPC CNI to 1.18.3
- Karpenter to 0.37.1
- [Airflow]: Use private registry for Airflow validation images instead of public ECR
bugfixes
- [UI]: The buttons to open an IDE in a new window now all link to the correct URL
1.18.1 (14-08-2024)
features
- [IDE]: 🎉 IDEs are out of preview 🎉
- [Airflow]: Made the authentication flow of ConveyorExternalTaskSensor more robust
- [Airflow]: Apply security patches for underlying dependencies
- [IDE]: Update code-server to 4.91.1
- [IDE]: Made installing extensions more robust
- [IDE]: Updated default installed extensions:
- ms-python.python@2024.12.2
- ms-toolsai.jupyter@2024.6.0
- ms-toolsai.jupyter-keymap@1.1.2
- ms-toolsai.vscode-jupyter-cell-tags@0.1.9
- ms-toolsai.vscode-jupyter-slideshow@0.1.6
bugfixes
- [Airflow]: Fix the conveyor task execution button such that it links to the task execution instead of the general project page
- [CLI]: Fix usage of conveyor run when package dependencies are used
1.17.10 (06-08-2024)
bugfixes
- [UI]: Fix login issues
1.17.9 (06-08-2024)
features
- [UI]: We now warn the user to not leave the IDE build page when the build is running
bugfixes
- [UI]: Made sure that manual runs use the environment specified
1.17.8 (01-08-2024)
features
- [General]: Added project links! This feature allows you to add extra links to the project page. For example you can add links to your production Snowflake setup, or Databricks. This allows the developers of the project to more easily go to the tools they need to do their job.
- [Airflow]: Upgrade Airflow to 2.9.3.
- [UI]: Added the possibility to see manual runs to the projectV2 UI
- [Costs]: Include the cost of IDEs in the cost overview.
bugfixes
- [UI]: Correctly render the delete action for projects and environments when users have this permission through their teams.
- [CLI]: Passing the deletion protection value is no longer forced when executing an environment update.
- [CLI]: Fix DAG validation when running on Gitlab CI.
- [Airflow]: Allow Airflow workers to recover from start-up failures.
- [UI]: Improve the performance of querying tasks for a pipeline execution in the projects v2 page.
- [Packages]: Fix an issue with deleting packages on Azure.
- [Packages]: Fix support for Conveyor templates.
1.17.7 (22-07-2024)
features
- [Airflow]: Added deadlock detection on Airflow to mitigate bugs in Airflow that could deadlock the Airflow scheduler (for example issue 7935 and issue 15938). After an hour of running we check if the scheduler has launched any tasks in the past hour, and if it hasn't we restart it. This means that if a deadlock has happened the scheduler will be restarted automatically after an hour at maximum.
- [UI]: Improve tasks logs when the Airflow job is queued or running, but the Conveyor job has not started yet.
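The deadlock detection described above boils down to a periodic check on scheduler activity. The following Python sketch shows one way to express that check under stated assumptions: the function name, its parameters, and the use of a fixed one-hour window for both the warm-up period and the activity check are illustrative and not taken from Conveyor's implementation.

```python
from datetime import datetime, timedelta
from typing import Optional

# One-hour window taken from the release note.
CHECK_WINDOW = timedelta(hours=1)


def should_restart_scheduler(
    now: datetime,
    scheduler_started_at: datetime,
    last_task_launched_at: Optional[datetime],
) -> bool:
    """Decide whether the scheduler looks deadlocked (hypothetical helper).

    The scheduler is only checked once it has been running for an hour.
    It is restarted when no task was launched within the past hour.
    """
    if now - scheduler_started_at < CHECK_WINDOW:
        # Still inside the first hour of running: do not check yet.
        return False
    if last_task_launched_at is None:
        # No task launched since the scheduler started.
        return True
    return now - last_task_launched_at >= CHECK_WINDOW
```

With this rule, a scheduler that deadlocks is restarted at most roughly an hour after its last launched task, which matches the upper bound stated in the note.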
1.17.6 (18-07-2024)
bugfixes
- [Airflow]: Through load testing we found out that the fix from 1.17.5 regarding a recoverable error was not enough. In this release we fix that. The error occurs in kubelet when the create container call times out and is still busy when kubelet retries to create the container. The container eventually starts correctly, but Airflow was too aggressive and assumed the container could never recover.
1.17.5 (17-07-2024)
bugfixes
- [Airflow]: Ignore a recoverable error from Kubernetes when processing status updates from executors in the Airflow Kubernetes scheduler. The error occurs when kubelet runs into a deadline creating a container.
1.17.4 (17-07-2024)
bugfixes
- [Airflow]: Improved logging on certain issues with the Airflow Kubernetes scheduler
1.17.3 (16-07-2024)
bugfixes
- [CLI]: fix an issue with pulling the Airflow dag validation image while building.
1.17.2 (16-07-2024)
bugfixes
- [IDE]: Fixes an issue with IDE nodes not booting up properly
- [Airflow]: Incorporated backport of: airflow#40806
1.17.1 (16-07-2024)
features
- [Packages]: In this release we add a new feature for packages! Packages can be used to deploy code to every Airflow environment, this makes it easy to share common alerting code, ingestion code etc... For more info have a look at our how-to guides.
- [CLI]: Upgraded the Conveyor template to version 1.7.0, which includes a template for packages.
- [UI]: On the redesigned project page you can now filter pipelines on name or only show those with failures.
bugfixes
- [UI]: Fix an issue where the metrics of a running job would not be shown in the UI.
- [UI]: Fix the gitpod url link when not linking to a private gitpod installation
- [UI]: Fixed a small issue with the rendering of users pages for environment and project
1.17.0 (15-07-2024)
Release skipped.
1.16.14 (05-07-2024)
bugfixes
- [UI]: Fix task filtering on task executions page
- [UI]: Fixed an issue when opening task details page for a conveyor run task
- [General]: Fix an issue where resources were not cleaned up on the k8s cluster
1.16.13 (02-07-2024)
bugfixes
- [AWS]: Gave aws-node, the component responsible for networking, more memory. It would sometimes run out of memory when many nodes were launched at once.
1.16.12 (01-07-2024)
features
- [DBT]: Release image
public.ecr.aws/dataminded/dbt:v1.8.3
- [AWS]: Upgrade the following EKS components:
- EKS addon Amazon VPC CNI upgrade to v1.18.2-eksbuild.1
- EKS addon kube-proxy upgrade to v1.29.3-eksbuild.5
- EKS addon CoreDNS upgrade to v1.11.1-eksbuild.9
- [General]: Update templates to 1.6.2
bugfixes
- [UI]: Ensure the styling for project descriptions correctly changes based on the chosen theme.
- [UI]: Fix crash in page load when viewing logs in the new projects page.
1.16.11 (26-06-2024)
features
- [Airflow]: Improve the stability of Airflow web on Azure by assigning a dedicated nodepool for long-running workloads.
- [IDE]: Upgrade code-server to v4.90.3
bugfixes
- [Airflow]: Downgrade the version of urllib3 used by Airflow to resolve scheduling errors.
1.16.10 (25-06-2024)
features
- [UI]: Enable switching by try number on the application runs details page.
- [UI]: Allow filtering tasks by name on the new pipeline executions page.
- [IDE]: IDE builds automatically fail when taking more than an hour.
bugfixes
- [Airflow]: Downgrade the Kubernetes provider to the version used on Airflow 2.8. The new version causes issues with rescheduling sensors after a spot interrupt.
- [UI]: Log lines should no longer be able to overlap.
1.16.9 (19-06-2024)
bugfixes
- [General]: Rework cognito user validation such that it also works when an SSO username contains uppercase characters.
1.16.8 (18-06-2024)
features
- [Airflow]: Upgrade Airflow to 2.9.2
- [UI]: The new Projects page is now active by default. You can switch back to the old version at any time.
- [UI]: Make it easier to expand the logs on the new Projects page, clicking on the row will expand the logs.
- [UI]: Replace the logs button with a details button in the new Project actions list.
- [UI]: The create Conveyor IDE button now first asks you if you want to continue working on an old IDE on the new Projects page.
- [UI]: Improve the error page when a user does not have access to a project.
- [IDE]: Using the Conveyor CLI inside an IDE no longer requires you to log in; you are logged in automatically.
- [Docs]: Document the process for installing the AWS GuardDuty agent on a Conveyor EKS cluster.
bugfixes
- [General]: Fix a bug in the authorization flow that would lead to 503 errors in rare cases.
- [UI]: The description card for projects v2 will no longer overflow the box.
- [UI]: Fixed an issue in redirecting from Airflow to the projects v2 page.
- [UI]: The calculation of pipeline and task success rate is now correct when project names share the same prefix.
- [UI]: We show more information on the projects v2 page where the logs of the run would show an empty page when no logs are available.
- [UI]: Usernames are no longer duplicated in their selection menu
- [General]: Rework cognito user validation such that it also works when an SSO username contains uppercase characters.
1.16.7 (10-06-2024)
features
- [UI]: Show IDE build logs after the build has finished.
bugfixes
- [UI]: Fix an issue where the base image build logs were not shown in the UI for Azure.
- [UI]: Fix reset button in date filter.
- [CLI]: Fixed an issue where conveyor run would not package the DAGs. If you deployed the build created by conveyor run, the DAGs would be removed from the Airflow environment.
1.16.6 (29-05-2024)
features
- [UI]: ✨ Projects redesign released ✨ This redesign gives a more prominent place to the project page. You can switch between the old and new version.
- [General]: Improved the SSO group syncing performance
- [CLI]: Allow including inactive deployments when listing deployments via conveyor project deployments --include-inactive.
- [dbt]: We provide a new dbt image:
public.ecr.aws/dataminded/dbt:v1.7.15
bugfixes
- [CLI]: The command conveyor project add-alert-config no longer ignores the passed environment flag.
- [CLI]: The command conveyor notebook ide-connect/ide-disconnect works again.
1.16.5 (22-05-2024)
bugfixes
- [AWS]: The sandbox image for Kubernetes will no longer be garbage collected.
1.16.4 (21-05-2024)
features
- [Azure]: Upgrade AKS to 1.29.2
- [Azure]: Upgrade Cilium from 1.13.9 to 1.14.10
- [Azure]: Update the available resources for the Conveyor instance types.
- [AWS]: Upgrade EKS to 1.29
- [AWS]: Upgrade AWS EBS CSI driver to v1.30.0
- [AWS]: Upgrade EKS VPC CNI from v1.16.4-eksbuild.2 to version v1.18.1-eksbuild.1
- [AWS]: Upgrade Secrets Store CSI driver from v1.4.2 to v1.4.3
- [IDE]: Upgrade code-server to v4.89.0
bugfixes
- [UI]: Fix detection of missing log streams and return a proper error message.
- [UI]: Fix state propagation when starting an IDE.
- [IDE]: Make sure to use the latest image version for base image builds.
1.16.3 (02-05-2024)
features
- [dbt]: We provide a new dbt image:
public.ecr.aws/dataminded/dbt:v1.7.13
- [Templates]: Upgraded templates to 1.6.1.
bugfixes
- [IDE]: Fix an issue where IDE build for Azure failed when using a base image.
- [UI]: Fix an issue with the datepicker not showing the dates specified as a query parameter.
- [UI]: Improve handling of empty log pages when filtering.
1.16.2 (02-05-2024)
Release skipped.
1.16.1 (18-04-2024)
security
- [Airflow]: Upgrade gunicorn to fix CVE-2024-1135
bugfixes
- [CLI]: Remove leftover debug output
1.16.0 (16-04-2024)
features
- [IDE]: Allow editing user settings within an IDE.
- [Airflow]: Added the possibility to use the xcom value from a ConveyorContainerOperatorV2 in dynamic task mapping. More info can be found in the docs.
bugfixes
- [Airflow]: Correctly detect when an invalid container image is provided as argument of the Conveyor operators.
- [UI]: Disable downloading of streaming application logs for now until we are sure how to handle it.
- [UI]: Fix an issue where navigating the streaming application logs worked incorrectly.
preview
- [IDE]: This release introduces support for creating custom base images for your IDEs and sharing them across projects. This feature is currently in preview. If you want to try it out, you can start by taking a look at the introduction video or the how-to-guide for more details.
1.15.7 (09-04-2024)
features
- [IDE]: Made IDEs more robust to memory kills. The IDE service itself now has a memory limit set, so instead of crashing the whole container, only the IDE service itself will restart. This way your local changes to the IDE or your code do not get lost!
- [IDE]: Allow users to remove IDE settings through the UI and CLI
1.15.6 (09-04-2024)
Release skipped.
1.15.5 (03-04-2024)
features
- [General]: We have a new logo!
- [Airflow]: Upgrade Airflow to 2.8.4
- [AWS]: Upgrade Karpenter from v0.35.1 to version v0.35.4
bugfixes
- [UI]: Filtering the application logs on a term containing parentheses works again.
- [AWS]: Improve the handling of missing timestamps in the application logs.
- [General]: The allowed characters for project names have been made stricter. This prevents issues when creating the container repository for a project.
- [General]: Deleting the same project multiple times will no longer result in an error.
- [General]: Improve the detection of spot node interrupts in Airflow.
1.15.4 (28-03-2024)
bugfixes
- [General]: Fix an issue with cleaning up old builds
- [General]: Fix an issue with new auth0 users not having the tenant information attached
1.15.3 (26-03-2024)
features
- [Spark]: Introduce mode cluster-v2 in the ConveyorSparkSubmitOperatorV2. This mode speeds up the launching of your Spark job by up to 3 minutes. It is the new default mode when using conveyor run, but can also be set in the operator. After extensive testing and feedback this will become the default.
- [Airflow]: Allow xcom to be used with ConveyorContainerSensors in the same way as the ConveyorContainerOperatorV2
- [Spark]: Automatically prefix the application property of the ConveyorSparkSubmitOperatorV2 with local:// if it is not present. For more details look here
bugfixes
- [Airflow]: Revert acryl-datahub-airflow-plugin to version 0.12.1 as the latest version introduced a breaking change in the Ownership model.
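The automatic local:// prefixing of the application property mentioned in this release amounts to a small normalization step. The Python sketch below illustrates the idea only; the helper name is hypothetical, the example path is made up, and it assumes that "not present" means the value does not already start with local://.

```python
LOCAL_SCHEME = "local://"


def ensure_local_prefix(application: str) -> str:
    """Prefix the application path with local:// unless it is already there.

    Hypothetical helper illustrating the behaviour described in the note;
    it is not the operator's actual code.
    """
    if application.startswith(LOCAL_SCHEME):
        return application
    return LOCAL_SCHEME + application
```

Applying the helper twice gives the same result as applying it once, so values that users already prefixed by hand are left untouched.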
1.15.2 (19-03-2024)
features
- [Airflow]: Upgrade Airflow to 2.8.3
bugfixes
- [General]: Fixed a bug where canceling an application might result in the state getting stuck in canceling. This happens when a canceled application completes successfully in the 30s window where a canceled application can cleanly shut down
1.15.1 (13-03-2024)
features
- [AWS]: Upgrade EKS VPC CNI from v1.15.5-eksbuild.1 to version v1.16.4-eksbuild.2
bugfixes
- [UI]: Improve the error handling when updating the environment/project settings
- [UI]: Ensure the chart area of the executor metrics remains visible when many executors need to be shown
- [UI]: Fix regression in the text color of Airflow tasks when using dark mode
1.15.0 (05-03-2024)
features
- [Airflow]: Upgrade Airflow to 2.8.2
- [Airflow]: Upgrade acryl-datahub-airflow-plugin to 0.12.1.5
- [Spark]: Remove support for spark_main_version variable, this was only needed for spark 2.x support, but that has been deprecated for a long time
- [Spark]: We've released new images for Spark 3.5.1, this also updates the delta lake (3.1.0) and iceberg (1.4.3) libraries:
public.ecr.aws/dataminded/spark-k8s-glue:v3.5.1-hadoop-3.3.6-v1
public.ecr.aws/dataminded/spark-k8s-glue:v3.5.1-2.13-hadoop-3.3.6-v1
- [dbt]: Released new dbt image:
public.ecr.aws/dataminded/dbt:v1.7.8
- [IDE]: Upgrade code-server to v4.21.1
- [Templates]: Upgraded templates to 1.6.0.
bugfixes
- [UI]: The state of the Airflow UI is now fully synchronized to the page URL, allowing for easier link sharing and navigation.
- [UI]: Fixed an issue with IDE build permission in the UI
- [UI]: Fixed a bug where undeploying a project would not update the deployment list
- [UI]: Make sure the error message is correctly shown in the UI when something goes wrong.
- [General]: Fixed the processing of errors of jobs to clearly label disk pressure issues
1.14.8 (22-02-2024) - Hotfix
bugfixes
- [Azure]: Hotfix for node creation on Azure
1.14.7 (21-02-2024)
bugfixes
- [Docs]: Remove trailing slashes from URLs so that links keep on working.
- [CLI]: The previous release broke interactive task selection in the conveyor run command. This now works again.
- [General]: Improve handling of cancelled tasks
1.14.6 (14-02-2024)
features
- [Airflow]: Add a new macro to fetch the project that a task belongs to
- [dbt]: Release a new dbt image, based on dbt 1.7.7
- [IDE]: Upgrade code-server to v4.20.1
- [UI]: Log filtering is now case-insensitive
bugfixes
- [IDE]: The spark-history executors page will now be properly rendered when run inside an IDE.
- [UI]: Fix an issue where the Spark Submitter logs failed to show.
- [UI]: Fix a visual bug when deleting two projects after each other.
1.14.5 (31-01-2024) - Hotfix
features
- [Docs]: The implementation of the search functionality has been changed to offer better results.
bugfixes
- [General]: Fix our integration with Auth0
1.14.4 (30-01-2024)
features
- [Spark]: Our ConveyorSparkSubmitOperatorV2 now supports setting the --verbose flag on the spark-submit command
- [Airflow]: This change has also been incorporated in the new version of the types-conveyor package
bugfixes
- [AWS]: We now correctly handle an additional error case during secret mounting.
- [AWS]: Switch to using the regional STS endpoints for the cloudwatch agent.
1.14.3 (23-01-2024)
features
- [UI]: Show the logs of an IDE build
- [UI]: Show the SSO groups of users in the UI
- [CLI]: Allow changing the spark image used by conveyor project spark-history
- [Airflow]: Update python types-conveyor package to version 0.0.6, take a look here
- [Terraform]: Released a new version of the terraform provider
bugfixes
- [UI]: Make sure IDP Group mapping syncing happens on every logout/login cycle
- [CLI]: Make it possible to open the spark history in a Conveyor IDE
- [IDE]: sudo is no longer required in order to update the Conveyor CLI within an IDE
1.14.2 (17-01-2024)
bugfixes
- [General]: Fixed an issue when tailing logs of pods. When using conveyor run or conveyor ide cache, sometimes errors could occur when following the logs. We now wait until the node is fully ready before trying to show the logs.
- [IDE]: Fixed the "failed writing project config" error when executing conveyor run inside IDEs.
- [IDE]: Properly set the project folder used by git clone during IDE startup
- [IDE]: The .bashrc config file should now remain constant between restarts
1.14.1 (15-01-2024)
features
- [CLI]: You can now change where the conveyor CLI stores settings and tokens by setting the CONVEYOR_HOME environment variable
- [IDE]: Upgrade code-server to v4.20.0
- [IDE]: Upgrade sysbox to version 0.6.3
- [IDE]: Added the possibility to use a Snowflake setup with SSO inside a Conveyor IDE. Look at the how-to-guide for more details
- [AWS]: Upgrade eks to 1.28
- [Azure]: Upgrade aks to 1.28
bugfixes
- [CLI]: Using conveyor run for a job that failed with secrets access would not give any helpful error message. The error message is now properly propagated.
- [Airflow]: Fixed an issue where the scheduler could stop scheduling properly while sending out alerts for a failed DAG run
1.14.0 (10-01-2024)
features
- [General]: Allows teams to be mapped to groups from the SSO provider
- [Airflow]: Support the automatic configuration of connections/variables for your Airflow environments. The content is loaded from a secret store (e.g. AWS secrets manager or Azure Key vault). For more information look here
- [AWS]: Added support for C instances, for information on cpu and memory see Instances
- [UI]: Added support for undeploying a project from an environment in the UI
- [UI]: Added multiline parsing for python and java stack traces. This makes it easier to see your stack trace when using search functionality
- [IDE]: Sped up shutdown of the IDE when suspending, this means the snapshot process can start earlier and suspending will be faster
- [General]: Switch links in our emails from the legacy https://datafy.cloud to the https://conveyordata.com domain.
- [CLI]: Improve conveyor auth configure documentation
- [AWS]: Upgrade coredns to version v1.10.1-eksbuild.6
- [AWS]: Upgrade karpenter from v0.32.4 to version v0.33.1
- [AWS]: Upgrade eks vpc cni from v1.15.3-eksbuild.1 to version v1.15.5-eksbuild.1
bugfixes
- [Airflow]: Bring back the old behavior to support configuration options when triggering a DAG in the UI. This was removed in Airflow 2.7.0; new advice is described here
- [UI]: Fix an issue where adding users to environment/projects used the wrong user in some cases
- [Notebook]: Do not allow the creation of notebooks that have a too long name (>63 characters)
- [UI]: Fixed an issue with wrap long lines in the logs
- [General]: Fixed a rights issue when downloading logs as a non-admin
- [IDE]: When storing too much data (apt packages, pip install, or data) inside your IDE it could happen snapshotting would fail because we ran out of storage. We made snapshotting more storage efficient and faster, and now mount a separate volume. This way we can never run out of storage.
1.13.1 (20-12-2023)
features
- [UI]: Allow admins to disable the Notebooks and/or IDE features
- [CLI]: Switch to using the regional STS endpoint by default
- [IDE]: The AWS CLI now comes pre-installed in the IDEs
- [AWS]: Karpenter has been upgraded to 0.32.4
bugfixes
- [IDE]: Fix a small issue where certain images displayed in the readme were not properly loaded
- [General]: Fixed the caching of terraform provider on the agent, this results in faster deploys of resources
1.13.0 (11-12-2023)
features
- [UI]: Added a project create button
- [UI]: Allow you to make the git repo blank in the UI
- [UI]: Added the IDE button for Conveyor on the project page even if the git repo is not specified
- [Airflow]: Added a proxy to serve the static Airflow files, this will result in faster load times of web pages. All static files are served almost twice as fast
- [Airflow]: Add support for automatically mounting Azure keyvault secrets in containers and spark jobs. For more details look here
- [AWS]: Upgrade Karpenter from v0.31.1 to v0.32.2
- [AWS]: Upgrade our conveyor-install-role-and-policies.template to version 1.0.1, we removed some rights for Karpenter that are now unnecessary in version v0.32.2
- [Airflow]: Add support for multiple task_ids in the ConveyorExternalTaskSensor, for more details look here
- [IDE]: Upgrade code-server to version v4.19.1
- [Spark]: Released new images for Spark 3.3.3:
public.ecr.aws/dataminded/spark-k8s-glue:v3.3.3-hadoop-3.3.5-v1
public.ecr.aws/dataminded/spark-k8s-glue:v3.3.3-2.13-hadoop-3.3.5-v1
- [DBT]: Released new dbt images:
public.ecr.aws/dataminded/dbt:v1.6.9
public.ecr.aws/dataminded/dbt:v1.7.3
- [General]: When using on-demand nodes prefer recent instance generation over older generations
- [General]: Added support for m7i, m7a, r7i and r7a instances
bugfixes
- [IDE]: Bash autocompletion was removed from the IDE by an upgrade and has been added again. Bash autocompletion now works for Conveyor in the IDE
1.12.0 (28-11-2023)
features
- [Airflow]: Upgrade Airflow to 2.7.3
- [UI]: The homepage has changed from Environments to Projects
- [UI]: The UI is now available in Dutch as well as English
- [UI]: You can now control the Gitpod URL in the integrations, this is useful when using your own installed Gitpod environment
- [AWS]: Upgrade the VPC CNI addon from v1.13.4-eksbuild.1 to v1.14.1-eksbuild.1
- [AWS]: Upgrade Secrets Store CSI driver from v1.3.4 to v1.4.0
- [CLI]: Improve the debug output for build commands
- [Docs]: The table of container images is now split in multiple tables for better ease of use
- [Terraform]: Release Conveyor terraform provider 0.3.0
bugfixes
- [General]: Restrict the allowed environment names
- [Azure]: Downloading logs now also works on Azure
- [Azure]: Fix a regression with IDEs on Azure where the Azure SDK cannot be installed
- [Airflow]: Make sure conveyor run works for DAGs in a local timezone and with a cron schedule multiple times a day
- [Airflow]: Set the correct message when retrieving the logs for an Airflow task that did not run
- [CLI]: Fixed an issue when using conveyor run to run an image of a project your project depends on
1.11.23 (20-11-2023)
features
- [CLI]: Add the image build to the json output of conveyor build
- [UI]: Add the ability to download the logs of a task
- [AWS]: Update multiple components on the EKS cluster:
- Upgrade EBS CSI driver from v1.19.0 to v1.25.0
- Upgrade kube proxy addon from v1.27.4-eksbuild.2 to v1.27.6-eksbuild.2
- Upgrade vpc cni addon from v1.12.6-eksbuild.2 to v1.13.4-eksbuild.1
- Upgrade aws-observability/aws-for-fluent-bit from 2.23.3 to 2.32.0
- Upgrade cloudwatch agent from 1.247352.0b251908 to 1.300031.1b317
- Upgrade k8s dns node cache from 1.22.16 to 1.22.27
- [Azure]: Update multiple components on the AKS cluster:
- Upgrade Cilium from 1.13.6 to 1.13.9
- Upgrade fluent-bit from 1.8.15 to 1.9.10
- Upgrade k8s dns node cache from 1.22.16 to 1.22.27
1.11.22 (13-11-2023)
features
- [Spark]: We've released new images for Spark 3.5.0, containing support for Delta and Iceberg table formats.
- public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-hadoop-3.3.6-v2
- public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-2.13-hadoop-3.3.6-v2
- [General]: Upgrade pgbouncer to version 1.21.0
bugfixes
- [UI]: Make sure the retention parameter for cloudwatch logs is taken into account when querying application runs.
1.11.21 (07-11-2023)
bugfixes
- [AWS]: Increase the memory available for the secrets backend to resolve a second issue in secret mounting.
- [AWS]: Reduce the size of the inline policy used for building projects so as not to hit the PackedPolicyTooLarge error when using long project names.
1.11.20 (30-10-2023)
bugfixes
- [IDE]: Fixed a race condition that can happen when suspending an IDE. Snapshotting of the IDE was started twice, which left the second snapshot empty and caused a failure when starting up the IDE again.
- [IDE]: Node startup for IDEs is now more robust after fixing two small issues.
- [AWS]: Resolved an issue causing secret mounting to fail under high load.
- [CLI]: Running conveyor build could remove the default-iam-role and other fields by accident from the project. Please make sure to update your CLI.
1.11.19 (24-10-2023)
features
- [CLI]: Change the usage of the conveyor build --build-arg argument to be in line with how it is used in Docker.
- [CLI]: Improve conveyor project validate-dags to also detect duplicated DAG ids.
- [IDE]: Upgrade code-server version to v4.17.1
- [Airflow]: Migrate to the securecookie webserver backend. The database implementation has some advantages, but these are currently covered by the Conveyor authorization checks, and the securecookie implementation puts less pressure on the database. This change also speeds up the Airflow webserver by quite a lot.
bugfixes
- [CLI]: The stdout stream of conveyor build --output json is no longer polluted by other messages.
- [Airflow]: Airflow proxying is working again when using dark mode.
- [UI]: The support chat button will now hide itself when scrolling down, preventing overlap with other components.
1.11.18 (16-10-2023)
features
- [UI]: Support viewing and updating the environment settings.
- [UI]: Allow updating the default IDE configuration in the project settings.
- [Airflow]: We released a new version of the types-conveyor package, which adds a couple of missing function parameters.
- [AWS]: Karpenter has been upgraded to 0.31.1.
bugfixes
- [UI]: Fix an issue where clearing the defaultIamIdentity does not work on the project settings page.
- [General]: Fix an issue where we could get throttled by auth0 when checking user permissions.
- [IDE]: Fix an issue where 2XLarge IDE nodes would not start up on Azure.
1.11.17 (09-10-2023)
We introduced a breaking change in the conveyor deploy command when using the git hash flag.
If you are using conveyor deploy --env <some-env> --git-hash <some-git-hash>, we will now fail immediately if the git hash is dirty.
The reason for this is that a dirty git hash may link to multiple conveyor builds, which can result in you deploying unwanted code.
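A pre-deploy check along the following lines can catch a dirty working tree before conveyor deploy is called with the git hash flag. This is a minimal Python sketch and not Conveyor's own check. It assumes that "dirty" means uncommitted changes to tracked files as reported by git status --porcelain, and the helper names is_dirty and working_tree_is_dirty are hypothetical.

```python
import subprocess


def is_dirty(porcelain_output: str) -> bool:
    """Return True if `git status --porcelain` output lists changes to tracked files.

    Assumption: untracked files (lines starting with "??") do not count as dirty.
    """
    for line in porcelain_output.splitlines():
        if line.strip() and not line.startswith("??"):
            return True
    return False


def working_tree_is_dirty(repo_dir: str = ".") -> bool:
    """Run git in repo_dir and report whether tracked files have uncommitted changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return is_dirty(result.stdout)
```

In a CI script, working_tree_is_dirty() could be called first and the deploy step skipped when it returns True.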
features
- [Spark]: Released initial Spark 3.5.0 images.
Iceberg and Delta Lake unfortunately do not support Spark 3.5 yet,
so these dependencies are not included for now.
- public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-hadoop-3.3.6-v1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-2.13-hadoop-3.3.6-v1
- [Airflow]: You can now set the navbar color of Airflow on a Conveyor environment
- [CLI]: Add support for deleting tags in the CLI
bugfixes
- [Airflow]: Fix an issue when using Airflow in dark mode where some panels would fail to load.
- [Airflow]: Cosmetic improvements when using dark mode.
- [UI]: Fixed an issue where users could not be unlinked from projects/environments after being removed from AD.
- [UI]: Fix an issue where deleting an environment resulted in weird behavior in the UI.
- [UI]: Fix sorting of teams in the settings page.
- [CLI]: Prevent deploying a project when using the git hash flag and passing a dirty hash. This fix contains a breaking change, see warning.
1.11.16 (26-09-2023)
bugfixes
- [Airflow]: Downgrade the Airflow dependencies that were upgraded in 1.11.15, as the upgrade resulted in increased memory pressure for Airflow web in certain environments
- [Airflow]: Fix an issue in the links between Conveyor and Airflow where they always landed on the Airflow home page
1.11.15 (26-09-2023)
features
- [Docs]: The table describing our Docker images now includes a column for the OpenJDK version
- [UI]: Move Instance type in the Create IDE Dialog from advanced to the main form. This removes the advanced form
bugfixes
- [IDE]: Fix an issue where writing to the /tmp directory fails for IDE builds
- [CLI]: DAG validation now works when using podman on Apple Silicon
- [UI]: Filtering deployments by "deployed by" works correctly again
1.11.14 (18-09-2023)
features
- [IDE]: Allow users to configure user-specific settings for their IDEs
bugfixes
- [Airflow]: Fix issue causing unnecessary scheduler restarts
1.11.13 (12-09-2023)
features
- [UI]: The application runs details page now shows the container image that was used
- [IDE]: Reuse the manifest of previous IDE builds to speed up subsequent builds
- [IDE]: Ensure that all environment variables are also visible in a Jupyter notebook environment
- [Notebooks]: Upgrade the Spark installation used by notebooks from Spark 3.3 to Spark 3.4
- [AWS]: Improve the mounting options of EFS, this will result in less downtime when AWS upgrades EFS in October
- [Airflow]: Upgrade Python from 3.9 to 3.11
bugfixes
- [UI]: The tag creation dialog works properly again
- [Azure]: Spark event logs are now correctly expired on Azure
- [Airflow]: The recycling rate of Airflow web workers has been increased to reduce the memory pressure. This change should result in fewer restarts
1.11.12 (12-09-2023)
Release skipped.
1.11.11 (04-09-2023)
features
- [Azure]: Upgrade AKS to 1.27.3
- [AWS]: Upgrade EKS to 1.27
- [UI]: Pinning of environments now also has a dedicated action button.
- [Airflow]: Airflow scheduler and web now exactly match an mx.small conveyor instance, this means that a bit more resources are available
bugfixes
- [General]: Fixed an issue with the permissions in our minimal installation role. This prevented proper deletion of an IDE when launched in a different account than the default Conveyor cluster.
- [IDE]: Fix issue where Azure tokens could not be used in IDEs.
- [Airflow]: Backport Airflow PR #33063 to fix the URLs generated by TaskInstances.
- [General]: Improve default error handling of secret mount issues for cases that we do not explicitly handle.
- [Spark]: Fix an issue where the eventlog file was not uploaded in certain cases.
- [IDE]: Improve the robustness of the IDE snapshotting process
1.11.10 (28-08-2023)
features
- [Airflow]: Upgrade to Airflow 2.6.3
- [Terraform]: Release terraform provider 0.2.0
- [General]: Improve performance of checking notebook access
- [General]: Improve performance of checking Airflow access
- [IDE]: Allow creating an IDE from the project list
bugfixes
- [CLI]: The spark-history command will now wait for the server to finish loading before opening the browser window.
- [IDE]: Handle failures when suspending an IDE in a more robust way.
- [General]: Fix a race condition that could return a ProjectNotFound error for a valid DeleteProject request.
- [Airflow]: The Airflow UI will now automatically refresh in case of temporary unavailability.
1.11.9 (21-08-2023)
features
- [IDE]: Improved the performance of proxying the IDE for the user
- [DBT]: The ConveyorDbtTaskFactory will now filter out ephemeral models and not add them as tasks
- [DBT]: Released our 1.6.0 dbt image: public.ecr.aws/dataminded/dbt:v1.6.0, you can find the full list of supported software in our docs
- [Airflow]: Added support for email alerting on failures of dags, you can read more here
- [Airflow]: Upgrade apache-airflow-providers-slack to 7.2.3
- [CLI]: Make Docker BuildKit the default when using a Docker version of 23.0.0 or higher, just like Docker does
- [Azure]: Upgrade AKS to 1.26.6
- [UI]: Pinning projects in the overview page now has a dedicated action button
- [General]: Upgrade template to 1.4.0
bugfixes
- [AWS]: Fixed an issue where searching on logs could fail if an input like | was used.
- [Spark]: AWS_REGION and AWS_DEFAULT_REGION should now be properly set on Spark jobs.
- [Airflow]: Fix an issue where Airflow crashed when displaying ConveyorExternalTaskSensor links
- [UI]: Legend in the metrics page was partially cut off
1.11.8 (08-08-2023)
bugfixes
- [AWS]: Fixed an issue when building the AMI for IDEs
1.11.7 (07-08-2023)
Release skipped.
features
- [CLI]: Upgrade template to 1.3.2
bugfixes
- [UI]: Fix an issue where the build ID was not correctly passed when creating a new IDE.
- [UI]: Fixed an issue where the create IDE/Notebook button is in another location for an admin vs a non-admin
- [IDE]: Fix an issue with IDE builds on Azure
- [IDE]: Fix an issue with deleting IDEs on Azure
1.11.6 (01-08-2023)
features
- [AWS]: Only allow a single metadata hop on aws ec2 instances, the limit used to be 2 for when kube2iam was still used. After its removal, the hop count can be reduced to 1, improving the security of the setup.
- [Streaming]: Streaming applications now support default roles as well.
- [CLI]: Selecting an IDE to resume or suspend now uses the name instead of the ID of the IDE in the selector, thus making it easier to select the right one
- [Spark]: Release Spark 3.4.1 images, you might need to make some changes to your scala spark jobs, see docs here
- public.ecr.aws/dataminded/spark-k8s-glue:v3.4.1-hadoop-3.3.6-v1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.4.1-hadoop-3.3.6-v1
- [IDE]: Automatically clone the configured git repo for a Project when opening an IDE for the project
- [IDE]: Changed the username from coder to conveyor in the IDE
- [UI]: Support creating IDEs from the UI
1.11.5 (24-07-2023)
features
- [IDE]: Added the ability to set a default IDE config for a project through the CLI conveyor update project --default-ide-config command. This will later be used when people start their IDE from the UI, for now it is used as a fallback when no ide.yaml is included in the project.
- [IDE]: Prefer 6th generation instances. In our testing, 6th generation instances can be twice as fast as first generation instances when running tests in a pyspark project
bugfixes
- [IDE]: Fix using a cloud identity when using IDEs
1.11.4 (19-07-2023)
features
- [IDE]: Installed some extra packages to make the first experience smoother: build-essentials, python3.10-venv, python3-pip, python3.10-dev
bugfixes
- [IDE]: Make sure packages that install man pages in /usr/share/man/man1 can do so. For example, openjdk-11-jre could not be installed
- [IDE]: Resuming after auto-suspend sometimes suspended the IDE automatically again. This is now fixed
1.11.3 (18-07-2023)
features
- [IDE]: Added suspend and resume functionality for IDEs. IDEs will also automatically suspend after 60 minutes of inactivity to save costs
- [IDE]: IDEs are upgraded from Ubuntu 20.04 to Ubuntu 22.04
- [CLI]: Update conveyor templates to 1.3.1
bugfixes
- [IDE]: Fixed an issue with pulling the base image when building a custom IDE
1.11.2 (10-07-2023)
features
- [Terraform]: We've released v0.1.5 of our Terraform provider.
- [Airflow]: The Conveyor Airflow operators now support using default IAM identities configured on the project level. You can find more information on the default IAM identity in the project documentation.
- [UI]: It is now possible to promote deployments to a different environment using the web interface.
- [IDE]: AKS nodes will be pre-warmed to reduce the startup time of IDEs on Azure.
bugfixes
- [RBAC]: We now assign the correct permissions to 'Contributor' users for them to view logs and metrics of streaming applications.
- [IDE]: The git routine used by IDEs has been made more robust so that the generated git clone command is always valid.
1.11.1 (10-07-2023)
Release skipped.
1.11.0 (05-07-2023)
features
- [IDE]: This release contains a preview version of IDE support on Conveyor. To get started, check out our two how-to-guides:
- [AWS]: Karpenter has been upgraded to version 0.28.1
bugfixes
- [Notebooks]: Make sure that starting a notebook from the CLI works when messages are processed with a delay.
- [Notebooks]: Fix an issue during the creation of a notebook which was not persisted in the notebooks.yaml.
- [CLI]: Airflow images will now be pulled using the correct credentials, avoiding rate-limiting errors.
1.10.18 (12-06-2023)
bugfixes
- [General]: Make sure there can be no overlapping names on Kubernetes for the spark operator. There was a very small chance we used overlapping names for certain resources; these are now guaranteed to be unique.
- [Airflow]: Fixed a bug in the ConveyorDbtTaskFactory where a model could have no dependency on start when a previous model was filtered out because of tag filtering.
1.10.17 (05-06-2023)
features
- [AWS]: Upgrade EKS to 1.26
- [AKS]: Upgrade AKS to 1.26.3
- [General]: Improve logging of internal components to reduce the costs on Azure
1.10.16 (31-05-2023)
bugfixes
- [Azure]: Make sure conveyor build and conveyor run work again on Azure
1.10.15 (31-05-2023)
bugfixes
- [CLI]: Make sure conveyor update downloads the latest CLI instead of 1.10.2
- [General]: Fix referencing the same secret from aws ssm parameter store multiple times
1.10.14 (31-05-2023)
bugfixes
- [Notebooks]: Fix an issue with websockets used in notebooks
1.10.13 (30-05-2023)
features
- [General]: We've updated the machine images that are used on the cluster, resulting in even faster start-up times for your jobs.
bugfixes
- [Airflow]: Revert the changes made in 1.10.4 regarding the DataHub integration. The 'cluster' parameter will now default to "prod" again.
- [General]: Make sure you can use the same secret/ssm parameter multiple times in environment variables.
- [Spark]: Make sure to set the aws-region for Glue automatically such that we do not rely on ec2 instance metadata.
- [CLI]: Make sure we take quiet flag into account during conveyor build
- [CLI]: On a default installation of docker desktop the daemon socket is not at unix:///var/run/docker.sock but at another location in the home directory. We now check that one first
- [CLI]: Improved the conveyor run connection, sometimes it timed out and we restarted it. We added keep alive messages on the connection to make sure it isn't closed
- [General]: Make cleaning up old builds more robust, this should result in fewer old builds staying around
- [Airflow]: Fix an issue with Airflow scheduler liveness probe.
1.10.12 (09-05-2023)
experimental
- [CLI]: We offer a new way to run dbt commands on Conveyor, directly from your command line. As this is an experimental feature, all feedback is welcome! Please refer to the documentation page for more information on the feature.
bugfixes
- [CLI]: Change location for temporary folders in conveyor project commands. In CI, we use local working directory and otherwise the default temp folder for your OS.
- [Spark]: Our latest Spark images contain a fix for reading tables that are partitioned by date from AWS Glue. For more info, have a look here.
- [Airflow]: Fix an issue where Airflow workers are not scheduled on on-demand nodes for Spark jobs.
- [Airflow]: Backport PR airflow#31128 to Conveyor Airflow.
1.10.11 (02-05-2023)
features
- [General]: Upgraded Azure AKS to 1.25, this forces the use of cgroupsv2 on Azure. Certain JVM versions do not detect memory correctly, for Spark jobs this is no issue since we set the memory correctly. However, Java jobs using the ConveyorContainerOperatorV2 need to be upgraded to use JDK 11 (patch 11.0.16 and later) or JDK 15 and above.
- [General]: Upgrade AWS EKS to 1.25
- [Airflow]: Upgrade Airflow to 2.5.3
- [General]: Release DBT image 1.5.0
- [CLI]: Upgrade to use conveyor templates 1.3.0
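The JDK requirement for cgroupsv2 mentioned in the AKS 1.25 note above can be expressed as a small version check. This is a hedged Python sketch that only encodes the rule as stated in that note (JDK 11 at patch 11.0.16 or later, or JDK 15 and above). The function names are hypothetical, and it assumes plain dotted numeric version strings such as "11.0.16" or "17".

```python
def parse_jdk_version(version: str) -> tuple:
    """Parse a dotted version such as "11.0.16" or "17" into (major, minor, patch)."""
    parts = [int(p) for p in version.split(".")[:3]]
    while len(parts) < 3:
        parts.append(0)  # pad missing components, so "17" becomes (17, 0, 0)
    return tuple(parts)


def meets_cgroups_v2_requirement(version: str) -> bool:
    """Apply the rule from the release note: 11.0.16+ within JDK 11, or JDK 15+."""
    major, minor, patch = parse_jdk_version(version)
    if major >= 15:
        return True
    if major == 11:
        return (minor, patch) >= (0, 16)
    return False
```

For example, meets_cgroups_v2_requirement("11.0.15") is False while meets_cgroups_v2_requirement("11.0.16") is True.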
bugfixes
- [General]: Fix an issue where a notebook can get stuck in pending due to a bug in the EBS CSI driver.
- [UI]: Improve the performance when fetching application runs.
- [UI]: Make sure that the default IAM role can be rendered for a project contributor.
- [UI]: The behavior of the notebook creation dialog was adapted to match your expectations.
- [UI]: Toast messages which report errors are now rendered correctly.
1.10.10 (25-04-2023)
features
- [Azure]: Upgrade AKS to 1.24.9
- [UI]: We've made a minor visual update to our breadcrumbs.
- [Terraform]: release terraform provider version 0.1.4
bugfixes
- [CLI]: Fix the root cause for the file already closed message while building a notebook
1.10.9 (24-04-2023)
bugfixes
- [Airflow]: Added a friendly error message when passing invalid values into the cmds argument of the ConveyorContainerOperatorV2.
- [General]: Fixed a race condition that popped up when running lots of tasks that use secrets from AWS Parameter Store or AWS Secrets Manager.
1.10.8 (19-04-2023)
bugfixes
- [Airflow]: Fix an issue with the external task sensor
- [Terraform]: Add examples for tags in the terraform provider
1.10.7 (19-04-2023)
features
- [CLI]: Add support for team commands on an environment.
preview
- [Airflow]: We've released type hints that you can use when developing your Airflow DAGs on Conveyor. You can refer to our documentation page for the installation instructions and more information.
bugfixes
- [CLI]: We now make sure that container image pulls and runs triggered by Conveyor use the correct platform configuration.
1.10.6 (12-04-2023)
bugfixes
- [UI]: We now show an informative message if a page is not found instead of a blank screen.
- [CLI]: Fix an issue where conveyor build failed when using Podman.
- [General]: Change cloudwatch metrics configuration to not use ec2 instance metadata.
1.10.5 (04-04-2023)
features
- [Spark]: We've built a new Spark 3.3.2 image that only packages 1 version of netty. For more details you can refer to our technical-reference.
- [Spark]: Additionally, the new Spark 3.3.2 image comes with Hadoop 3.3.5, which contains several CVE fixes as well as an important fix for using Azure blob storage. More details can be found in our technical-reference as well. Azure users that use Hadoop 3.3.2 or 3.3.4 are recommended to switch to Hadoop 3.3.5.
- [Spark]: When using Spark on Azure with Hadoop 3.3.5, you can now use the ManifestCommitter as an alternative to the FileCommitter. More information on this committer can be found here.
- [CLI]: Update conveyor deploy to show a message on how to rollback to the previous deployment
- [CLI]: Add functionality to conveyor deploy for deploying a project based on a git hash as an alternative to the build id
bugfixes
- [General]: Make sure AWS_REGION is also defined when users set AWS_DEFAULT_REGION, as the region cannot be retrieved from EC2 instance metadata when kube2iam is disabled.
- [CLI]: Fix an issue where the validate-dags functionality does not work on GitLab CI anymore.
- [UI]: Some visual edge-cases in the tag creation flow have been taken care of.
1.10.4 (29-03-2023)
features
- [CLI]: Support dependent projects in the conveyor build/run/validate-dags commands. More details can be found here
- [Notebooks]: It is now possible to define a default IAM identity for your projects. This default role can be automatically used when creating a new notebook. More information on how to configure the default identity can be found in the CLI docs.
- [Docs]: Time for spring-cleaning! We made some modifications to our documentation to make sure everything is clear and tidy.
- [Airflow]: Add support for XComs in the ConveyorContainerOperatorV2.
bugfixes
- [Airflow]: When enabling the DataHub integration, the 'cluster' parameter is now set to the name of your environment, instead of defaulting to "prod".
- [Notebooks]: The integrated notebooks view again scales properly to your page height.
- [Notebooks]: Fixed an issue where we asked to download files when deleting notebooks in ide mode.
1.10.3 (22-03-2023)
features
- [Airflow]: Upgraded Airflow to 2.5.2
- [UI]: Allow filtering on sensor applications in the application runs pages for environment and project
bugfixes
- [CLI]: Improve cancellation handling in the conveyor run command
- [Airflow]: Fix the Conveyor application runs button for sensors. This button used to filter out the application runs incorrectly and showed nothing as a result
1.10.2 (15-03-2023)
features
- [Docs]: Added documentation on how to improve dbt startup latency, you can find the documentation here.
- [Airflow]: Add support for grouping tasks into taskGroups with the ConveyorDbtTaskFactory, more details here
- [Terraform]: Released the conveyor terraform provider 0.1.0, with extra data sources concerning users of projects, environments and teams
bugfixes
- [General]: Added extra validation on environment variable keys. Previously the job would hang when a key was invalid; now we fail it
- [Airflow]: In the generation of Kubernetes object from the Conveyor operator, we sometimes produced invalid names. This is fixed now
1.10.1 (08-03-2023)
features
- [Notebooks]: The number of driver cores will be automatically configured depending on the instance size
- [CLI]: The behaviour of the ide-connect CLI command is now consistent with the other notebook commands
- [UI]: You can now create and attach tags to your projects for easier filtering
- [UI]: The notebook creation dialog now hides optional configuration options by default
- [General]: On AWS we optimised the way we are running EC2 machines for all jobs. We now collocate the Airflow scheduler and web pods. This should result in less Airflow downtime and a better optimised cluster. For jobs smaller than xlarge, we now keep the VM alive for up to 5 minutes to accept new jobs. This should result in less VM churn, which results in lower EC2 startup overhead, and lower AWS Config costs if that is enabled.
- [Templates]: Upgraded templates to 1.2.1.
bugfixes
- [RBAC]: Fix an issue where the GetDefaultCluster call fails for non admin users
- [General]: Improve the error handling when the IAM role cannot be assumed by your container/spark application
- [UI]: Fix an issue where the collapsed sidebar icons were not clickable
- [UI]: Fix an issue where the login can get stuck
1.10.0 (27-02-2023)
features
- [Spark]: Create new spark images for Spark 3.3.2, more details can be found here
- [General]: AWS: Improved node startup latency. By upgrading the EKS CNI to 1.12.2 and configuring it correctly, the average startup latency of a node went down from 60s to 50s.
- [Airflow]: Added support for configuring certain parameters of Airflow.
At the moment we support 2 parameters in the core section: parallelism and max_active_tasks_per_dag. You can find more information on when to configure these here.
1.9.0 (21-02-2023)
features
- [General]: Added support for sending Airflow lineage to DataHub, read our how-to to get started!
- [Azure]: Update the VMs used for user nodepools to the Dsv5-series instead of the Dsv3-series
- [Terraform]: Release the Conveyor terraform provider 0.0.8 with support for configuring the Airflow DataHub integration
bugfixes
- [CLI]: Fix an issue where creating a project from a template uses the wrong project name in some cases
- [Notebooks]: Fix an issue where ide notebooks did not get deleted after they were idle for too long
1.8.9 (16-02-2023)
features
- [Docs]: Small improvements to the RBAC documentation
bugfixes
- [General]: Fix prepush of images on Azure when using images from public ecr
- [Airflow]: Fix an issue where Airflow web restart fails due to pid file
1.8.8 (15-02-2023)
Skipped due to database performance issues.
1.8.7 (09-02-2023)
bugfixes
- [Airflow]: Fixed an issue with scheduling on-demand jobs on Conveyor. After this we will reset all jobs currently stuck in scheduling
1.8.6 (09-02-2023)
features
- [Notebooks]: Add support for updating files of a notebook running in IDE mode
- [Airflow]: Upgrade the database used to use GP3 storage, which should give the same or better performance for a lower price
- [Airflow]: Support attaching external storage to container pods. This is useful for container jobs that need a lot of temporary storage.
bugfixes
- [Airflow]: Fixed an issue that made retries stop working when Spark submit fails
1.8.5 (06-02-2023)
features
- [UI]: Display a clear error message when there are no logs because the retention period is expired
bugfixes
- [Notebooks]: Fix an issue where deleting a notebook by name deleted the wrong notebook
- [General]: Fixed an issue in the onboarding flow, where the role could not be properly registered
1.8.4 (02-02-2023)
bugfixes
- [UI]: Fixed a bug where the memory metrics were not showing correctly for the spark driver or container application
- [Airflow]: Fixed an issue that made retries stop working with the Conveyor Airflow Operators
1.8.3 (01-02-2023)
features
- [Airflow]: In airflow we now show you your last 100 log lines of your failed application. We also add a link to the logs that takes you straight to the logs in Conveyor. You can copy and paste into your web browser.
- [Airflow]: Update datahub integration to 0.9.6
- [Airflow]: Added the ability to override the start and end task in the ConveyorDbtTaskFactory
- [Templates]: Upgraded templates to 1.2.0.
bugfixes
- [CLI]: Rework the conveyor completion command to work without internet connectivity such that you can source it in a terminal.
- [Airflow]: Fixed a bug where the Conveyor Application Runs button would not work when a user created a manual run with a self supplied run_id.
- [UI]: Show correct spark executor instance lifecycle and type in details table.
- [Airflow]: Fix an issue where the application run button does not generate the correct link for new projects
- [UI]: The spark executor metrics legend sometimes did not match the actual executor of which the metrics are shown. We now make sure the legend matches the correct metric
- [UI]: Fix broken redirect when pressing the logs button on the task executions page
1.8.2 (19-01-2023)
bugfixes
- [UI]: Fix an issue with the sorting of executors
- [Spark]: Fix an issue where Karpenter is allowed to remove nodes with a Spark submitter pod
1.8.1 (18-01-2023)
features
- [Docs]: Improved the documentation of the ConveyorDbtTaskFactory in regard to start and stop task
bugfixes
- [UI]: Fix an issue where the filtered string was not taken into account when switching between pods
- [UI]: Fix an issue where navigating from execution details to executor log is broken
1.8.0 (18-01-2023)
In this release we reworked the task execution pages. When clicking on a row in the task executions page, you now go to the details of that run,
showing more info on different pods of the Spark/container/sensor job.
By clicking on a row in the details page, you will go to the logs of that specific pod (driver, submitter or executor).
If you immediately want to go from the task executions page to the logs of the Spark driver or container pod,
as was the default behavior in previous releases, you should now use the logs button in the Actions column.
features
- [Airflow]: Do not restart Airflow web on redeploys as it is not needed anymore, we now made sure the Conveyor Application Runs button still works
- [UI]: Rework the task executions in the UI to add a tab that displays detailed information for a job next to the logs and metrics. This tab shows the start date, duration, instance type as well as the failure reason for all pods of a given job. This is useful for Spark to get a quick overview on why executors are failing.
- [UI]: allow filtering when choosing an executor/driver on the task execution logs page.
- [UI]: add tooltip with resources available when displaying the instance types.
bugfixes
- [Spark]: A spark application used to hang when setting "spark.driver.memoryOverheadFactor" or "spark.executor.memoryOverheadFactor" for a spark version < 3.3.0, now we accept but ignore the value just like Spark would
- [Logs]: Fixed an issue where sometimes the logs of short running jobs would not be stored. We fixed the startup issue of the log aggregation service so now logs should always be available.
- [UI]: Show specific error message when a spark executor has no logs instead of showing generic error message.
- [Spark]: Fix an issue with the spark history command to allow cross account access to the artifacts bucket
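The memoryOverheadFactor fix above amounts to dropping two settings when the Spark version predates them. The Python sketch below shows that "accept but ignore" behaviour under stated assumptions: the function names are hypothetical, versions are plain dotted numeric strings, and this is an illustration rather than Conveyor's actual code.

```python
# Settings that the release note says are ignored for Spark versions below 3.3.0.
OVERHEAD_FACTOR_KEYS = (
    "spark.driver.memoryOverheadFactor",
    "spark.executor.memoryOverheadFactor",
)


def parse_version(version: str) -> tuple:
    """Parse "3.2.3" or "3.3" into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in version.split(".")[:3]]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def effective_conf(spark_version: str, conf: dict) -> dict:
    """Return the conf that takes effect: overhead factors are dropped before 3.3.0."""
    if parse_version(spark_version) >= (3, 3, 0):
        return dict(conf)
    return {k: v for k, v in conf.items() if k not in OVERHEAD_FACTOR_KEYS}
```

With Spark 3.2.3, a conf containing spark.driver.memoryOverheadFactor is accepted and the key is silently left out of the result, while all other settings pass through unchanged.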
1.7.9 (10-01-2023)
bugfixes
- [Airflow]: Revert the change to not restart Airflow on redeploys, the change made the Conveyor Application Runs button fail
1.7.8 (10-01-2023)
features
- [Templates]: Mount a template git repo to the cookiecutter container instead of a folder within the repo
- [Airflow]: Do not restart Airflow web on redeploys as it is not needed anymore
- [Airflow]: Change ConveyorExternalTaskSensor to request taskInstances from the db in order to fix 404 errors on Airflow web restarts
bugfixes
- [RBAC]: Fix an issue where a non-admin user could not list Streaming Applications
- [Airflow]: Fix an issue where restarting the scheduler leads to temporary missing dags in the web UI.
1.7.7 (09-01-2023)
features
- [Airflow]: Log the node on which the Airflow worker is running, this makes it easier when needing to debug node outages
1.7.6 (04-01-2023)
bugfixes
- [Airflow]: Fix an issue with the Airflow dag-syncer init container going OOM
- [General]: Added a fix to logs and metric fetching on Azure
1.7.5 (03-01-2023)
bugfixes
- [RBAC]: Fix an issue with RBAC when fetching the list of task executions
features
- [Spark]: Add support for Iceberg in the latest images
- public.ecr.aws/dataminded/spark-k8s-glue:v3.3.1-hadoop-3.3.4-v2
- public.ecr.aws/dataminded/spark-k8s-glue:v3.3.1-2.13-hadoop-3.3.4-v2
1.7.4 (02-01-2023)
features
- [General]: Upgrade to eks 1.24
- [General]: Upgrade to aks 1.24
- [General]: On Azure run the CNI controller on the default node
- [Spark]: Add the ability to see the spark executor and submitter logs in the UI for both batch and streaming
- [notebooks]: Upgrade images to use ubuntu 22.04 as a base
- [notebooks]: Remove support for python 3.6
1.7.3 (27-12-2022)
features
- [Notebooks]: Use Spark 3.3.1 in notebook images
bugfixes
- [Spark]: Fix an issue in Spark controller where eventlog directory was deleted before driver was terminated.
1.7.2 (21-12-2022)
features
- [CLI]: Support updating the project description through the project update --description <content> command
- [Airflow]: Allow operatorLinks to be inherited
bugfixes
- [General]: Reduce the time and improve robustness when deploying Conveyor projects
- [UI]: Made the log fetching in the UI more robust for AWS. This should make certain use cases where AWS returned nothing work again.
- [Spark]: Fix an issue when using spark on-demand executors
- [Costs]: Fix an issue in calculating costs for new g5 and g4dn instances
- [UI]: Fix an issue with displaying top x project/environment costs
1.7.1 (14-12-2022)
features
- [Airflow]: Upgrade Airflow to 2.4.3
- [Airflow]: Upgrade datahub packages to 0.9.3.1
1.7.0 (05-12-2022)
🎉 We are happy to announce that Notebooks, one year after the initial introduction, are now out of preview 🎉 .
Over the past year we improved the Notebooks feature in terms of stability as well as user experience. From now on the API is stable, and we will ensure that all changes are backwards compatible.
features
- [General]: Remove the unused Airflow role
- [UI]: Improved the task execution page for a spark job running in local mode. We do not try to show spark executor metrics now, and we show the mode in the overview
- [CLI]: Add support for Podman next to Docker
- [Spark]: Released spark 3.2.3 images:
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.3-2.13-hadoop-3.3.4-v1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.3-2.13-hadoop-3.3.4-v1
- [General]: Switch the EFS volume used for Spark event log, and Airflow logs from bursting to elastic (the new recommended mode)
bugfixes
- [UI]: Do not allow the deletion of admin users, make sure they are removed from the administrators first.
- [UI]: Fix an issue where inviting a user would sometimes fail because of a wrongly generated password
- [UI]: Fix an issue where filtering on streaming applications would not work
- [UI]: Fix an issue where the create environment modal was behind the guided tour
- [UI]: Improved log readability. Previously, multiple consecutive whitespaces were collapsed into one, so printing "foo    bar" would show as "foo bar". This reduced readability when printing tables with whitespace.
- [General]: Fixed an issue where spark applications using multiple R-instance executors would not launch
- [General]: Fixed an issue with zone limitations when using R-instances
- [CLI]: Make sure conveyor run loads the Airflow backports
- [Costs]: Fixed a bug in cost calculation for Spark, this has a low impact on the cost calculation
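The whitespace issue in the log readability fix above can be illustrated with a small sketch. The helper name and the regular expression are assumptions made for illustration; they only mimic the effect described in the note, not how the UI renders logs.

```python
import re


def collapse_whitespace(line: str) -> str:
    """Collapse runs of spaces into one, as the old log view effectively did.

    Hypothetical helper, used only to show the effect on table-like output.
    """
    return re.sub(r" {2,}", " ", line)


table_row = "id    name      status"
print(collapse_whitespace(table_row))  # old behaviour: columns no longer line up
print(table_row)                       # new behaviour: whitespace is preserved
```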
1.6.0 (22-11-2022)
In this release we add support for R instances on AWS. These instances use Karpenter (a new autoscaler) for the cluster which should result in faster node startup. It is a new component, with which we have less experience, so we will be monitoring it closely. If you notice any issues when using these instances, do not hesitate to contact Conveyor support.
features
- [General]: Add support for R instances on AWS. These instances have a higher memory to cpu ratio than the M instances on AWS. They are useful when loading in large data sets with limited computation. They are currently only supported on AWS.
- [General]: Use the new price-capacity-optimized allocation strategy. This is a new strategy that uses the spot pools that are least likely to be interrupted and selects the lowest price from these spot pools. It is the new strategy recommended by AWS. It will result in cost savings compared to the previously recommended capacity-optimized strategy. For more info you can look at the release blog by AWS
- [Docs]: Added documentation on how the executor_disk_size setting may improve performance.
- [General]: Optimised the log storage on AWS, this should reduce the cloudwatch logging bill
- [General]: Release terraform provider 0.0.7
bugfixes
- [Spark]: Handle Spark applications that launch multiple spark contexts gracefully and do not upload their event logs
- [UI]: Fix an issue with filtering the logs of streaming applications
- [UI]: Make cancel button for Spark tasks clickable without using horizontal scrollbar
- [Airflow]: Fixed a bug where the Airflow scheduler could get into a faulty state when scheduling Dynamic tasks.
1.5.15 (16-11-2022)
features
- [Airflow]: Remove the db cleanup job now that it has executed successfully
1.5.14 (16-11-2022)
features
- [General]: Expose extra Airflow metrics to monitor Airflow environments more thoroughly
bugfixes
- [UI]: Improve the spark history server dialog
1.5.13 (15-11-2022)
bugfixes
- [Airflow]: ConveyorContainerOperatorV2 arguments allow None values to be passed
- [Airflow]: ConveyorSparkSubmitOperatorV2 application_args allow None values to be passed
1.5.12 (15-11-2022)
bugfixes
- [General]: Fix an issue with internal Airflow metrics
1.5.11 (15-11-2022)
features
- [UI]: In the task executions list, change the default match from exact to fuzzy
- [Airflow]: In the ConveyorContainerOperatorV2 we now raise an exception if the arguments passed are not all of type string
- [Airflow]: In the ConveyorSparkSubmitOperatorV2 we now raise an exception if the application_args passed are not all of type string
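The string-type check on arguments and application_args described above can be sketched roughly as follows. This is a minimal illustration, not the operators' actual code: the helper name validate_string_args, the choice of TypeError, and the error message are all assumptions.

```python
from typing import Sequence


def validate_string_args(name: str, values: Sequence[object]) -> None:
    """Raise if any element of `values` is not a string.

    Hypothetical helper that mirrors the behaviour described in the release
    note; the real operators may raise a different exception type.
    """
    bad = [(i, type(v).__name__) for i, v in enumerate(values) if not isinstance(v, str)]
    if bad:
        details = ", ".join(f"index {i} is {t}" for i, t in bad)
        raise TypeError(f"All {name} must be strings: {details}")


# Converting non-string values up front avoids the exception.
arguments = ["--date", "2022-11-15", "--retries", str(3)]
validate_string_args("arguments", arguments)
```

In a DAG this means numbers and other non-string values need to be converted (for example with str()) before they are passed to the operator.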
bugfixes
- [Airflow]: Fix incorrectly stored on-demand task instances affected by bugfix
1.5.10 (14-11-2022)
features
- [Spark]: Rework spark-history support, now you can create a spark-history server locally instead of sharing one per client, which caused many issues
bugfixes
- [Airflow]: Load the backport of the Airflow bugfix earlier so that the scheduler picks it up
1.5.9 (09-11-2022)
bugfixes
- [General]: Fixed an issue with deleting environments, where certain undeploys were processed multiple times. This resulted in long waiting times to delete environments
- [Airflow]: Backported an Airflow bugfix which impacted on-demand tasks
1.5.8 (09-11-2022)
features
- [Airflow]: Upgrade to Airflow 2.3.4
- [Spark]: Release spark 3.3.1 images, more details in the docs
- [UI]: Show progress when deleting/updating an environment in the environment events
- [Airflow]: Airflow version 1 support has been removed
bugfixes
- [General]: Fix an issue with cleaning up temporary files while deleting an environment
- [CLI]: conveyor notebook start had an issue where it asked for your notebook twice, this was fixed
- [CLI]: conveyor notebook commands start, stop, open now only show notebooks you own
- [General]: Fix autoscaling from zero for our autoscaling groups, there was an issue with notebooks not properly detecting the Availability zone of the volume
1.5.7 (27-10-2022)
features
- [General]: Deployments are now attributed to the user who triggered them
- [Docs]: Added documentation on the needed IAM rights to use secrets in the operators
- [UI]: The guided tour can now be stopped and resumed even after refreshing the application
- [UI]: Admin users are now displayed in the user list page
- [UI]: Allow multiple users to follow the guided tour without interfering with each other
bugfixes
- [CLI]: Let the exit code of the conveyor run command mimic the exit code of the Airflow task
- [UI]: Previously used logins are now always kept in lowercase
- [General]: Fixed a bug where old spark applications were not cleaned up properly. This could block the deletion of an environment.
1.5.6 (18-10-2022)
features
- [Airflow]: Added support for dynamic tasks in conveyor run. For more information go here
- [Spark]: Release support for Spark 3.2.2
- [CLI]: Added support for updating the airflow instance lifecycle (on-demand, spot) via the CLI. Before you could only supply it during creation.
- [CLI]: Added conveyor completion commands to the docs
- [General]: AWS Upgrade EKS to version 1.23
- [General]: Azure upgrade AKS to version 1.23.12
- [General]: AWS and Azure decrease logging costs for fluent-bit
- [General]: Azure decrease logging cost, we do not store the logs of certain verbose Azure managed components
- [Notebooks]: When opening a terminal, the working directory is your conveyor project
- [Notebooks]: Install ssh by default
bugfixes
- [UI]: Fixed the project link not working on the task execution detail page
- [UI]: Fix the option to partially or fully select a task in the task executions page
- [UI]: Fix an issue where multiple people could not create a notebook with the same name
- [CLI]: Do not validate the setup.py and requirements.txt when using your own Dockerfile for notebooks
- [General]: Fix a bug in the airflow scheduler liveness probe
- [Spark]: Fixed a bug where the spark history server link would point to a wrong file name in certain rare cases, this could happen when a spark job was shut down unexpectedly
- [Notebook]: In a certain edge case a notebook delete would not do anything, this has been fixed
1.5.5 (06-10-2022)
bugfixes
- [UI]: Fix a bug that made the executions page fail to load
1.5.4 (06-10-2022)
features
- [General]: Added a Conveyor tag to all aws resources
- [General]: Increase the responsiveness of the Kubernetes cluster autoscaler after a failure
bugfixes
- [Airflow]: Improved the liveness check of Airflow to use standard airflow code, this should result in less scheduler restarts
1.5.3 (21-09-2022)
bugfixes
- [CLI]: The previous release introduced a bug in the promotion mechanism which is now fixed.
- [UI]: Prevent accidental clicks on project/environment pins
1.5.2 (20-09-2022)
features
- [UI]: You can now use Conveyor in Dark Mode, including the Airflow and Notebooks UIs.
- [UI]: Added an option to list all past deployments of a project, with their build ids and git commit hash, not only the active one.
bugfixes
- [Airflow]: Fixed an issue with dbt factory when models contain the word model.
- [UI]: Sorting projects on last activity was not working correctly.
- [Streaming/RBAC]: Fixed an issue with a missing RBAC permission to validate a Streaming application.
1.5.1 (13-09-2022)
features
- [Airflow]: Added function to add conveyor executions URL to alerting, see documentation here
- [Spark]: Backport hadoop 3.3.4 for older Spark images.
Full details on the images are available at docs
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.4-v8
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.4-v8
- public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.4-v3
- public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.4-v7
bugfixes
- [CLI]: fix an issue where the git repo detection was not correctly filled in on a project.
- [Notebooks]: downloading notebooks was mistakenly overwriting the local src folder with the content of the notebooks folder.
- [CLI]: use a different command for detecting the current git commit hash such that it also works for repos with annotated tags.
- [General]: fix an issue with cost calculation such that it works again.
- [UI]: recent and pinned projects/environments were not properly removed when the corresponding project/environment is deleted.
1.5.0 (05-09-2022)
features
- [CLI]: Add commands for teams and use them in tf provider
- [Notebooks]: The notebook storage has been migrated from EFS to EBS. The migration will be handled automatically as soon as you have paused your notebook. The reason for this is that the EBS performance is more consistent, the downside is you have to specify a size up front, the default is 10Gi.
- [Notebooks]: We now not only persist your code and notebooks but also your venv. So if you install new packages in a notebook, these will be persisted across restarts. We do this by persisting the folder /home/jovyan/work.
- [Notebooks]: Opening a terminal in the notebook has your virtual environment automatically activated (only for newly started notebooks).
- [Notebooks]: Added the plugin jupyterlab-system-monitor by default to show the current memory usage of your notebook
- [Notebooks]: Added start/play button to the notebook details overview.
- [Notebooks]: The details overview can now always be opened even if the notebook is not ready or stopped.
- [Notebooks]: In the list view, the action open notebook UI now always opens in a new tab
- [Notebooks]: Added documentation on how to install jupyter extensions for notebooks.
- [CLI]: When running conveyor notebook start we now open it automatically when it has been started
- [CLI]: Added the conveyor notebook open command to open your notebook in a browser
- [CLI]: conveyor notebook create gives more feedback to the user on what is going on
- [CLI]: Added icons to indicate that certain steps have finished when using conveyor notebook create and conveyor run
- [UI]: When creating a notebook, we now open it automatically
- [UI]: Rework how logs are shown in the UI. Make the UX simpler and allow searching logs across all pages.
- [Spark]: Include delta 2.1.0 in spark 3.3 images
- [UI]: Task executions can now be filtered by clicking on the corresponding environment/project/task/dag buttons from the table columns
- [UI]: Filtering on task executions columns now matches exactly the filter, unless you prefix it with ~
- [UI]: Added shortcut buttons for recently visited environments and projects
- [UI]: You can now pin projects and environments so that they show up on top of the lists
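The exact-versus-prefixed filter behaviour for task execution columns mentioned above can be sketched as follows. This is an illustration only: the function name is hypothetical, and modelling the "~" prefix as a case-insensitive substring match is an assumption, since the note does not specify how the non-exact match works.

```python
def matches_filter(value: str, filter_text: str) -> bool:
    """Exact match by default; a leading "~" switches to a looser match.

    Assumption: the looser match is modelled as a case-insensitive
    substring test, which may differ from what the UI actually does.
    """
    if filter_text.startswith("~"):
        return filter_text[1:].lower() in value.lower()
    return value == filter_text


tasks = ["ingest", "ingest_raw", "clean"]
exact = [t for t in tasks if matches_filter(t, "ingest")]
fuzzy = [t for t in tasks if matches_filter(t, "~ingest")]
print(exact)  # ['ingest']
print(fuzzy)  # ['ingest', 'ingest_raw']
```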
bugfixes
- [General]: Inviting a user could sometimes result in an email with a non-working url. We have fixed the encoding so inviting someone should always work.
- [RBAC]: Users were not properly removed from RBAC when being deleted. It is now fixed.
- [RBAC]: When RBAC is disabled show all the settings pages
1.4.2 (23-08-2022)
features
- [Airflow]: Support running Airflow scheduler/web on on-demand nodes to ensure that they are always available, which might be useful for production environments. For more info
- [Spark]: Added spark images for hadoop 3.3.4
- public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-hadoop-3.3.4-v1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-2.13-hadoop-3.3.4-v1
- [UI]: All users can now be deleted by admins, including SSO users. If an SSO user gets deleted, he will simply be recreated on the next login. However, his previous permissions will be lost and will need to be set again.
- [Templates]: Upgrading to the latest version of the templates 1.1.0
- [UI]: Include chat option in UI to contact the support team
- [UI]: Add readonly demo environment for potential users
- [Templates]: Use strict uuid pattern matching in the resources. (@stijndehaes)
- [Templates]: Upgrade the resource folder assume role policies to use the service account. (@stijndehaes)
- [Templates]: Upgrade spark images to our latest releases. (@stijndehaes)
- [Templates]: Upgrade to dbt 1.1.0 (@pascal-knapen)
- [Templates]: Add gitpod and codespaces configuration (@pascal-knapen)
bugfixes
- [Azure]: Fixed an issue with calculating the memory of pods on Azure
- [General]: Fixed detection of the command or entrypoint of a container not working. Also adding documentation on how to debug the issue
1.4.1 (09-08-2022)
features
- [UI]: Admin users can now delete users from the Conveyor installation, if they are not part of the identity provider
- [Airflow]: Conveyor Airflow operators V1 are now obsolete and have been removed, after their deprecation period. If you still have V1 operators in some projects, please refer to our upgrade guide
- [Streaming]: Added the possibility to modify restartsThreshold, restartsWindow, restartsAlertCoolDown for Spark streaming applications
- [UI]: Links to Gitlab repositories are now supported
1.4.0 (01-08-2022)
features
- [General]: We improved the security of containers running on Kubernetes where possible.
Our own Kubernetes management applications and Airflow run with the following settings:
- We enable read only root file system where possible, this makes it harder for attackers to exploit the root file system
- We run the containers as a non-root user, this makes privilege escalation harder
- We disable privilege escalation, this disables privilege escalation to the node
- We disable the Automount of service account tokens on pods that do not need it, by disabling this an attacker cannot receive an authorization token for the Kubernetes API.
We run user containers with the following extra settings:
- We disable privilege escalation
- We disable the Automount of service account tokens on pods that do not need it
In the future we might look into enabling more of these settings, or allow you to define them for your containers.
- [UI]: The embedded Airflow view now persists the navigation to its various pages in the conveyor url. This makes it easier to share the url with others, or revisit it yourself.
- [UI]: You can now easily navigate from the Task Execution list and details pages to the Airflow DAGs and Tasks pages
- [RBAC]: The error message that is returned when an action is not authorized is now more descriptive.
- [General]: The reset password email was still mentioning Datafy. It is now changed to Conveyor.
- [Docs]: Added a contact and support page
- [UI]: We now show the guided tour to first-time users by default
- [Airflow]: Decrease the load on NFS by Airflow by decreasing the dag processing logging to Error, and only creating the files when needed
- [Airflow]: Added docs on assuming cross-account IAM roles for operators.
- [General]: Fixed an issue with the user invitation flow where an invalid invitation link was sometimes generated
- [UI]: You can now view the AWS IAM role or the Azure Application Client Id used by task executions in the Task Execution details view.
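The container hardening settings listed in the [General] security entry of this release correspond to standard Kubernetes pod and container fields. The sketch below writes them out as plain Python dictionaries in the shape of a pod spec. The field names are the standard Kubernetes ones; the container name and image are placeholders, and how Conveyor applies these settings internally is not covered by the note, so this is an illustration rather than its actual manifests.

```python
import json

# Container-level settings described for the management applications and Airflow.
hardened_security_context = {
    "readOnlyRootFilesystem": True,     # read only root file system
    "runAsNonRoot": True,               # run as a non-root user
    "allowPrivilegeEscalation": False,  # disable privilege escalation
}

management_pod_spec = {
    # Pod-level field: no service account token is mounted into the pod.
    "automountServiceAccountToken": False,
    "containers": [
        {
            "name": "example",          # placeholder name
            "image": "example:latest",  # placeholder image
            "securityContext": hardened_security_context,
        }
    ],
}

# User containers get only the two settings listed for them in the note.
user_pod_spec = {
    "automountServiceAccountToken": False,
    "containers": [
        {
            "name": "user-task",        # placeholder name
            "image": "example:latest",  # placeholder image
            "securityContext": {"allowPrivilegeEscalation": False},
        }
    ],
}

print(json.dumps(management_pod_spec, indent=2))
```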
bugfixes
- [UI]: Fixed an issue with filtering task executions on multiple values of status, type, ...
- [CLI]: Fix an issue where the CLI does not listen to your keystrokes after logging in, e.g. when creating a notebook
1.3.12 (26-07-2022)
features
- [General]: On Azure we upgraded Kubernetes to 1.22.11
bugfixes
- [General]: Lower the usage of EFS by airflow executors, EFS was a critical component for airflow. Because of issues from the past we are lowering our dependency on it, thus increasing availability.
- [General]: Fixed a memory leak by upgrading one of the libraries we use
1.3.11 (19-07-2022)
features
- [General]: On aws we enable image tag immutability on the ECR repositories we manage
bugfixes
- [General]: On Azure we are seeing networking timeouts when pods start up, this is very problematic for the airflow scheduler as it never recovers. We now restart it when we detect the issue. We are also in contact with Azure support to find the root cause.
1.3.10 (14-07-2022)
bugfixes
- [UI]: Fixed an issue with the project pages not properly loading
1.3.9 (14-07-2022)
bugfixes
- [UI]: Fixed a race condition when logging in
1.3.8 (13-07-2022)
features
- [UI]: Adding buttons to open a project in gitpod or github codespaces
bugfixes
- [General]: Make the algorithm to detect different memory configurations from 1.3.7 more robust. We do not accept changes that are considered too low, this way we are more tolerant to faulty configurations.
- [Airflow]: Remove some dangling tables after airflow 2.3 migration
1.3.7 (12-07-2022)
bugfixes
- [General]: Not all aws instances get the same amount of memory. In one instance group we see heavy fluctuations. The difference from one m5.4xlarge to another can be in the range of several hundred MB. This can result in an issue when calculating the memory for your container. We now autocorrect these values.
- [Notebooks]: Fixed an issue where downloading notebook files would fail if the old datafy.cloud domain was configured
- [Docs]: Add mx.nano instance type back to our ConveyorContainerOperatorV2 documentation.
1.3.6 (11-07-2022)
The notebook API has been updated in a non backwards compatible way, please upgrade to the latest CLI.
features
- [Notebooks]: Reworked our notebook API to be more consistent
bugfixes
- [Airflow]: Fixed an issue in fetching the spark submit logs from airflow
- [Notebooks]: Fixed an issue with notebooks where notebooks would be invisible when the SSO connection creates usernames with capitals
- [CLI]: Fixed conveyor run execution date detection when no schedule is set on a DAG
- [Spark]: Fixed an issue where environment variables weren't propagated in local mode
- [UI]: Fixed an issue where the embedded Airflow view would not render properly when the access token is expired
- [UI]: Fixed an issue where a project git repo would not be updated after modifying it
1.3.5 (05-07-2022)
features
- [Airflow]: Reduce logging of airflow scheduler and file processor to efs.
- [Airflow]: Change how Airflow dag fetching works for the scheduler and web instance, this reduces load on EFS by 75%
bugfixes
- [UI]: Fixed an issue where the UI would use the user email with capital letters
1.3.4 (30-06-2022)
bugfixes
- [Airflow]: Prevent Airflow tasks from being marked as failed when an API 410 exception is thrown
- [Airflow]: Lower the logging of Airflow to EFS
1.3.3 (30-06-2022)
features
- [Spark]: Big spark jobs (mx.xlarge, mx.2xlarge, mx.4xlarge and more than 1 executor) will now be scheduled in a single Availability zone. When running on spot, we select the availability zone with the lowest chance of spot interruption. This will reduce network costs, and reduce network overhead for spark.
bugfixes
- [Spark]: We are investigating an issue where the EFS volume used for the spark event log is overloaded. We added a global Admin option for ourselves to disable spark event log upload, that we can enable when we notice issues in an environment.
- [Airflow]: Backported airflow#24478 to fix an issue with retrying old tasks in the UI
- [Airflow]: Backported airflow#24117 to fix an issue with retrying old tasks in the UI
1.3.2 (29-06-2022)
features
- [docs]: Advocate the use of strict uuid pattern matching when assuming roles
- [Airflow]: Upgrade to airflow 2.3.2
- [Airflow]: Allow users to template num_executors in ConveyorSparkSubmitOperatorV2
- [General]: Allow folders in dags and resources folder
- [General]: Added warning about v1 operator deprecation into UI
- [CLI]: Conveyor run now lets you select an environment interactively
- [CLI]: Conveyor run now lets you select a DAG and a task interactively
- [CLI]: Conveyor run now automatically uses the last execution date compatible with the DAG schedule if none is provided
- [Spark]: Added support for Spark streaming on Azure
- [Spark]: Added a spark local mode to the ConveyorSparkSubmitOperatorV2, see the docs for more info
- [CLI]: Support passing additional build arguments to the container engine
- [Spark]: Added a section on improving performance when using Spark on Conveyor
- [Costs]: Added a global overview per day
- [Azure]: Initial version of Azure metrics available in UI
- [Spark]: We now run big (more than 1 executor, and executor instance type mx.xlarge, mx.2xlarge, mx.4xlarge) batch spark applications in a single AZ by default. When using spot we use the aws spot placement score API to determine the best AZ when your spark application is launched. This improves the availability of the spark application, and reduces network costs and overhead.
- [Spark]: Added spark 3.3.0 images:
- public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-hadoop-3.3.1-v1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-2.13-hadoop-3.3.1-v1
- [Azure]: Support enabling microsoft defender for cloud on Azure
- [Spark]: Released new images with reduced logging when using spark local mode:
- public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v6
- public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.1-v2
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v7
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v7
- [Templates]: Use strict uuid pattern matching in the resources. (@stijndehaes)
- [Templates]: Upgrade the resource folder assume role policies to use the service account. (@stijndehaes)
- [Templates]: Upgrade spark images to our latest releases. (@stijndehaes)
bugfixes
- [UI]: Pressing ENTER when filtering columns was not working
- [UI]: Add executor info to spark detail page
- [UI]: Show all states for an environment in the UI, such that users can see what is going on when it is being deleted
- [UI]: Fixed an issue where inviting a user would result in Airflow UI's not loading until the user logged out and in again
- [CLI]: Fixed an issue where logger wouldn't respect being set to quiet
- [CLI]: Fixed an issue with deleting notebooks
- [General]: Update ebs csi driver so it doesn't go out of memory when many pods are being scheduled. This improves reliability when using the spark option executor_disk_size
- [General]: Fixed a bug where the metrics would show double the CPU usage on AWS
- [CLI]: Do not print the m2m token when logging in
1.3.1 (27-06-2022)
bugfixes
- [Azure]: Correct daemonset overhead calculation to include the azure-cni component after switching away from calico
1.3.0 (20-06-2022)
features
- [Airflow]: Update Datahub packages to latest stable version (0.8.36)
- [Azure]: Prepush notebook and project base images when they originate from the project acr or the conveyor ecr registry
- [Azure]: Implement project and environment costs
- [RBAC]: You can now manage teams of users and assign teams to projects and environments. For more details, please refer to our RBAC reference
1.2.5 (14-06-2022)
bugfixes
- [Airflow]: Fixed a bug where removing tags from a dag would make it fail to load
- [UI]: Fixed the link to the git repo in deployments, and on the project view
- [CLI]: Do not show the version out of date warning when using the conveyor update command
1.2.4 (13-06-2022)
bugfixes
- [Notebooks]: Update spark image used in notebooks to version: 3.2.1-hadoop-3.3.1-v6
- [UI]: Tour won't let you move past step 7
1.2.3 (09-06-2022)
bugfixes
- [General]: Fixed an issue with the new single availability zone option that could result in jobs being slow to launch
1.2.2 (08-06-2022)
features
- [General]: Added more instance types to our autoscaling groups. This will help to get the most stability out of spot instance on AWS
- [General]: Made the single availability zone option more robust for on-demand. When the preferred instance type is unavailable it will move on to the next preferred type in a list
bugfixes
- [General]: Fixed an issue where the new single availability zone option would always use on-demand instances
1.2.1 (08-06-2022)
features
- [UI]: Add optional tracking for analytics purposes
- [Airflow]: Resubmit Spark application when spot termination is detected during submission
- [Spark]: Allow users to select an availability zone for your spark application using ConveyorSparkSubmitOperatorV2
bugfixes
- [UI]: fix links to docs page
- [UI]: Fixed an issue on Azure where big logs would fail to load
- [Spark]: Handle an extra case as spot interruption instead of a regular spark submit failure
1.2.0 (07-06-2022)
features
- [Spark]: Added support for spark decommissioning. Spark decommissioning helps you to not lose data when an executor has a spot interrupt.
Before the spot interrupt goes through, spark will try to send all intermediate results to other executors, thus saving time and money for this job.
The feature is only supported in our latest spark 3.2 images:
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v6
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v6
bugfixes
- [Spark]: Created a new spark image which fixes a bug in MsalTokenprovider for using spark on Azure
1.1.6 (31-05-2022)
bugfixes
- [General]: use the api server address as master on Azure instead of internal Kubernetes service.
- [Airflow]: Fix an issue where manual airflow runs were not filtered correctly in the task executions
1.1.5 (31-05-2022)
bugfixes
- [CLI]: Fix the conveyor update progressbar
- [General]: Make sure the cluster-autoscaler can handle the taint set by spot termination: aws-node-termination-handler/spot-itn
- [Spark]: Upload the spark eventlog under the correct name, this makes the spark history server work again
- [UI]: Fixed a typo on the admin page where it said projects instead of environments
1.1.4 (30-05-2022)
bugfixes
- [Airflow]: Lower the Airflow usage of EFS by changing min_file_process_interval from 10 to 30 and dag_dir_list_interval from 60 to 300.
- [General]: Added some improvements to the code that will lower EFS usage
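For reference, min_file_process_interval and dag_dir_list_interval are Airflow scheduler settings expressed in seconds. The sketch below shows how such values are commonly supplied through Airflow's AIRFLOW__{SECTION}__{KEY} environment variable convention. It assumes both options live in the [scheduler] section, as in Airflow 2.x at the time of this release; the helper name is hypothetical, and the note does not say how Conveyor itself applies these settings.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# New values from the release note (seconds); the old values were 10 and 60.
new_values = {
    "min_file_process_interval": 30,
    "dag_dir_list_interval": 300,
}

env = {airflow_env_var("scheduler", key): str(value) for key, value in new_values.items()}
for name, value in env.items():
    print(f"{name}={value}")
```

Larger intervals mean the DAG folder is listed and parsed less often, which is where the reduced load on EFS comes from, at the cost of new or changed DAGs taking longer to show up.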
1.1.3 (27-05-2022)
bugfixes
- [UI]: Fix filtering on task executions started_at field
- [General]: Allow more parallel processing in our operator. This reduces waiting time when a lot of spark/container jobs are launched
1.1.2 (27-05-2022)
bugfixes
- [General]: We only keep failed container CRD's around for 30 min instead of 3 days. They piled up and took too much resources.
1.1.1 (27-05-2022)
features
- [UI]: Remember previously used email in login screen
- [General]: Implement cleanup of old project builds for Azure
bugfixes
- [Airflow]: Catch 502 and 504 errors in External Task Sensor
- [General]: Fix an issue where project deletion on Azure failed
- [General]: Fixed an issue where an environment that failed to create could get into an unrecoverable state
- [Notebooks]: Cleanup unused notebook images from ACR
- [Spark]: Fix the spark eventlog upload failing
1.1.0 (24-05-2022)
features
- [CLI]: Change CLI to use environment variables with CONVEYOR prefix as a preferred
- [CLI]: update the upgrade-dags command to also rename imports and classes in dags from Datafy to Conveyor
- [CLI]: Move the ~/.datafy profiles directory to ~/.conveyor
- [CLI]: Add warnings when running conveyor build if the dags still use Datafy instead of Conveyor.
- [General]: Display the node id in the UI as well as in Airflow when the node got spot terminated.
- [Templates]: The github repository for templates has been renamed to conveyor-templates
- [Notebooks]: The working directory for notebooks has been renamed from datafy_project to conveyor_project. This might cause loss of data for existing notebooks.
- [Projects]: Projects now get their configuration from the ./conveyor directory, with fallback to the ./datafy directory
bugfixes
- [CLI]: Conveyor run now creates the date interval just like a scheduled Airflow run, before it behaved like a manual Airflow run
- [Spark]: Wait with uploading the event log until the spark application has finished, before there were instances where an upload happened before the spark application was shut down
- [Airflow]: Allow primitive types as env_vars and convert them to strings
- [Airflow]: Handle spot interrupts in ConveyorContainerSensor and ConveyorExternalTaskSensor tasks with reschedule mode by rescheduling them on another node
- [Airflow]: Handle spot interrupts in all Sensor tasks which use mode reschedule by rescheduling them instead of crashing
- [Airflow]: ConveyorExternalTaskSensor can now also watch for manually scheduled runs
1.0.2 (20-05-2022)
features
- [Airflow]: Increase parallelism from 64 to 128
- [UI]: Improve the navigation breadcrumbs and the page icons
1.0.1 (19-05-2022)
bugfixes
- [CLI]: Fix issue with datafy update renaming the CLI binary to conveyor
1.0.0 (19-05-2022)
features
- [General]: Rename Datafy to Conveyor
- [CLI]: Cleanup command to delete managed docker images
- [CLI]: The datafy CLI executable will be automatically renamed to conveyor when using datafy update
- [UI]: Fix a bug with embedded Airflow view auto-resizing
bugfixes
- [Airflow]: Retry pods that fail due to kubeletOutOfResources (more details about the bug in k8s: kubernetes#106884) up to 5 times.
0.63.3 (10-05-2022)
features
- [Spark]: Integrate azure libraries in our standard spark image.
- [Templates]: Use new spark image which supports both azure and aws
bugfixes
- [Notebooks]: Configure the notebook to work on azure
- [Notebooks]: Fix notebook configuration to include azure specific properties and jars
- [Notebooks]: Fixed a bug where the memory of the spark context was not changed according to the instance size
- [Notebooks]: Fixed a bug where files were not persisted when notebook was created from the UI
0.63.2 (05-05-2022)
bugfixes
- [UI]: Fix an issue to get current user roles when using SSO
0.63.1 (05-05-2022)
features
- [Airflow]: Removed the possibility to create airflow v1 environments. Airflow v1 environments have been deprecated for a long time now and will be phased out in 2022. Airflow 1.x does not receive community support anymore and relies on very old libraries which are becoming less and less secure to use.
bugfixes
- [UI]: Use username to get user roles in UI
0.63.0 (05-05-2022)
features
- [CLI]: Pass environment variables to airflow for dag validation or using the run command
- [UI]: Added the m2m tokens in the Conveyor UI Settings page
- [General]: Add support for detecting spot termination on Azure
- [Templates]: Make the templates work for both azure and aws
- [Templates]: Use newest spark version: 3.2.1 in the templates
- [Templates]: Update python versions such that they work with Apple Clang 13+
bugfixes
- [UI]: Fixed an issue with SSO users showing up with weird names in the list, this is only relevant for installations starting from 2022
- [Airflow]: Fixed an issue with cleaning up old airflow logs
- [Airflow]: Make Airflow workers more robust against connection reset errors while watching Kubernetes pods
- [UI]: Show the azure application client id for notebooks.
- [General]: Increased timeout on Azure when deleting container repositories.
0.62.7 (27-04-2022)
features
- [Spark]: Added a new failure mode for spark batch applications, if the application has lost more than 5 times the amount of executors requested the application will fail. The chances of such an application ever finishing are very low, and it would continue to take up resources in the cluster otherwise.
- [Airflow]: When an airflow executor shuts down unexpectedly we check if this was because of a spot interrupt. If that is the case we put a message in the logs. This should make debugging issues easier
- [General]: Add a new failure mode for batch applications: the kubelet does not have enough resources (cpu, memory, pods) but the scheduler still assigned the pod to the node, which caused it to fail.
bugfixes
- [UI]: Fixed an issue with SSO users not showing up in the user list, this is only relevant for installations in 2022
- [General]: Fixed an issue where updates of spark application with more than 600 executors would not be updated in our UI
- [CLI]: Fixed datafy notebook download being broken
- [Docs]: Fixed broken link to connect-ide docs
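The Spark failure mode described in this release can be sketched as a simple threshold check. The function and constant names are hypothetical, and the sketch assumes the lost-executor count is tracked elsewhere. It only shows the "more than 5 times the amount of executors requested" rule.

```python
# Factor taken from the release note: "more than 5 times".
LOST_EXECUTOR_FACTOR = 5


def should_fail_application(requested_executors: int, lost_executors: int) -> bool:
    """Hypothetical check: fail once lost executors exceed 5x the requested amount."""
    return lost_executors > LOST_EXECUTOR_FACTOR * requested_executors
```

For an application that requested 10 executors, losing 50 executors would not yet trigger the failure, while losing 51 would.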
0.62.6 (21-04-2022)
features
- [CLI]: Add support for Azure DevOps git when using templates to create a project
- [CLI]: Made the datafy project stop-run command more useful: it can now handle multiple matches and allows you to stop runs in batch
- [CLI]: Conveyor run now checks before starting if there are other manual runs with the same properties in the environment. If there are, it will ask if you want to clean these up first. This will stop manual runs from piling up.
bugfixes
- [UI]: Fixed an issue where the cancel run button wouldn't work, but just redirect to the job logs
- [Airflow]: Fixed an issue where the datafy application runs button would not filter on the environment
- [Airflow]: The ConveyorExternalTaskSensor would fail if the Airflow instance was unavailable (for example because of a spot interrupt). Now we gracefully retry the sensor on the next poke
- [General]: Fix a bug where new users wouldn't be able to register when using SSO
- [CLI]: Fix a bug for datafy notebook commands delete, start, stop, download where filling in only the name or environment flag would not filter properly on these flags
0.62.5 (15-04-2022)
bugfixes
- [Airflow]: Make Airflow workers more robust against glitches in Kubernetes instead of failing immediately
- [UI]: Return a 404 instead of a 500 when requesting nonexisting logs such that the UI does not handle it as an error.
- [Notebook]: Fix errors in our notebook operator when using cross account clusters
features
- [Airflow]: Support instance_life_cycle option for dbt tasks
- [Airflow]: Support instance_life_cycle option for airflow sensors
0.62.4 (13-04-2022)
features
- [General]: Upgrade Aws EKS version to 1.22
0.62.3 (08-04-2022)
features
- [General]: Upgrade Aws for fluent bit to version: 2.23.3
- [General]: Upgrade aws load balancer controller in preparation of eks 1.22 upgrade
0.62.2 (07-04-2022)
features
- [Airflow]: Upgrade the following dependencies for Airflow 2: Airflow to 2.2.5, apache-airflow-providers-apache-spark to 2.1.3, apache-airflow-providers-cncf-kubernetes to 2.2.0, apache-airflow-providers-slack to 4.2.3, acryl-datahub to 0.8.31.6, boto3 to 1.21.32
- [General]: When scheduling an application, we now don't fail at the first ImagePullBackOff happening in Kubernetes, we need three failure events. This makes the operator more robust to temporary network failures.
- [UI]: On the costs page add the selected cost range to the URL, this makes it easier to share URLs with other people
- [UI]: On the streaming application pages added the selected filter to the URL, this makes it easier to share URLs with other people
- [CLI]: Added the command datafy project generate-config, which will generate the .datafy/project.yaml file for a project. This is useful when you forgot to check it into git or when you use the terraform provider.
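The ImagePullBackOff behaviour described in this release (only failing after three failure events) can be sketched as follows. The function name, the constant, and the representation of events as a list of reason strings are assumptions for illustration. The actual operator logic is not shown in these notes.

```python
from typing import Iterable

# Threshold taken from the release note: "we need three failure events".
IMAGE_PULL_FAILURE_THRESHOLD = 3


def image_pull_failed(event_reasons: Iterable[str]) -> bool:
    """Hypothetical check: treat the pull as failed only after three ImagePullBackOff events."""
    backoff_count = sum(1 for reason in event_reasons if reason == "ImagePullBackOff")
    return backoff_count >= IMAGE_PULL_FAILURE_THRESHOLD
```

A single ImagePullBackOff caused by a temporary network failure would therefore not fail the application on its own.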
0.62.1 (04-04-2022)
features
- [CLI]: add login fallback when the automatic CLI login does not work or is not supported.
- [UI]: If your login is expired, and you go to your Airflow URL, we now redirect you to your Airflow page again after logging in.
bugfixes
- [UI]: Go to the correct landing page after logging in from an invitation link
- [DOCS]: Small cleanup in the pyspark and spark tutorial
0.62.0 (29-03-2022)
features
- [General]: Run the on-demand instances autoscaling group as a mixed instance fleet, that way we can handle a certain instance type not being available on aws
- [Spark]: Released new spark images that add a new log4j-executor.properties file that reduces logging for spark executors, this results in cloudwatch cost savings. The new images are:
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v3
- public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v3
- public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.1
- public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v5
- [UI]: Added type email to our login email field, this way browsers and password managers will recognize it better
- [UI]: Simplify create environment modal when there is only 1 cluster
bugfixes
- [General]: Fixed an issue when migrating our datafy config file
- [CLI]: Fix an issue when deletion of notebooks fails
- [UI]: Fix login flow from CLI
0.61.7 (18-03-2022)
bugfixes
- [UI]: Revert runtime-config
0.61.6 (18-03-2022)
bugfixes
- [UI]: Correct runtime-config for production
0.61.5 (17-03-2022)
bugfixes
- [General]: Add cluster endpoints to management API
0.61.4 (16-03-2022)
bugfixes
- [CLI]: Fix cicd flow when passing environment variables
0.61.3 (16-03-2022)
bugfixes
- [General]: Fix listing users
- [CLI]: Fix cicd flow
0.61.2 (16-03-2022)
bugfixes
- [General]: Make sure the Conveyor team can access the tenants
0.61.1 (16-03-2022)
bugfixes
- [Docs]: Regenerated docs for template release 0.15.5
- [UI]: Fix IDP login for dataminded users
0.61.0 (16-03-2022)
features
- [UI]: Revamped login flow
- [Templates]: Update the spark settings so the aws glue integration works again
- [Templates]: Use dots instead of underscores for specifying the conveyor_instance_type. (@nclaeys)
0.60.1 (10-03-2022)
bugfixes
- [Spark Streaming]: Fix missing applications in the UI.
0.60.0 (08-03-2022)
features
- [UI]: Add button for administrators to invite new users
- [Airflow]: Make conveyor_instance_type specification consistent by using dots everywhere
- [Spark Streaming]: Added an alerting option to spark streaming support, for more info see the documentation
- [General]: Upgrade aws ebs csi driver to 2.6.3
- [Airflow]: Upgrade Airflow 2 to 2.2.4
bugfixes
- [CLI]: Remove debug message when copy image fails due to access denied
- [UI]: Fix filtering on environment and schedule in task executions page
- [UI]: Refresh page of streaming application that is in state pending every 10 seconds
- [CLI]: Refactor the result of datafy project list-users and datafy environment list-users to not include the /User/ string
0.59.4 (18-02-2022)
bugfixes
- [Airflow]: Do not set tcp_keepalive when using airflow v1 as it does not exist.
- [CLI]: Fix datafy update for Apple Silicon Macs
- [General]: Change the spark version used by the spark history server to not have issues with verifying the S3 ssl certificates
0.59.3 (16-02-2022)
bugfixes
- [Notebooks]: Use the correct images when launching a notebook from the UI
0.59.2 (16-02-2022)
features
- [UI]: Added instance type and lifecycle to task execution details page
- [UI]: Show deletion protection status in environments page
- [UI]: Added delete button in environments page
- [UI]: Added button to create new environments
- [UI]: Added button to create notebooks
- [General]: We run our agent on the cluster instead of on ECS
- [Spark]: Release spark 3.2.1 images
- [Docs]: Added documentation about spark hive integration issues.
- [Docs]: Migrated the documentation to Docusaurus, this should allow us to make the documentation more user-friendly
- [Notebooks]: Do not copy the virtual environment to nfs but only the project related files in order to speed up notebook creation
0.59.1 (03-02-2022)
bugfixes
- [CLI]: Make sure the docker client used by Conveyor also uses the typical Docker environment variables
- [CLI]: Allow uploading of dag files of up to 16MB, up from 1MB. Also fail if a larger file is detected instead of printing a warning
- [UI]: Fix an error that was appearing on the first page load
- [UI]: Fix a problem in the admin user panel where a project could not be added to a user
- [UI]: Fix a scrolling issue in the embedded Airflow page
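The DAG file size rule from this release (accept files up to 16MB, fail on anything larger) can be sketched as below. The function name, the input shape (a mapping of file name to size in bytes), and the interpretation of 16MB as 16 * 1024 * 1024 bytes are assumptions for this example, not the CLI's actual code.

```python
from typing import Dict

# Assumption: "16MB" is interpreted as 16 * 1024 * 1024 bytes.
MAX_DAG_FILE_BYTES = 16 * 1024 * 1024


def validate_dag_file_sizes(file_sizes: Dict[str, int]) -> None:
    """Hypothetical check: raise instead of warning when a DAG file is too large."""
    too_large = sorted(
        name for name, size in file_sizes.items() if size > MAX_DAG_FILE_BYTES
    )
    if too_large:
        raise ValueError(
            f"DAG files larger than {MAX_DAG_FILE_BYTES} bytes: {', '.join(too_large)}"
        )
```

Raising an error rather than printing a warning mirrors the change described above, where an oversized file now stops the upload.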
0.59.0 (31-01-2022)
features
- [Notebooks]: Added support for notebooks persistence. This means notebooks can now be stopped and started using the CLI and the UI.
- [General]: Cleanup unused aws secrets manager secrets
- [General]: Improved the pg-bouncer SSL setup to the RDS server by validating the RDS CA, the RDS only accepts encrypted connections now
- [CLI]: Added the same docker build flags for notebook create that are supported in project build
- [CLI]: Support for podman as container manager
bugfixes
- [General]: fix 2 small issues with the configuration of notebook properties
- [Airflow]: Update the connections used by Airflow 2 to be sourced from the environment. That way we should have fixed the issues with a connection being temporarily unavailable and that leading to a job failure
- [RBAC]: Fixed an issue where a project admin could not manage users on a project
0.58.1 (10-01-2022)
bugfixes
- [CLI]: Fixed an issue when building a project would not work
- [PySpark]: Update pyspark images such that setuptools (>=60.0.0) also installs global python packages in the correct directory for debian.
- [Template]: Upgrade templates to 0.15.4
0.58.0 (7-01-2022)
features
- [General]: Release the preview of the costs feature.
- [General]: Pre cache notebook and project base images to speed up uploading images the first time.
- [General]: use m6i instances instead of m5 when launching new nodes as they are more cost efficient
bugfixes
- [UI]: Fixed a bug where a job duration wasn't updated, and it also refused to show the metrics because of that.
- [General]: Fixed an issue where trying to use a secret from aws without an IAM role would take 30m to fail.
- [CLI]: Fixed an issue where starting a new notebook wouldn't work
0.57.1 (6-01-2022)
bugfixes
- [CLI]: Fixed a bug where starting a notebook could scramble the order in your notebooks.yaml if you had multiple definitions
- [Spark]: Released new spark images that fix a vulnerability with log4j 1.x, for more information, see our documentation.
0.57.0 (3-01-2022)
features
- [CLI]: Rename all "new" CLI commands to "create" to consistently use verbs for CLI commands (the "new" commands still work as aliases of the "create" commands for backwards compatibility)
- [UI]: Added Git hash and repo link to task execution detail view
- [UI]: Added "Trigger" column to the task execution page to distinguish tasks triggered by Airflow or via datafy run
- [UI]: Display the task operator version in the task execution page
- [UI]: Add a button to cancel a task execution
- [UI]: Add option to wrap long lines in task logs view
0.56.5 (20-12-2021)
features
- [General]: add command to stop a project run, when your terminal gets detached.
- [General]: add gcc and g++ libraries to the notebook base image and extend the notebook documentation to describe how to install pyodbc.
bugfixes
- [General]: make sure a failed applicationEvent is sent when cancelling a manual project run for both spark and container tasks.
- [General]: fix crashlooping notebook when mounting secrets from SSM or Secretsmanager.
0.56.4 (13-12-2021)
features
- [CLI]: The CLI binary is now available for Apple Silicon.
- [UI]: All page headers are now collapsible
bugfixes
- [UI]: Fix redirect after login and logout
0.56.3 (9-12-2021)
features
- [Airflow]: Indicate whether airflow workers are killed due to spot termination (container, spark, container_sensor)
bugfixes
- [General]: When using V2 Operators we now enable the sts regional endpoints by default. This removes the dependency for using your IAM roles on the us-east-1 region and is recommended by AWS
0.56.2 (7-12-2021)
bugfixes
- [Streaming]: Fixed an issue where the spark application could not be created if the name of the application was too long
- [Notebooks]: Use the same service account pattern as projects running from airflow and streaming
- [Airflow]: ConveyorSparkSubmitOperatorV2 tasks with a . in the name could not be started properly
- [General]: Performance improvements when processing task executions
0.56.1 (2-12-2021)
features
- [Notebooks]: Added possibility to work on datafy notebooks from your IDE
- [Docs]: Updated dbt tutorial using latest template version
- [Airflow]: test tasks can be turned on/off when using the dbt task factory
- [General]: Detect spot node interruptions and handle it as a specific failure for a container/spark job
bugfixes
- [CLI]: Removed some excess logging statements when doing a login
0.56.0 (30-11-2021)
features
- [CLI]: Automatically add the remote repo url in the project info at project build time when it was empty
- [General]: Release the beta version of the notebooks feature
- [General]: Map container start error to a pod failure state such that it can be shown in application runs
- [UI]: Add route to root from Conveyor logo
- [Template]: Upgrade templates to 0.15.3
bugfixes
- [Airflow]: Remove trailing dot from conveyor_ui_domain variable. This ensures that the airflow variables: base_url, jwt_audience are correctly set.
- [Airflow]: Support passing None values in env variables to conveyor_container_operator_v2
- [General]: Fix deletion of streaming applications for projects that have an underscore in the name
- [General]: Improve error message with serviceAccountName when assumeRole fails when fetching secrets
- [UI]: Correctly show start time and finished time in application runs details when the container is still pending
0.55.2 (22-11-2021)
bugfixes
- [General]: Fixed deleting of users with the operator role from an environment
0.55.1 (19-11-2021)
bugfixes
- [General]: Revert cleanup of project versions in ssm as it fails when there are more than 10 projects in the request
0.55.0 (19-11-2021)
features
- [Airflow]: Upgrade Airflow 2 to version 2.2.2
- [General]: Remove the use of terraform when creating/deleting environments. This makes the creation/deletion of environments faster
- [CLI]: Remove irrelevant warning about encryption when using airflow validate dags
bugfixes
- [General]: Fixed an issue with spark executor metrics not showing up
- [General]: Fixed a bug with mount ssm parameters or aws secrets manager secrets as environment variables. If you mounted the same secret twice but with a different path the application would never start
- [Airflow]: Open Application Runs Airflow button in new tab again, the behaviour got changed by upgrading to Airflow 2.2.x, but it makes more sense to open in a new tab by default
- [CLI]: Remove warning about validation being run with Airflow 2
- [CLI]: When checking if the Dockerfile exists during datafy build we now take the project config docker path into account
- [Templates]: Correct resources S3 template to use like instead of equals in trust relationship condition
0.54.12 (15-11-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.12/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.12/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Fixed an issue with the migration from RDS proxy to pg-bouncer not going smoothly
0.54.11 (14-11-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.11/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.11/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Temporary rollback in the way we capture events of running applications
0.54.10 (11-11-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.10/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.10/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Fixed an issue processing events of running applications
0.54.9 (10-11-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.9/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.9/conveyor_darwin_amd64.tar.gz
features
- [General]: Finished the migration away from the deprecated EFS provisioner. The new EFS file system we use is encrypted as well.
- [Airflow]: Make Airflow database migration more robust by giving them longer to finish.
- [UI]: Show the reason a job fails in the UI. We used to only detect out of memory issues; this now also covers secrets issues, image pull issues, etc.
bugfixes
- [General]: Fixed an issue with Spark event uploading where sometimes Spark did not clean up the in-progress file properly, resulting in two eventlog files on the system.
- [Airflow]: Merged an upstream Airflow patch to fix an issue with get_next_data_interval so that it does not fail when there is no next_run defined yet. This will be fixed in future Airflow releases as well.
0.54.8 (05-11-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.8/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.8/conveyor_darwin_amd64.tar.gz
features
- [General]: Enable KMS encryption for our SQS queues
- [Airflow]: Upgrade to Airflow 2.2.1
- [Spark]: Added Spark 3.2.0 image with support for Scala 2.13
- [Template]: Upgrade templates to 0.15.1
bugfixes
- [Airflow]: Fixed an issue where a spot interrupt could result in a false task success in Airflow for the ConveyorSparkSubmitOperatorV2
- [General]: Fixed an issue with cleaning up Spark applications where the driver node gets interrupted
- [General]: When using datafy run and printing big log lines, datafy run would crash. We now split these lines into multiple chunks fixing the issue.
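The log-line fix in this release (splitting big log lines into multiple chunks) can be illustrated with a small sketch. The function name and the default chunk size of 1024 characters are assumptions for this example. The notes do not state the real limit or how the CLI implements the split.

```python
from typing import List


def split_log_line(line: str, max_len: int = 1024) -> List[str]:
    """Hypothetical helper: split one long log line into chunks of at most max_len characters."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not line:
        # Keep empty lines as a single empty chunk.
        return [line]
    return [line[start:start + max_len] for start in range(0, len(line), max_len)]
```

Joining the returned chunks reproduces the original line, so no log content is lost by the split.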
0.54.7 (28-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.7/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.7/conveyor_darwin_amd64.tar.gz
features
- [General]: Remove RDS proxy as we switched to using pg-bouncer instead
- [General]: Migrate the dags volume to the encrypted EFS drive
- [CLI]: Update used templates to 0.15.0
bugfixes
- [General]: Fix slow deletion of environment, we were trying to delete files while they were in use. Now we make sure they are not in use before deleting them
0.54.6 (27-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.6/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.6/conveyor_darwin_amd64.tar.gz
features
- [General]: Upgrade node local dns to 1.21.1
- [General]: Use 1.0.0 of secrets csi driver on the Kubernetes cluster
- [Airflow]: Added the opsgenie provider to airflow 2
- [General]: Added support for spark 3.2.0
- [CLI]: Added explanation to datafy run about the default execution-date used to start your job
0.54.5 (26-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.5/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Migrate EFS logs storage to an encrypted volume
- [General]: Make datafy run more robust, and add extra logging when something goes wrong
0.54.4 (22-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.4/conveyor_darwin_amd64.tar.gz
features
- [General]: Upgrade to aws eks cni 1.9.3
- [General]: Migrate to new encrypted EFS volume for spark events
0.54.3 (20-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.3/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Support for environment variables when using the dbt task factory
- [General]: Upgraded components of the K8s cluster to newer version that use IMDSv2
- [General]: Enable deletion protection on the RDS database instance used by Airflow
- [General]: Enabled deletion protection and drop invalid headers on the ALB used by Conveyor
bugfixes
- [General]: Fix bug where Spark applications were never cleaned from Kubernetes when the Spark event log directory was never created
- [CLI]: Fixed an issue where the Conveyor yaml migration would update old projects to automatically use Airflow 2 validation. This now defaults to using Airflow 1.10 again
0.54.2 (15-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Make sure the necessary components for secrets are installed on all nodes
- [General]: Give a bit more memory to the components managing the metrics and the logs to keep them from running out of memory
- [General]: Make Airflow 2 validation the default for new projects, bringing it in line with Airflow 2 being the default for a new environment
0.54.1 (14-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.1/conveyor_darwin_amd64.tar.gz
features
- [General]: Remove the old (and unused) NLB, we migrated to an ALB in our previous release but kept this around to be able to reverse
bugfixes
- [CLI]: Fix
datafy run
and dag validation not working because of IAM credential issues.
0.54.0 (13-10-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.0/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 added support for mounting secrets from SSM and Secrets Manager as environment variables.
- [General]: Removed unneeded access to S3 from Conveyor clusters on other accounts
- [General]: Increased S3 security by not allowing non-SSL requests on Conveyor artifacts bucket
- [General]: Encrypt the root EBS volume of Kubernetes worker nodes
- [CLI]: Update templates to 0.14.0
0.53.3 (28-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.3/conveyor_darwin_amd64.tar.gz
features
- [UI]: Enable streaming UI by default, it was hidden behind a feature flag before
- [UI]: Enable RBAC UI by default, it was hidden behind a feature flag before
0.53.2 (28-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [UI]: Fix issues with the logs page
0.53.1 (27-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [CLI]: Fix broken table output
0.53.0 (27-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.0/conveyor_darwin_amd64.tar.gz
features
- [General]: Release spark streaming support
- [General]: Upgrade to postgresql 13
- [Airflow]: Upgrade to Airflow 2.1.4
- [General]: Make Airflow 2 the default version and deprecate Airflow 1 support, this means that in a future release Airflow 1 will be removed.
- [General]: Upgrade the aws k8s cni to version 1.9.1
- [General]: Adding support for mounting external environment variables
bugfixes
- [Airflow]: Handle connection issues with Kubernetes gracefully for V2 operators.
0.52.10 (16-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.10/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.10/conveyor_darwin_amd64.tar.gz
bugfixes
- [Airflow]: Fixed an issue where a spark submit could be scheduled a second time making the job fail in Airflow. This only happened very sporadically.
0.52.9 (16-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.9/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.9/conveyor_darwin_amd64.tar.gz
features
- [RBAC]: The Airflow and Spark UIs of a given environment are now only accessible for users who have the Operator/Contributor/Administrator role for this environment.
bugfixes
- [General]: Several small bug fixes in the code used by the Airflow v2 operator.
- [Airflow]: Fixing the dbt factory with dependencies on sources items.
0.52.8 (09-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.8/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.8/conveyor_darwin_amd64.tar.gz
features
- [RBAC]: Added the operator role to an environment. This role allows users to view and operate the airflow of an environment, but does not allow them to deploy new releases to an environment.
bugfixes
- [General]: Several small bug/performance fixes in the code used by the Airflow v2 operator.
0.52.7 (07-09-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.7/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.7/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Fixed a bug where Airflow thought a job was failed but the datafy UI showed the application was finished correctly. While looking for this bug we enhanced our code to make it easier to figure out the root cause next time.
0.52.6 (20-08-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.6/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.6/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Slack providers available in Airflow
- [Airflow]: Tag filtering on dbt task factory
- [CLI]: Selecting the airflow version for validation and run
0.52.5 (18-08-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.5/conveyor_darwin_amd64.tar.gz
features
- [General]: Upgrade the eks cni used to 1.9.0
bugfixes
- [Airflow]: Make sure the datafy runs button goes to the correct page
- [Airflow]: Airflow 2.1 introduced a new kubernetes setting worker_pods_pending_timeout that by default kills pods that have been pending for more than 300s. This sometimes resulted in jobs being killed, we are setting it to 600s by default and will look into making scheduling faster.
0.52.4 (17-08-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.4/conveyor_darwin_amd64.tar.gz
features
- [CLI]: Asking for the task when no task is passed to the run command
- [General]: Preparations for future releases
0.52.3 (13-08-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.3/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Fixed an issue where our operator managing airflow, spark-, and container-runs was crashing from a nil pointer. We fixed the issue and made sure it can't crash because of a single nil pointer.
0.52.2 (12-08-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [CLI]: Fixed a bug that broke the CLI
0.52.1 (12-08-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.1/conveyor_darwin_amd64.tar.gz
features
- [CLI]: When CLI is out of date recommend using datafy update to update the CLI
- [CLI]: Allow configuration files smaller than 1MB to be uploaded as part of the DAG folder
bugfixes
- [CLI]: Added retry logic when fetching logs with datafy run fails.
- [Airflow]: Fixed an issue where the Airflow 1 graph view would crash when using the ConveyorSparkSubmitOperator. Old task executions might still have issues but new ones will work again.
- [RBAC]: Fixed non-thread safe code in RBAC checking that resulted in a 500.
- [CLI]: Update templates to 0.12.1
0.52.0 (02-08-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.0/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Added on-demand options for the Airflow Conveyor V2 Operators. Look at the docs of ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 to find out more.
- [General]: We installed node local dns into the cluster, this should improve dns responsiveness and reduce errors. We also improved the robustness of the DNS setup
- [Airflow]: Conveyor application runs button on operators now goes to the project's page i.s.o. environment.
- [Airflow]: Added s3 committer option for the Airflow Conveyor V2 Spark Operator, allowing the usage of the S3 magic committer. Look at the docs of ConveyorSparkSubmitOperatorV2 to find out more.
- [CLI]: Added the datafy update command, this will replace your current executable with the newest version available.
- [CLI]: Fixed homebrew installation of zsh autocomplete, and added fish autocompletion via homebrew
- [Doc]: Added documentation on how to install datafy completion scripts on linux.
0.51.0 (27-07-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.51.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.51.0/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Added ConveyorExternalTaskSensor
- [Airflow]: Upgrade to airflow 2.1.2
- [General]: Memory optimisations on the tools running on the cluster
- [General]: Upgrade the eks cni to the latest 1.8 version as recommended by aws, and corrected its settings
- [General]: Upgrade eks cluster to version 1.21
- [General]: Use containerd as a container runtime i.s.o. docker
- [Spark]: released the following images with hadoop cloud support, and a python 3.8 installation with lzma support.
public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1-v2
public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1-python-3.8-v2
public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v2
public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-python-3.8-v2
- [Airflow]: Added acryl-datahub python package to the airflow images
bugfixes
- [Spark History Server]: We give it a gigabyte more memory so it can keep up with a lot of spark jobs being scheduled
- [General]: Fixed an issue when calculating memory for the datafy instance types
0.50.2 (13-07-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.2/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Upgrade airflow 2 version to 2.1.1
- [UI]: Added environment and project links to the task executions page
bugfixes
- [General]: Fixed a bug where spark applications sometimes weren't cleaned up properly
- [CLI]: The token was automatically refreshed 10 minutes after expiry, instead of 10 minutes before
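The token fix above comes down to the sign of a single offset. A minimal Python sketch of the idea, assuming a hypothetical needs_refresh helper and a 10-minute margin; neither is taken from the CLI's source:

```python
from datetime import datetime, timedelta, timezone

# Assumed margin; the release note mentions 10 minutes.
REFRESH_MARGIN = timedelta(minutes=10)


def needs_refresh(expires_at: datetime, now: datetime) -> bool:
    """Return True once we are within the margin before expiry.

    The behaviour described in the bugfix corresponds to comparing against
    expires_at + REFRESH_MARGIN, which only triggers after the token has
    already expired.
    """
    return now >= expires_at - REFRESH_MARGIN


expiry = datetime(2021, 7, 13, 12, 0, tzinfo=timezone.utc)
print(needs_refresh(expiry, expiry - timedelta(minutes=15)))  # False
print(needs_refresh(expiry, expiry - timedelta(minutes=5)))   # True
```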
0.50.1 (07-07-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [CLI]: Print the airflow version in environment list correctly
- [CLI]: Print logs correctly for the datafy run command
0.50.0 (07-07-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.0/conveyor_darwin_amd64.tar.gz
features
- [UI]: Added task executions to the project page
- [UI]: Task executors page: the task type is now shown as an icon
- [UI]: Allow task executions links to be opened in new tab
- [General]: Enable image scanning on push on project ECR images
- [General]: Upgrade eks to version 1.20
- [General]: Upgraded to terraform 1.0.1
- [Airflow]: The ConveyorContainerSensor has been released
- [Spark]: released the following images with improved logging output, and support for hadoop 3.3.1 and spark 3.0.3.
For migration to hadoop 3.3.1 see here.
public.ecr.aws/dataminded/spark-k8s-glue:v3.0.2-hadoop-3.3.0-v2
public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.0-v2
public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1
public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1
- [Airflow]: ConveyorSparkSubmitOperatorV2 now sets spark.hadoop.fs.s3a.aws.credentials.provider to com.amazonaws.auth.DefaultAWSCredentialsProviderChain by default
- [CLI]: Cleaned up the output of all commands
- [Template]: Update template to 0.11.0
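The credentials provider default mentioned in the features above can be pictured as a dictionary merge. This is only a sketch: DEFAULT_CONF and with_defaults are hypothetical names, and it assumes that values set explicitly by the user take precedence over the default, which the release note does not state.

```python
from typing import Dict, Optional

# The key and value come from the release note; the structure is an assumption.
DEFAULT_CONF: Dict[str, str] = {
    "spark.hadoop.fs.s3a.aws.credentials.provider":
        "com.amazonaws.auth.DefaultAWSCredentialsProviderChain",
}


def with_defaults(user_conf: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Apply defaults first, then user values, so explicit settings win."""
    return {**DEFAULT_CONF, **(user_conf or {})}


conf = with_defaults({"spark.executor.memory": "2g"})
print(conf["spark.hadoop.fs.s3a.aws.credentials.provider"])
# com.amazonaws.auth.DefaultAWSCredentialsProviderChain
```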
bugfixes
- [Spark History Server]: Improvements to make the spark history server more stable
- [CLI]: Text was sometimes printed incorrectly when using a spinner, resulting in words like: imagege, resourceses, ...
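Output such as imagege is typical of a shorter status text being written over a longer one on the same terminal line. The release note does not state the cause, so the Python sketch below only illustrates that assumed mechanism and one common remedy (padding the new text); the function names are hypothetical.

```python
def overwrite(previous: str, new: str) -> str:
    """Simulate what a terminal line shows after writing new over previous."""
    return new + previous[len(new):]


def overwrite_padded(previous: str, new: str) -> str:
    """Pad the new text with spaces so leftovers of a longer line are erased."""
    return overwrite(previous, new.ljust(len(previous))).rstrip()


print(overwrite("message", "image"))         # imagege
print(overwrite_padded("message", "image"))  # image
```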
0.49.3 (22-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.3/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Fixed a small issue in Airflow authentication
0.49.2 (21-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Fixed an issue with cleaning up old builds
0.49.1 (18-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [UI]: Fixed issues with rendering Airflow inside of our UI
- [General]: Fixed an issue with cleaning up old builds
0.49.0 (17-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.0/conveyor_darwin_amd64.tar.gz
features
- [UI]: Embedded airflow into the Conveyor UI. This means on the environment page you can look at Airflow without leaving the Conveyor UI. You can still open the Airflow UI full screen if you want to.
bugfixes
- [UI]: Fixed issues with pagination in the UI
0.48.1 (16-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [RBAC]: datafy project run resulted in unauthorized with RBAC enabled.
- [UI]: Users panel project/environment role selection is empty after assigning to a user
0.48.0 (11-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.0/conveyor_darwin_amd64.tar.gz
features
- [CLI]: Adding often used project commands to the root command.
- [General]: Improved the cleaning up of old builds, this should reduce the bill on your cloud account.
- [Airflow]: Upgraded to Airflow 2.1.0. A known issue related to the auto-refresh on the dag tree view exists, and will be fixed in a next release of Airflow: https://github.com/apache/airflow/pull/16018/files
- [General]: Upgraded the version of Terraform used by the Conveyor agent to 1.0.0.
bugfixes
- [Airflow]: Fix CSRF issues after Airflow Web restart.
- [Airflow]: ConveyorSparkSubmitOperatorV2 no longer results in an error when inputting an integer as value in the Spark config.
- [CLI]: DAG validation no longer warns about Conveyor plugins in Airflow.
- [Airflow]: An edge case is fixed in the V2 Operators where failed applications were not properly detected.
- [CLI]: The description in datafy environment new and datafy environment update no longer says to enable experimental mode for Airflow 2 (as this is not the case anymore).
- [General]: Fixed an issue where deleting a project could result in an update to an Airflow 2 environment failing.
- [General]: Allow CI/CD token access to the Airflow 2 API.
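The Spark config fix listed above (integer values no longer causing an error) suggests values are normalised before use. A minimal sketch of such a normalisation, assuming a hypothetical normalise_conf helper; this is not the operator's actual implementation:

```python
from typing import Any, Dict


def normalise_conf(conf: Dict[str, Any]) -> Dict[str, str]:
    """Render every Spark config value as a string.

    Booleans are lowercased because Spark expects true/false.
    """
    out: Dict[str, str] = {}
    for key, value in conf.items():
        if isinstance(value, bool):
            out[key] = str(value).lower()
        else:
            out[key] = str(value)
    return out


print(normalise_conf({"spark.executor.instances": 4, "spark.sql.adaptive.enabled": True}))
# {'spark.executor.instances': '4', 'spark.sql.adaptive.enabled': 'true'}
```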
0.47.2 (03-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.2/conveyor_darwin_amd64.tar.gz
features
- [General]: Airflow V2 operator, and datafy project run now detect applications being evicted because of disk pressure and will warn you about this happening.
- [CLI]: Added execution date support to datafy project run.
bugfixes
- [UI]: The logs UI was fetching finished application logs in the wrong order, this is now fixed.
0.47.1 (02-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [Spark]: Fix Spark history server upload when spark application names are very long.
0.47.0 (01-6-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.0/conveyor_darwin_amd64.tar.gz
features
- [UI]: The UI received some new paint. We migrated to a new framework which will allow us to make a UI that is more uniform and easier to maintain.
- [CLI]: Restructured the output of datafy project run to make it more focussed
- [General]: Using datafy Airflow operator V2 or datafy project run will now warn you if you use a container image that can't be pulled.
bugfixes
- [CLI]: Make datafy project run robust against print statements and logs in dags.
- [Spark]: Fixed a bug where setting spark.executor.cores and spark.driver.cores resulted in unauthorized with the ConveyorSparkSubmitOperatorV2.
0.46.2 (27-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.2/conveyor_darwin_amd64.tar.gz
features
- [CLI]: Turn on quiet flag with env variable QUIET=true.
- [Airflow]: Changed our recommendation warning in the ConveyorContainerOperator when setting cpu/memory limits. In short, you should not set these yourself.
- [General]: Make spark.driver.cores and spark.executor.cores user editable in the ConveyorSparkSubmitOperatorV2. This can be useful for IO/CPU bound jobs.
bugfixes
- [Airflow]: Fixed the ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 not sending application run events.
0.46.1 (26-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Bugfix where terraform wouldn't be destroyed when using datafy project undeploy
- [Spark]: Fixed the executors page in the spark history server.
- [UI]: Spark UI button was not working.
0.46.0 (26-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.0/conveyor_darwin_amd64.tar.gz
features
- [Spark]: Added spark history server support when using the ConveyorSparkSubmitOperatorV2.
- [CLI]: Clarify datafy project run documentation. This command does not support deploying resources; this is now made clear in the docs.
bugfixes
- [General]: Long project, dag or task names could result in jobs not being scheduled. This is now fixed.
- [General]: When pending applications were canceled they would stay in pending forever. They are now set to failed.
- [Airflow]: Fixed an issue when running an extra datafy cluster in another region/account.
0.45.3 (20-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.3/conveyor_darwin_amd64.tar.gz
bugfixes
- [Airflow]: Fixed some caching issues with forwarding the Airflow UI
0.45.2 (20-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.2/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Added ui colors to v2 operators
- [Airflow]: ConveyorSparkSubmitOperatorV2 added support for env_vars
- [Airflow]: ConveyorContainerOperatorV2 added support for legacy kube2iam way of assuming roles.
bugfixes
- [Airflow]: ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 would result in failures when the project name had an underscore.
The service account that contains the project name replaces underscores _ with dots.
0.45.1 (19-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.1/conveyor_darwin_amd64.tar.gz
features
- [CLI]: datafy project upgrade-dags also replaces the import for ExternalTaskSensor
bugfixes
- [Airflow]: Fix issues with long airflow task names and V2 operators
- [Airflow]: Fix issues with characters in task names that are not allowed on Kubernetes for V2 operators
- [UI]: Empty log pages will be skipped in cloudwatch to return the first non-empty page that can be found
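The two task-name fixes above relate to Kubernetes naming rules: a DNS-1123 label may contain only lowercase letters, digits and hyphens, must start and end with an alphanumeric character, and is limited to 63 characters. The sketch below shows one way to map arbitrary names onto that format; to_k8s_name is a hypothetical helper, and the hash-suffix scheme is an assumption, not the approach used by the V2 operators.

```python
import hashlib
import re

MAX_LABEL_LENGTH = 63  # DNS-1123 label limit used by Kubernetes


def to_k8s_name(raw: str) -> str:
    """Turn an arbitrary task name into a valid DNS-1123 label."""
    # Replace every run of disallowed characters with a single hyphen.
    name = re.sub(r"[^a-z0-9-]+", "-", raw.lower()).strip("-")
    if not name:
        raise ValueError(f"cannot derive a name from {raw!r}")
    if len(name) > MAX_LABEL_LENGTH:
        # Keep truncated names distinct by appending a short hash of the input.
        digest = hashlib.sha1(raw.encode("utf-8")).hexdigest()[:8]
        name = name[: MAX_LABEL_LENGTH - 9].rstrip("-") + "-" + digest
    return name


print(to_k8s_name("My_Task.Name!"))  # my-task-name
print(len(to_k8s_name("a" * 100)))   # 63
```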
0.45.0 (18-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.0/conveyor_darwin_amd64.tar.gz
The main feature of this release is a new version of our Airflow operators with datafy project run support.
This allows you to locally start a job on the remote cluster without having to build, deploy and clear the task in Airflow.
features
- [Airflow]: Airflow 2.0 has been upgraded to 2.0.2
- [Airflow]: Release ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2
- [CLI]: datafy project run for airflow tasks that use the new V2 Operators
- [CLI]: Dag validation also runs the NoAdditionalArgsInOperatorsRule from airflow, as undefined args give an error in Airflow 2.0.
- [Documentation]: Updated documentation structure
- [Template]: Set default templates version to 0.9.0.
bugfixes
- [CLI]: Fixing support for resource templates
- [General]: When deleting a project, environments weren't properly updated.
- [General]: Make our Kubernetes setup more tolerant to spot interruption failures.
0.44.5 (03-5-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.5/conveyor_darwin_amd64.tar.gz
bugfixes
- [Airflow]: Extending the job_heartbeat_sec from 5s to 10s and configuring a db connection timeout of 30 seconds to avoid jobs failing when missing a heartbeat.
0.44.4 (29-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.4/conveyor_darwin_amd64.tar.gz
features
- [Agent]: Upgrade our agent to use terraform 0.14.11
bugfixes
- [CLI]: CLI downloading was failing for the endpoint: https://app.conveyordata.com/api/info/cli/location/linux/amd64
0.44.3 (28-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.3/conveyor_darwin_amd64.tar.gz
bugfixes
- [Internal]: Reduce CPU load on airflow manager
0.44.2 (26-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [UI]: Some CSS changes made the logs font too big, some buttons overflow etc. We fixed these changes.
0.44.1 (26-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Airflow RDS proxy is not available in all regions, so we disable its use in regions that don't have it available
0.44.0 (26-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.0/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Airflow 2.0 has been made generally available.
- [Airflow]: We use a new way of deploying to Airflow, further speeding up deployments.
- [Airflow]: The ConveyorContainerOperator does not require you to fill in the name parameter anymore.
- [Airflow]: Airflow 2.0, enable access to the API using a datafy token. You can get the token with datafy auth get, look in the docs for more info.
- [Airflow]: Set default parallelism to 64 up from 32, and default dag concurrency to 32 up from 16.
- [CLI]: Remove experimental flag from cluster commands as this is GA.
- [Template]: Set default templates version to 0.8.1.
bugfixes
- [General]: Do not allow to deploy, undeploy, delete an environment that is being deleted.
- [UI]: Show a friendly message when logs aren't available yet instead of a generic error
- [Airflow]: Fixed a bug that stopped dags from being synced.
- [Airflow]: Added a proxy to the Airflow Database so we can handle more connections to the database than before.
0.43.1 (12-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [General]: Update agent terraform to 0.14.10, and update the terraform locked providers
0.43.0 (12-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.0/conveyor_darwin_amd64.tar.gz
features
- [DOCS]: Added Airflow 2.0 migration path to the docs.
- [Airflow]: Support for Airflow 2 has been made available for environments with the experimental flag enabled.
- [General]: Experimental environments can enjoy an even faster upgrade experience.
- [General]: The experimental flag on an environment can not be disabled anymore. This means we only have to make sure migrations work in one direction.
- [CLI]: The command datafy project upgrade-dags will update your dags for usage with Airflow 2. Your dags will still work on Airflow 1.
- [UI]: Revamped the logging UI. It will now show the latest logs when a job is finished. You can also choose to see the latest logs when the job is running.
- [Templates]: Upgraded to 0.8.0
0.42.0 (7-4-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.42.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.42.0/conveyor_darwin_amd64.tar.gz
features
- [General]: The experimental dag syncer has now become the default making deploys at least twice as fast!
- [General]: The terraform version used by our agent has been upgraded to 0.14, to enable this transition we added an automatic upgrade process from 0.12 to 0.13 to 0.14 to the agent.
- [General]: Added experimental flag to environment to enable experimental features. This flag will be used in the future to test new features. At the moment behind this flag we test a new way of deploying airflow.
- [Airflow]: Upgraded airflow to version 1.10.15
- [Airflow]: Added Conveyor macros validation to Conveyor dag validation. Macros should be changed from macros.env to macros.datafy.env
- [Docs]: We added docs about airflow alerting using the ConveyorContainerOperator
bugfixes
- [Airflow]: Fixed a bug when using the dag syncer where dag files were unavailable for a short time.
0.41.0 (24-3-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.41.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.41.0/conveyor_darwin_amd64.tar.gz
features
- [General]: EKS 1.19 upgrade, if you have the following in your DAG somewhere you can safely remove it:
security_context={
"fsGroup": 185,
}
- [General]: Updated the documentation url to https://docs.datafy.cloud
- [Airflow]: Airflow 2.0 warnings during dag validation
- [Airflow]: Made the Airflow UI more stable by running the http proxy component on fargate.
- [General]: Send less information to cloudwatch to reduce costs.
bugfixes
- [CLI]: Ignore the possible __pycache__ folder in the dags folder when building a project
- [UI]: Fixed issue where we failed in parsing the logs from cloudwatch to be shown in the UI
0.40.2 (19-3-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [UI]: Fixed an issue where certain logs couldn't be shown in the UI
0.40.1 (16-3-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [UI]: Fixed an issue where the newer and older logs buttons were not working anymore
- [Airflow]: Fixed an issue with experimental dag sync
0.40.0 (12-3-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.0/conveyor_darwin_amd64.tar.gz
If you are using resources with datafy: we now add a default provider "aws" when deploying your code. If you have a provider "aws" without an alias in your code, this can break things. To fix it, rename your provider to use an alias or remove it.
features
- [CLI]: Added build arg support to the CLI
- [UI]: Application runs page has a spark UI button added and the logs page is the default i.s.o. the metrics page
- [UI]: Application runs are now clickable on the environment page
- [Airflow]: Added an experimental dag sync way of deploying dags to airflow.
This should make deploys considerably faster (in the range of 20 - 40s for a deploy).
You can test this out in your development environment by updating the environment with the flag --experimental-dag-sync=true.
To update your environment do: datafy environment update --name YOURENVIRONMENT --experimental-dag-sync=true
However, it is not recommended for production.
- [Spark]: spark 3.1.1 images release:
public.ecr.aws/dataminded/spark-k8s-glue:v3.1.1-hadoop-3.3.0, datamindedbe/spark-k8s-glue:v3.1.1-hadoop-3.3.0
bugfixes
- [CLI]: When creation of a project fails we now clean up the project that was generated if you used a template
- [UI]: When pressing refresh logs button on the main logs page the URL path would get into an undefined state, you couldn't share this URL. This is now fixed.
0.39.1 (01-3-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [Airflow]: Fixed an issue rolling out or creating a new airflow environment
0.39.0 (01-3-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.0/conveyor_darwin_amd64.tar.gz
features
- [UI]: Added a refresh button to the application run log view
- [General]: The image field does not need to be specified to use Conveyor Airflow operators anymore, it uses the name of the project by default.
- [General]: There is no restriction anymore on the IAM role names that the Conveyor operators can use.
There used to be a constraint where the role needed to have the prefix datafy-dp-{env} but not anymore.
- [General]: Conveyor instance types now support all spark memory options
- [General]: We updated spark to version 3.0.2. New docker image available, see here.
- [Templates]: Updated the templates to use the spark 3.0.2 image.
- [Documentation]: Updated the CI/CD authentication documentation, see here
- [Documentation]: Added a documentation page about setting up Conveyor using WSL 2 on Windows, see here
0.38.0 (15-2-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.38.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.38.0/conveyor_darwin_amd64.tar.gz
features
- [General]: Allow the terraform resources to attach existing AWS policies
- [General]: Support datafy instance types. See here and here
- [General]: Optimise IP usage by the Kubernetes cluster
- [Templates]: Update templates to the latest version. See here for release notes
0.37.0 (29-1-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.37.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.37.0/conveyor_darwin_amd64.tar.gz
features
- [UI]: Added logs to the application runs UI. From now on you can view the logs of your application directly in the Conveyor UI instead of being redirected to AWS CloudWatch.
- [CLI]: Added flag --no-browser to the CLI. This prevents the Conveyor CLI from opening a browser automatically, and instead prints the url for logging in.
bugfixes
- [CLI]: Fixed wrong upload message when deploying a project.
- [CLI]: Print an error when files needed can't be read during project build instead of silently crashing
- [General]: Fixed one last small instance where we used a docker hub image instead of public ECR
- [General]: If using project resources and if you still had a state.tf file things would break because of a new terraform Kubernetes provider release. We now fixed this.
0.36.0 (15-1-2021)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.36.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.36.0/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: We are migrating certain pieces to be Airflow 2.0 compatible. This should have no impact on your environment:
- Migrated our spark submit operator to use the backported spark submit operator from Airflow 2.0.
- Migrated the Kubernetes executor config to be Airflow 2.0 compatible
- [CLI]: Upgraded to datafy-template 0.4.0 which uses the new import paths for operators used in Airflow 2.0
- [General]: Migrated to use public ECR where possible
bugfixes
- [UI]: In the metrics page certain charts were not properly unloaded, resulting in a memory leak and a slow page
- [UI]: Fixed a bug where the metrics page could crash when no spark executors were started up.
- [UI]: Fixed a small bug where we were showing a wrong message in the metrics page when your application was not running for long enough
- [General]: removed m5zn type aws instances from autoscaling groups as they gave issues with Kubernetes.
- [CLI]: Skip airflow dag validation in datafy project build when the project does not contain dags
- [CLI]: Fixed the wrong error message when trying to apply a non-existing template in datafy template apply
0.35.1 (30-12-2020)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.1/conveyor_darwin_amd64.tar.gz
features
- [General]: Migrated to use the new GP3 volumes instead of GP2, these are 20% cheaper, provide more IOPS and same throughput. This will make IO intensive jobs on the platform faster.
- [General]: The Kubernetes autoscaler will now downscale nodes after 5 minutes instead of 10m.
- [General]: Made our spark images available on ECR public: https://gallery.ecr.aws/dataminded/spark-k8s-glue
- [CLI]: Updated to the latest datafy templates release
bugfixes
- [UI]: Changed the airflow logo to the new flat version
- [UI]: Fixed the Conveyor logo to work on all platforms (unix, mac and windows)
- [UI]: Spark jobs with a lot of executors (40+) could not show all of their executors in the legend of the metrics page. This has been fixed.
- [UI]: Metrics page, failure reason did not show properly when hovering over the information icon.
- [General]: Better cleanup of resources in the Kubernetes cluster after building airflow images
- [General]: Installation of the Conveyor cluster would fail in an existing VPC network. This has been fixed
- [General]: Migrated the installation of the kube2iam Helm chart to the new version
0.35.0 (22-12-2020)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.0/conveyor_darwin_amd64.tar.gz
features
- [Airflow]: Upgraded airflow to 1.10.14, this should fix some scheduling issues users experienced with depends_on_past or task_concurrency. For more information see the airflow release notes.
- [General]: Made environment rollout 10 to 20s faster by caching the terraform providers used when rolling out an environment.
bugfix
- [CLI]: Dag validation was failing when airflow variables were used in the dags. This has been fixed in this release.
0.34.1 (09-12-2020)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.1/conveyor_darwin_amd64.tar.gz
features
- [General]: Added an experimental feature that allows you to restrict which aws roles a project can use.
This feature is hidden behind an experimental flag and can be enabled in the CLI by setting the environment variable CONVEYOR_EXPERIMENTAL=1. More documentation on how to use this feature will follow later.
- [General]: We added the new m5zn instance in our spot pools for Kubernetes. That way we have even more instance types to choose from, this should result in fewer spot interrupts.
- [Templates]: Update to templates version 0.3.2
bugfix
- [General]: Because of our rewrite of our control plane we had a regression. When you delete a project we normally trigger it to be removed from all environments. This behaviour has been reinstated.
- [CLI]: Airflow dag validation failed if you imported for example a utils file from your dags folder into your DAGs. We now do validation correctly so that this isn't mistakenly flagged as an issue.
- [CLI]: Airflow dag validation failed when you were using it outside the aws eu-west-1 region. This has been fixed.
- [UI]: When you opened an application run on a spark application without executors, the UI would still try to fetch the metrics for these executors, resulting in a crash.
0.34.0 (04-12-2020)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.0/conveyor_darwin_amd64.tar.gz
features
- [CLI]: Added dag validation to the CLI. When you do a datafy project build, a dag validation phase is done first. This just checks if your DAG can run on airflow. You can skip this phase if you want to. You can also use datafy project validate-dags to validate the dags of your project without doing a build.
- [Airflow]: Upgraded Airflow to 1.10.13
- [Doc]: Added documentation on how to run a container as non-root with the ConveyorContainerOperator.
- [General]: Upgraded the k8s environment to EKS 1.18
- [General]: We now run the k8s autoscaler as a high priority pod, so that it always gets priority. This makes sure the cluster can always autoscale when needed.
- [General]: We enabled the free autoscaling metrics for our autoscaling groups. That way you can see when the limit is reached, and we can more easily recommend when to scale it up.
bugfix
- [CLI]: The Conveyor CLI returned exit code 0 when a build failed. This has now been fixed to return an exit code 1. This makes it easier to chain multiple commands or to find out in CI/CD that a build has failed.
- [Airflow]: When you manually trigger a job and then use the Conveyor application runs button you would find nothing. This has now been fixed.
- [General]: Updating a project description failed when there were more than 256 characters used. We updated this field to take a bigger amount of text.
- [General]: Fixed the Conveyor logo in the UI
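The exit-code fix in the bugfix list above matters mostly for automation. Below is a minimal sketch of a CI step that stops at the first failing command. It assumes the datafy CLI is on the PATH; the run_steps helper is hypothetical, and the only CLI invocation mentioned is datafy project build, taken from these notes.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[List[str]]) -> int:
    """Run commands in order; return the first non-zero exit code, else 0."""
    for cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Report which step failed and stop, so later steps never run.
            print(
                f"step failed with exit code {result.returncode}: {' '.join(cmd)}",
                file=sys.stderr,
            )
            return result.returncode
    return 0


# Example call (not executed here); add further steps with the
# arguments your own setup requires:
#   sys.exit(run_steps([["datafy", "project", "build"]]))
```

Because a failed build now returns a non-zero code, a wrapper like this (or a plain shell `&&` chain) can rely on the return value instead of parsing the build output.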
0.33.1 (20-11-2020)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.1/conveyor_darwin_amd64.tar.gz
bugfix
- [UI]: Fixed a bug where we showed the wrong finished date in the application run detail page
- [UI]: We showed a NaN duration when the application was not yet finished in the application run detail page. Now we show the current duration
- [UI]: We showed a message that no metrics were available yet while they were visible. This is now fixed
- [General]: By accident we used an image from docker hub that ran into rate limiting. We now also copied that one to ECR.
0.33.0 (20-11-2020)
CLI
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.0/conveyor_darwin_amd64.tar.gz
features
- [General]: We now offer CPU/Memory metrics of your jobs running on Conveyor. If you go to the application runs page, you can for every run go to the metrics of that run.
- [CLI]: We now show a warning when you are trying to deploy new dags or resources for your project, but first forgot to do a build.
- [CLI]: When an undeploy or promote fails we now show you the last event on that environment, similar to how deploys work.
- [General]: We have updated our API to a new technology to be sure we can continue growing. This means that old versions of the CLI (< 0.30.0) might not fully work any longer, so please upgrade to the latest CLI version
0.32.0 (16-11-2020)
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.32.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.32.0/conveyor_darwin_amd64.tar.gz
features
- [CLI]: Conveyor project build now uses the same credentials as the docker CLI, allowing you to pull images from private registries
- [CLI]: The datafy project undeploy and datafy project promote commands now also show the latest event when they fail, like the datafy project deploy command.
- [General]: Send CPU and Memory metrics of jobs to cloudwatch. Later we will make these metrics available in the UI.
- [General]: Duplicated the spark images on docker hub on ECR and shared them with our customers. This is a temporary fix for docker hub rate limiting until aws releases their solution. For more info read the following article: https://aws.amazon.com/blogs/containers/advice-for-customers-dealing-with-docker-hub-rate-limits-and-a-coming-soon-announcement/. The repository is: 776682305951.dkr.ecr.eu-west-1.amazonaws.com/datafy/data-plane/mirroring/datamindedbe/spark-k8s-glue
0.31.2 (30-10-2020)
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [Application runs]: The application runs app was crashing because of a value that overflowed a 32bit integer
0.31.1 (30-10-2020)
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [Agent]: When manually deleting the role created with project resources, the agent would fail applying the terraform update because of insufficient AWS IAM rights
- [Airflow]: An error was popping up in airflow that did not fail your job but was confusing. We have fixed this, so you should not see this error anymore.
- [Templates]: Bumped templates to the next version, this fixed problems with the dbt template: https://github.com/datamindedbe/datafy-templates/releases/tag/0.3.1
0.31.0 (30-10-2020)
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.0/conveyor_darwin_amd64.tar.gz
features
- [General]: Added description field to environment.
- [UI]: Added a markdown editor for the description field for project and environment. This allows you to give users more context to your projects and environment.
- [UI]: Added a git repository field to projects. By filling in this field we can provide links to the actual git hashes that deployments were built with.
- [Doc]: Added more info on the available parameters for the Conveyor spark and container operators
- [General]: We mirrored all images used in the Kubernetes cluster on ECR, since docker hub has started rate limiting users.
bugfixes
- [General]: When deleting a project something went wrong in our backend, causing the project to remain stuck in deleting mode
- [CLI]: When something went wrong with generating projects we sometimes produced a very cryptic long error message. The unnecessary parts were cleaned up and the error message is smaller now.
0.30.0 (23-10-2020)
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.30.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.30.0/conveyor_darwin_amd64.tar.gz
features
- [CLI]: Improved date printing to be less verbose
- [General]: Added created at field to all objects. These can be seen in the UI and the CLI.
- [General]: Added last activity field to projects. This field shows the last build/deployment done for a project and can be used to see how inactive a project is. This can help you decide if a project is actively developed or not.
bugfixes
- [CLI]: Creating a new project with a template that is from a git repository was broken. It is now fixed
0.29.2
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.2/conveyor_darwin_amd64.tar.gz
bugfixes
- [Internal]: This is a release with only internal changes in the way we capture metrics
0.29.1
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [CLI]: When you created a build and had untracked files you would get a non-dirty git hash. This has been fixed.
- [General]: Fixed a bug where if an application got in a certain state the airflow ui would stop working and the application runs would stop updating.
0.29.0
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.0/conveyor_darwin_amd64.tar.gz
features
- [General]: We now add git information to a build. When you do datafy project build we store the git hash that has been used to create this build. If you have uncommitted changes we add the .dirty suffix. You can see this information when you use datafy project deployments or datafy environment deployments, or in the UI. If the build is done outside a git repo no information is added. You need to upgrade to the CLI 0.29.0 to take advantage of this feature.
- [General]: Better project and environment cleanup. When removing a project we now delete the associated files on S3. We also clean up the associated ssm parameters we are using to track deployed versions on environments.
- [Documentation]: The spark 3 docker images don't run under root anymore. But this makes pip install fail. We documented the way to do this properly here.
- [Documentation]: Added documentation on the changes in the spark 3.0.1 image here.
- [Templates]: Upgraded to the latest version of the templates, see release note here.
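The git information described in the first feature above can be approximated with two git commands. This is a sketch only. It assumes the hash comes from git rev-parse HEAD and that any output from git status --porcelain (which also lists untracked files) marks the working tree as dirty. How the CLI actually computes this is not stated in these notes, and both function names are hypothetical.

```python
import subprocess
from typing import Optional


def format_build_hash(commit: str, porcelain_status: str) -> str:
    """Append '.dirty' when the working tree has uncommitted or untracked changes."""
    return f"{commit}.dirty" if porcelain_status.strip() else commit


def git_build_hash(repo_dir: str = ".") -> Optional[str]:
    """Return the build hash for repo_dir, or None outside a git repository."""

    def git(*args: str) -> subprocess.CompletedProcess:
        return subprocess.run(
            ["git", *args], cwd=repo_dir, capture_output=True, text=True
        )

    head = git("rev-parse", "HEAD")
    if head.returncode != 0:
        # Not a git repo (or no commits yet): add no git information.
        return None
    status = git("status", "--porcelain")
    return format_build_hash(head.stdout.strip(), status.stdout)
```

Keeping the formatting in a separate pure function makes the dirty rule easy to check without needing a real repository.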
bugfixes
- [CLI]: Fixed a typo in delete environment description.
0.28.0
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.28.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.28.0/conveyor_darwin_amd64.tar.gz
features
- [Documentation]: We got a bug report that in certain use cases logs for a ConveyorContainerOperator did not show up in both Airflow and Cloudwatch. The reason for this is that python print statements don't behave properly in a production environment. It's best to use the python logging framework; for more information see the FAQ.
- [Airflow]: We only keep the last 14 days in airflow logs. This is the same retention we have in cloudwatch now. This should make the logs volume smaller and result in a cost reduction.
- [Airflow]: We added a liveness probe to the airflow scheduler to check if it's still running and restart it if it isn't.
- [UI]: When your ConveyorContainerOperator application dies with an out of memory error we show this in the UI. When you use the ConveyorSparkSubmitOperator we show you when your driver has died with an out of memory error.
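The logging advice in the Documentation item above can be sketched as follows. This is a minimal example, assuming a plain Python entry point for a container job. The logger name my_job and the run_job function are hypothetical, and the setup recommended in the FAQ may differ.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Send log records to stdout so the container runtime can collect them."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.setLevel(level)
    # Replace existing handlers so repeated calls do not duplicate output.
    root.handlers = [handler]
    return logging.getLogger("my_job")  # hypothetical logger name


def run_job(logger: logging.Logger) -> int:
    """Hypothetical job body: log progress instead of calling print()."""
    logger.info("job started")
    rows = 3  # placeholder for real work
    logger.info("processed %d rows", rows)
    return rows


# Example call (not executed here):
#   run_job(configure_logging())
```

Configuring the root logger once at start-up means library code that uses logging.getLogger(__name__) is routed through the same handler.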
bugfixes
- [Airflow]: There was a bug in the airflow UI that could show you were logged in under another user. This is now fixed.
- [Airflow]: When you have a DAG in airflow with many tasks, the frontend code in airflow makes it very slow to open the modal when you click a task. We set the default number of runs in the tree view to 5, down from 25, as this helps the javascript code to be faster. See this ticket in airflow Jira.
- [CLI]: When building a docker image that takes longer than 15 minutes you got a timeout error. This limit has been raised to an hour.
- [General]: When deleting an environment we now also clean up the database associated with it.
0.27.0
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.27.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.27.0/conveyor_darwin_amd64.tar.gz
features
- [Spark]: Release spark image with spark 3.0.1 and hadoop 3.0.0 support
bugfixes
- [Airflow]: The official airflow image runs under user 50000, which is more secure than running under root. But all airflow logs were still owned by user root, so we added a chown to the scheduler to update those. However, this can take a very long time for production airflow instances. We now do this separately from the scheduler and only once, so the scheduler will start up fast again once we have upgraded.
- [Airflow]: The list task instances page had log buttons that redirected to the wrong page. This is now fixed.
- [Airflow]: We got a bug notification that editing a dag run showed a forbidden page. However, editing dag runs has been deprecated in airflow. The button will be removed in a later release. For more info see here and here.
- [Templates]: We upgraded to version 0.2.2 of the templates that contains 3 bugfixes. For more information look here.
- The pyspark template spark 2.4 support was fixed.
- When using the python image when you don't want role management we clean up the terraform files
- We upgraded to spark 3.0.1 and hadoop 3.0.0
- [General]: Creating projects with a name that starts with a dash (-) or underscore (_) resulted in a failure. We no longer allow such names.
0.26.0
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.26.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.26.0/conveyor_darwin_amd64.tar.gz
features
- [Agent]: We reworked the agent in the background to be more future-proof. Normally this should have no impact for you but it allows us to make more/faster progress in the future. We also created more end-to-end tests to have more quality checks before we release.
- [Airflow]: We upgrade to the official airflow image here. Again this should have no impact for users but allows us to make fast progress in the future.
- [General]: The bastion used for datafy forward has been removed since it is not needed anymore. This will result in a small cost saving.
bugfixes
- [Airflow]: When going to view a rendered task instance you got an error. This has now been fixed and rendered task instance can be shown in the UI.
- [CLI]: On linux, generating templates resulted in files being created with root ownership. This has been fixed.
- [CLI]: The unlock, get and events commands did not work on environments in a failed state. This has been fixed.
0.25.1
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.1/conveyor_darwin_amd64.tar.gz
bugfixes
- [CLI]: Fixed a bug where datafy template apply did not work with resource templates
0.25.0
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.0/conveyor_darwin_amd64.tar.gz
CLI1 is deprecated, but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.
features
- [Airflow]: Upgraded airflow to 1.10.12
- [Airflow]: Upgraded to use the RBAC UI of airflow. The RBAC UI of airflow is the new UI. Only this UI will receive updates and new features. In airflow 2.0 the previous UI will be deprecated.
- [Airflow]: Added links to the datafy application runs dashboard from airflow when you select a task in airflow. See picture below. It automatically filters to show the runs of this dag and task.
bugfixes
- [General]: Show better error message when deleting an environment that has deletion protection turned on
0.24.2
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.2/conveyor_darwin_amd64.tar.gz
CLI1 is deprecated, but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.
bugfixes
- [Airflow]: The Kubernetes pod operator has supported specifying ephemeral storage on Kubernetes since airflow 1.10.11, but the way it is implemented sets it to zero by default. In this release we set the limit to 100Gi by default so that jobs using local storage can continue working. For the bugs logged in airflow, see https://github.com/apache/airflow/issues/9827 and https://github.com/apache/airflow/issues/9812. The issue is fixed in this PR: https://github.com/apache/airflow/pull/10084/files. Once that is available in a release we will remove our fix.
0.24.1
CLI2
brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.1/conveyor_darwin_amd64.tar.gz
CLI1 is deprecated, but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.