Release notes

1.16.1 (18-04-2024)

security

bugfixes

  • [CLI]: Remove leftover debug output

1.16.0 (16-04-2024)

preview

  • [IDE]: This release introduces support for creating custom base images for your IDEs and sharing them across projects. This feature is currently in preview. If you want to try it out, start with the introduction video or the how-to-guide for more details.

bugfixes

  • [Airflow]: Correctly detect when an invalid container image is provided as argument of the Conveyor operators.
  • [UI]: Disable downloading of streaming application logs for now until we are sure how to handle it.
  • [UI]: Fix an issue where navigating the streaming application logs worked incorrectly.

features

  • [IDE]: Allow editing user settings within an IDE.
  • [Airflow]: Added the possibility to use the xcom value from a ConveyorContainerOperatorV2 in dynamic task mapping. For more info, see here

1.15.7 (09-04-2024)

features

  • [IDE]: Made IDEs more robust to memory kills. The IDE service itself now has a memory limit set, so instead of the whole container crashing, only the IDE service restarts. This way your local changes to the IDE or your code do not get lost!
  • [IDE]: Allow users to remove IDE settings through the UI and CLI

1.15.6 (09-04-2024)

Release skipped

1.15.5 (03-04-2024)

features

  • [General]: We have a new logo!
  • [Airflow]: Upgrade Airflow to 2.8.4
  • [AWS]: Upgrade karpenter from v0.35.1 to version v0.35.4

bugfixes

  • [UI]: Filtering the application logs on a term containing parentheses works again.
  • [AWS]: Improve the handling of missing timestamps in the application logs.
  • [General]: The allowed characters for project names have been made stricter. This prevents issues when creating the container repository for a project.
  • [General]: Deleting the same project multiple times will no longer result in an error.
  • [General]: Improve the detection of spot node interrupts in Airflow.

1.15.4 (28-03-2024)

bugfixes

  • [General]: Fix an issue with cleaning up old builds
  • [General]: Fix an issue with new auth0 users not having the tenant information attached

1.15.3 (26-03-2024)

features

  • [Spark]: Introduce mode cluster-v2 in the ConveyorSparkSubmitOperatorV2. This mode speeds up the launching of your Spark job by up to 3 minutes. It is the new default mode when using conveyor run, but can also be set in the operator. After extensive testing and feedback it will become the default for the operator as well.
  • [Airflow]: Allow xcom to be used with ConveyorContainerSensors in the same way as the ConveyorContainerOperatorV2
  • [Spark]: Automatically prefix the application property of the ConveyorSparkSubmitOperatorV2 with local:// if it is not present. For more details look here
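
The local:// prefixing described above can be pictured with the small sketch below. It is only an illustration of the rule "add the prefix if it is not present"; the function name and the example path are hypothetical and not part of the Conveyor API, and how other URI schemes are treated is not covered by this note.

```python
LOCAL_PREFIX = "local://"


def ensure_local_prefix(application: str) -> str:
    """Return the application path with a local:// prefix, adding it only when missing.

    Hypothetical helper for illustration; not the operator's actual implementation.
    """
    if application.startswith(LOCAL_PREFIX):
        return application
    return LOCAL_PREFIX + application


# Both spellings end up as the same value (the path is a made-up example).
print(ensure_local_prefix("/opt/spark/work-dir/app.py"))
print(ensure_local_prefix("local:///opt/spark/work-dir/app.py"))
```

In practice this means you can pass either form to the application property of the operator.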

bugfixes

  • [Airflow]: Revert acryl-datahub-airflow-plugin to version 0.12.1 as the latest version introduced a breaking change in the Ownership model.

1.15.2 (19-03-2024)

features

  • [Airflow]: Upgrade Airflow to 2.8.3

bugfixes

  • [General]: Fixed a bug where canceling an application might result in the state getting stuck in canceling. This happened when a canceled application completed successfully within the 30s window in which a canceled application can cleanly shut down.

1.15.1 (13-03-2024)

features

  • [AWS]: Upgrade EKS VPC CNI from v1.15.5-eksbuild.1 to version v1.16.4-eksbuild.2

bugfixes

  • [UI]: Improve the error handling when updating the environment/project settings
  • [UI]: Ensure the chart area of the executor metrics remains visible when many executors need to be shown
  • [UI]: Fix regression in the text color of Airflow tasks when using dark mode

1.15.0 (05-03-2024)

features

  • [Airflow]: Upgrade Airflow to 2.8.2
  • [Airflow]: Upgrade acryl-datahub-airflow-plugin to 0.12.1.5
  • [Spark]: Remove support for the spark_main_version variable. It was only needed for Spark 2.x support, which has been deprecated for a long time
  • [DBT]: Released new dbt image: public.ecr.aws/dataminded/dbt:v1.7.8
  • [IDE]: Upgrade code-server to v4.21.1
  • [Spark]: We've released new images for Spark 3.5.1, this also updates the delta lake (3.1.0) and iceberg (1.4.3) libraries:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.5.1-hadoop-3.3.6-v1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.5.1-2.13-hadoop-3.3.6-v1
  • [Templates]: Upgraded templates to 1.6.0.

bugfixes

  • [UI]: The state of the Airflow UI is now fully synchronized to the page URL, allowing for easier link sharing and navigation.
  • [UI]: Fixed an issue with IDE build permission in the UI
  • [UI]: Fixed a bug where undeploying a project would not update the deployment list
  • [UI]: Make sure the error message is correctly shown in the UI when something goes wrong.
  • [General]: Fixed the processing of errors of jobs to clearly label disk pressure issues

1.14.8 (22-02-2024) - Hotfix

bugfixes

  • [Azure]: Hotfix for node creation on Azure

1.14.7 (21-02-2024)

bugfixes

  • [Docs]: Remove trailing slashes from URLs so that links keep on working.
  • [CLI]: The previous release broke interactive task selection in the conveyor run command, this now works again.
  • [General]: Improve handling of cancelled tasks

1.14.6 (14-02-2024)

features

  • [Airflow]: Add a new macro to fetch the project that a task belongs to
  • [dbt]: Release a new dbt image, based on dbt 1.7.7
  • [IDE]: Upgrade code-server to v4.20.1
  • [UI]: Log filtering is now case-insensitive
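
As an illustration of what case-insensitive log filtering means in practice, here is a minimal sketch. It assumes plain substring matching on the search term; the function name and log lines are hypothetical, and the UI's actual implementation may differ.

```python
def filter_logs(lines, term):
    """Keep the lines that contain the term, ignoring upper/lower case.

    Plain substring matching is assumed, so characters such as parentheses
    in the term are matched literally.
    """
    needle = term.casefold()
    return [line for line in lines if needle in line.casefold()]


logs = [
    "INFO Starting job",
    "ERROR Failed to read table (orders)",
    "info job finished",
]
print(filter_logs(logs, "info"))
print(filter_logs(logs, "(ORDERS)"))
```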

bugfixes

  • [IDE]: The spark-history executors page will now be properly rendered when run inside an IDE.
  • [UI]: Fix an issue where the Spark Submitter logs failed to show.
  • [UI]: Fix a visual bug when deleting two projects after each other.

1.14.5 (31-01-2024) - Hotfix

features

  • [Docs]: The implementation of the search functionality has been changed to offer better results.

bugfixes

  • [General]: Fix our integration with Auth0

1.14.4 (30-01-2024)

features

  • [Spark]: Our ConveyorSparkSubmitOperatorV2 now supports setting the --verbose flag on the spark-submit command
  • [Airflow]: This change has also been incorporated in the new version of the types-conveyor package

bugfixes

  • [AWS]: We now correctly handle an additional error case during secret mounting.
  • [AWS]: Switch to using the regional STS endpoints for the cloudwatch agent.

1.14.3 (23-01-2024)

features

  • [UI]: Show the logs of an IDE build
  • [UI]: Show the SSO groups of users in the UI
  • [CLI]: Allow changing the spark image used by conveyor project spark-history
  • [Airflow]: Update python types-conveyor package to version 0.0.6, take a look here
  • [Terraform]: Released a new version of the terraform provider

bugfixes

  • [UI]: Make sure IDP Group mapping syncing happens on every logout/login cycle
  • [CLI]: Make it possible to open the spark history in a Conveyor IDE
  • [IDE]: sudo is no longer required in order to update the Conveyor CLI within an IDE

1.14.2 (17-01-2024)

bugfixes

  • [General]: Fixed an issue when tailing logs of pods. When using conveyor run or conveyor ide cache, sometimes errors could occur when following the logs. We now wait until the node is fully ready before trying to show the logs.
  • [IDE]: Fixed the "failed writing project config" error when executing conveyor run inside IDEs.
  • [IDE]: Properly set the project folder used by git clone during IDE startup
  • [IDE]: The .bashrc config file should now remain constant between restarts

1.14.1 (15-01-2024)

features

  • [CLI]: You can now change where the conveyor CLI stores settings and tokens by setting the CONVEYOR_HOME environment variable
  • [IDE]: Upgrade code-server to v4.20.0
  • [IDE]: Upgrade sysbox to version 0.6.3
  • [IDE]: Added the possibility to use a Snowflake setup with SSO inside a Conveyor IDE. Look at the how-to-guide for more details
  • [AWS]: Upgrade eks to 1.28
  • [Azure]: Upgrade aks to 1.28
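
The CONVEYOR_HOME override mentioned above follows a common pattern: use the environment variable when it is set, otherwise fall back to a default location. The sketch below shows that pattern only. The helper name is hypothetical, and the default directory is passed in as a parameter because this note does not state where the CLI stores its settings by default.

```python
import os
from pathlib import Path


def resolve_home(default, environ=None):
    """Return the settings directory, preferring the CONVEYOR_HOME variable.

    `default` is whatever location would be used without the override; it is a
    placeholder here, not the CLI's documented default.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("CONVEYOR_HOME")
    return Path(override) if override else Path(default)


# Example with explicit environments so the result does not depend on your shell.
print(resolve_home("/home/me/settings", {"CONVEYOR_HOME": "/tmp/conveyor-ci"}))
print(resolve_home("/home/me/settings", {}))
```

This is useful, for example, to keep tokens for a CI job separate from your personal settings.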

bugfixes

  • [CLI]: Using conveyor run for a job that failed with secrets access would not give any helpful error message. The error message is now properly propagated.
  • [Airflow]: Fixed an issue where the scheduler could stop scheduling properly when sending out alerts for a failed dag run

1.14.0 (10-01-2024)

features

  • [General]: Allows teams to be mapped to groups from the SSO provider
  • [Airflow]: Support the automatic configuration of connections/variables for your Airflow environments. The content is loaded from a secret store (e.g. AWS secrets manager or Azure Key vault). For more information look here
  • [AWS]: Added support for C instances, for information on cpu and memory see Instances
  • [UI]: Added support for undeploying a project from an environment in the UI
  • [UI]: Added multiline parsing for python and java stack traces. This makes it easier to see your stack trace when using search functionality
  • [IDE]: Sped up shutdown of the IDE when suspending, this means the snapshot process can start earlier and suspending will be faster
  • [General]: Switch links in our emails from the legacy https://datafy.cloud... to https://conveyordata.com... domain.
  • [CLI]: Improve conveyor auth configure documentation
  • [AWS]: Upgrade coredns to version v1.10.1-eksbuild.6
  • [AWS]: Upgrade karpenter from v0.32.4 to version v0.33.1
  • [AWS]: Upgrade eks vpc cni from v1.15.3-eksbuild.1 to version v1.15.5-eksbuild.1
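
To give an idea of what multiline parsing of stack traces involves, the sketch below groups continuation lines with the log record they belong to. It assumes that every new record starts with a timestamp, which is one common heuristic; the pattern and function name are hypothetical, and this is not a description of how the Conveyor UI implements it.

```python
import re

# Assumption: a new log record starts with a timestamp such as "2024-01-10 12:00:00".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def group_multiline(lines):
    """Merge lines without a leading timestamp into the preceding record."""
    records = []
    for line in lines:
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += "\n" + line
    return records


raw = [
    "2024-01-10 12:00:00 INFO starting job",
    "2024-01-10 12:00:01 ERROR job failed",
    "Traceback (most recent call last):",
    '  File "job.py", line 3, in <module>',
    "ValueError: bad input",
    "2024-01-10 12:00:02 INFO shutting down",
]
for record in group_multiline(raw):
    print(record)
    print("---")
```

With the trace kept together as one record, a search for the exception name returns the whole stack trace instead of a single line.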

bugfixes

  • [Airflow]: Bring back the old behavior to support configuration options when triggering a DAG in the UI. This was removed in Airflow 2.7.0; the new advice is described here
  • [UI]: Fix an issue where adding users to environment/projects used the wrong user in some cases
  • [Notebook]: Do not allow the creation of notebooks that have a too long name (>63 characters)
  • [UI]: Fixed an issue with wrapping long lines in the logs
  • [General]: Fixed a rights issue when downloading logs as a non-admin
  • [IDE]: When storing too much data (apt packages, pip install, or data) inside your IDE it could happen snapshotting would fail because we ran out of storage. We made snapshotting more storage efficient and faster, and now mount a separate volume. This way we can never run out of storage.
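
The notebook name limit from the list above can be checked before creating a notebook. The sketch below is a hypothetical client-side check, not the validation code used by Conveyor; only the 63-character limit comes from this release note.

```python
MAX_NAME_LENGTH = 63  # limit mentioned in the release note


def validate_notebook_name(name: str) -> None:
    """Raise ValueError when the notebook name is longer than 63 characters."""
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(
            f"notebook name is {len(name)} characters long, "
            f"the maximum is {MAX_NAME_LENGTH}"
        )


validate_notebook_name("exploration")  # passes silently
try:
    validate_notebook_name("n" * 64)
except ValueError as error:
    print(error)
```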

1.13.1 (20-12-2023)

features

  • [UI]: Allow admins to disable the Notebooks and/or IDE features
  • [CLI]: Switch to using the regional STS endpoint by default
  • [IDE]: The AWS CLI now comes pre-installed in the IDEs
  • [AWS]: Karpenter has been upgraded to 0.32.4

bugfixes

  • [IDE]: Fix a small issue where certain images displayed in the readme were not properly loaded
  • [General]: Fixed the caching of terraform provider on the agent, this results in faster deploys of resources

1.13.0 (11-12-2023)

features

  • [UI]: Added a project create button
  • [UI]: Allow you to make the git repo blank in the UI
  • [UI]: Added the IDE button for Conveyor on the project page even if the git repo is not specified
  • [Airflow]: Added a proxy to serve the static Airflow files, this will result in faster load times of web pages. All static files are served almost twice as fast
  • [Airflow]: Add support for automatically mounting Azure keyvault secrets in containers and spark jobs. For more details look here
  • [AWS]: Upgrade Karpenter from v0.31.1 to v0.32.2
  • [AWS]: Upgrade our conveyor-install-role-and-policies.template to version 1.0.1, we removed some rights for Karpenter that are now unnecessary in version v0.32.2
  • [Airflow]: Add support for multiple task_ids in the ConveyorExternalTaskSensor, for more details look here
  • [IDE]: Upgrade code-server to version v4.19.1
  • [Spark]: Released new images for Spark 3.3.3 public.ecr.aws/dataminded/spark-k8s-glue:v3.3.3-hadoop-3.3.5-v1 and public.ecr.aws/dataminded/spark-k8s-glue:v3.3.3-2.13-hadoop-3.3.5-v1
  • [DBT]: Released new dbt images:
    • public.ecr.aws/dataminded/dbt:v1.6.9
    • public.ecr.aws/dataminded/dbt:v1.7.3
  • [General]: When using on-demand nodes prefer recent instance generation over older generations
  • [General]: Added support for m7i, m7a, r7i and r7a instances

bugfixes

  • [IDE]: Bash autocompletion was removed from the IDE by an earlier upgrade and has been added again. Bash auto-complete now works for Conveyor in the IDE

1.12.0 (28-11-2023)

features

  • [Airflow]: Upgrade Airflow to 2.7.3
  • [UI]: The homepage has changed from Environments to Projects
  • [UI]: The UI is now available in Dutch as well as English
  • [UI]: You can now control the Gitpod URL in the integrations, this is useful when using your own installed Gitpod environment
  • [AWS]: Upgrade the VPC CNI addon from v1.13.4-eksbuild.1 to v1.14.1-eksbuild.1
  • [AWS]: Upgrade Secrets Store CSI driver from v1.3.4 to v1.4.0
  • [CLI]: Improve the debug output for build commands
  • [Docs]: The table of container images is now split in multiple tables for better ease of use
  • [Terraform]: Release Conveyor terraform provider 0.3.0

bugfixes

  • [General]: Restrict the allowed environment names
  • [Azure]: Downloading logs now also works on Azure
  • [Azure]: Fix a regression with IDEs on Azure where the Azure SDK cannot be installed
  • [Airflow]: Make sure conveyor run works for DAGs in a local timezone and with a cron schedule multiple times a day
  • [Airflow]: Set the correct message when retrieving the logs for an Airflow task that did not run
  • [CLI]: Fixed an issue when using conveyor run to run an image of a project your project depends on

1.11.23 (20-11-2023)

features

  • [CLI]: Add the image build to the json output of conveyor build
  • [UI]: Add the ability to download the logs of a task
  • [AWS]: Update multiple components on the EKS cluster:
    • Upgrade EBS CSI driver from v1.19.0 to v1.25.0
    • Upgrade kube proxy addon from v1.27.4-eksbuild.2 to v1.27.6-eksbuild.2
    • Upgrade vpc cni addon from v1.12.6-eksbuild.2 to v1.13.4-eksbuild.1
    • Upgrade aws-observability/aws-for-fluent-bit from 2.23.3 to 2.32.0
    • Upgrade cloudwatch agent from 1.247352.0b251908 to 1.300031.1b317
    • Upgrade k8s dns node cache from 1.22.16 to 1.22.27
  • [Azure]: Update multiple components on the AKS cluster:
    • Upgrade Cilium from 1.13.6 to 1.13.9
    • Upgrade fluent-bit from 1.8.15 to 1.9.10
    • Upgrade k8s dns node cache from 1.22.16 to 1.22.27

1.11.22 (13-11-2023)

features

  • [Spark]: We've released new images for Spark 3.5.0, containing support for Delta and Iceberg table formats.
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-hadoop-3.3.6-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-2.13-hadoop-3.3.6-v2
  • [General]: Upgrade pgbouncer to version 1.21.0

bugfixes

  • [UI]: Make sure the retention parameter for cloudwatch logs is taken into account when querying application runs.

1.11.21 (07-11-2023)

bugfixes

  • [AWS]: Increase the memory available for the secrets backend to resolve a second issue in secret mounting.
  • [AWS]: Reduce the inline policy used for building projects so as not to hit the PackedPolicyTooLarge error when using long project names.

1.11.20 (30-10-2023)

bugfixes

  • [IDE]: Fixed a race condition that could happen when suspending an IDE: the snapshotting of the IDE was started twice, resulting in the second snapshot being empty and a failure when starting up the IDE again.
  • [IDE]: Node startup for IDEs is now more robust after fixing two small issues.
  • [AWS]: Resolved an issue causing secret mounting to fail under high load.
  • [CLI]: Running conveyor build could remove the default-iam-role and other fields by accident from the project. Please make sure to update your CLI.

1.11.19 (24-10-2023)

features

  • [CLI]: Change the usage of the conveyor build --build-arg argument to be in line with how it is used in Docker.
  • [CLI]: Improve conveyor project validate-dags to also detect duplicated DAG ids.
  • [IDE]: Upgrade code-server version to v4.17.1
  • [Airflow]: Migrate to securecookie webserver backend, the database implementation has some advantages that are currently covered by the Conveyor authorization checks, and the securecookie implementation puts less pressure in the database. This change also speeds up the airflow webserver by quite a lot.
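
Detecting duplicated DAG ids, as conveyor project validate-dags now does, boils down to counting how often each id occurs. The sketch below shows that idea on a plain list of ids; the function name and the example ids are hypothetical, and the real command works on your project's DAG files rather than on a list.

```python
from collections import Counter


def find_duplicate_dag_ids(dag_ids):
    """Return the DAG ids that occur more than once, sorted alphabetically."""
    counts = Counter(dag_ids)
    return sorted(dag_id for dag_id, count in counts.items() if count > 1)


ids = ["ingest_orders", "daily_report", "ingest_orders", "cleanup"]
print(find_duplicate_dag_ids(ids))
```

Duplicated ids are worth catching early because two DAG files that declare the same id overwrite each other in Airflow.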

bugfixes

  • [CLI]: The stdout stream of conveyor build --output json is no longer polluted by other messages.
  • [Airflow]: Airflow proxying is working again when using dark mode.
  • [UI]: The support chat button will now hide itself when scrolling down, preventing overlap with other components.

1.11.18 (16-10-2023)

features

  • [UI]: Support viewing and updating the environment settings.
  • [UI]: Allow updating the default IDE configuration in the project settings.
  • [Airflow]: We released a new version of the types-conveyor package, which adds a couple of missing function parameters.
  • [AWS]: Karpenter has been upgraded to 0.31.1.

bugfixes

  • [UI]: Fix an issue where clearing the defaultIamIdentity does not work on the project settings page.
  • [General]: Fix an issue where we could get throttled by auth0 when checking user permissions.
  • [IDE]: Fix an issue where 2XLarge IDE nodes would not start up on Azure.

1.11.17 (09-10-2023)

caution

We introduced a breaking change in the conveyor deploy command when using the git hash flag. If you are using: conveyor deploy --env <some-env> --git-hash <some-git-hash>, we will now fail immediately if the git hash is dirty. The reason for this is that a dirty git hash may link to multiple conveyor builds, which can result in you deploying unwanted code.
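
The fail-fast behavior can be sketched as follows. This assumes that a dirty hash is recognizable by a -dirty suffix, which is the convention of git describe --dirty; how Conveyor itself marks dirty hashes is not stated here, so treat the marker and the function name as hypothetical.

```python
def check_deployable_hash(git_hash: str, dirty_marker: str = "-dirty") -> str:
    """Return the hash unchanged, or raise ValueError when it is marked as dirty.

    The "-dirty" marker is an assumption borrowed from `git describe --dirty`.
    """
    if git_hash.endswith(dirty_marker):
        raise ValueError(
            f"git hash {git_hash!r} is dirty and may match several builds; "
            "commit your changes and build again before deploying"
        )
    return git_hash


print(check_deployable_hash("3f2a9c1"))
try:
    check_deployable_hash("3f2a9c1-dirty")
except ValueError as error:
    print(error)
```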

features

  • [Spark]: Released initial Spark 3.5.0 images. Iceberg and Delta Lake unfortunately do not support Spark 3.5 yet, so these dependencies are not included for now.
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-hadoop-3.3.6-v1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.5.0-2.13-hadoop-3.3.6-v1
  • [Airflow]: You can now set the navbar color of Airflow on a Conveyor environment
  • [CLI]: Add support for deleting tags in the CLI

bugfixes

  • [Airflow]: Fix an issue when using Airflow in dark mode where some panels would fail to load.
  • [Airflow]: Cosmetic improvements when using dark mode.
  • [UI]: Fixed an issue where users could not be unlinked from projects/environments after being removed from AD.
  • [UI]: Fix an issue where deleting an environment resulted in weird behavior in the UI.
  • [UI]: Fix sorting of teams in the settings page.
  • [CLI]: Prevent deploying a project when using the git hash flag and passing a dirty hash. This fix contains a breaking change, see warning.

1.11.16 (26-09-2023)

bugfixes

  • [Airflow]: Downgrade the Airflow dependencies that were upgraded in 1.11.15, as the upgrade resulted in increased memory pressure for Airflow web in certain environments
  • [Airflow]: Fix an issue in the links between Conveyor and Airflow where they always landed on the Airflow home page

1.11.15 (26-09-2023)

features

  • [Docs]: The table describing our Docker images now includes a column for the OpenJDK version
  • [UI]: Move Instance type in the Create IDE Dialog from advanced to the main form. This removes the advanced form

bugfixes

  • [IDE]: Fix an issue where writing to the /tmp directory fails for IDE builds
  • [CLI]: DAG validation now works when using podman on Apple Silicon
  • [UI]: Filtering deployments by "deployed by" works correctly again

1.11.14 (18-09-2023)

features

  • [IDE]: Allow users to configure user-specific settings for their IDEs

bugfixes

  • [Airflow]: Fix issue causing unnecessary scheduler restarts

1.11.13 (12-09-2023)

features

  • [UI]: The application runs details page now shows the container image that was used
  • [IDE]: Reuse the manifest of previous IDE builds to speed up subsequent builds
  • [IDE]: Ensure that all environment variables are also visible in a Jupyter notebook environment
  • [Notebooks]: Upgrade the Spark installation used by notebooks from Spark 3.3 to Spark 3.4
  • [AWS]: Improve the mounting options of EFS, this will result in less downtime when AWS upgrades EFS in October
  • [Airflow]: Upgrade Python from 3.9 to 3.11

bugfixes

  • [UI]: The tag creation dialog works properly again
  • [Azure]: Spark event logs are now correctly expired on Azure
  • [Airflow]: The recycling rate of Airflow web workers has been increased to reduce the memory pressure. This change should result in fewer restarts

1.11.12 (12-09-2023)

Release skipped

1.11.11 (04-09-2023)

features

  • [Azure]: Upgrade AKS to 1.27.3
  • [AWS]: Upgrade EKS to 1.27
  • [UI]: Pinning of environments now also has a dedicated action button.
  • [Airflow]: Airflow scheduler and web now exactly match an mx.small conveyor instance, this means that a bit more resources are available

bugfixes

  • [General]: Fixed an issue with the permissions in our minimal installation role. This prevented proper deletion of an IDE when launched in a different account than the default Conveyor cluster.
  • [IDE]: Fix issue where Azure tokens could not be used in IDEs.
  • [Airflow]: Backport Airflow PR #33063 to fix the URLs generated by TaskInstances.
  • [General]: Improve default error handling of secret mount issues for cases that we do not explicitly handle.
  • [Spark]: Fix an issue where the eventlog file was not uploaded in certain cases.
  • [IDE]: Improve the robustness of the IDE snapshotting process

1.11.10 (28-08-2023)

features

  • [Airflow]: Upgrade to Airflow 2.6.3
  • [Terraform]: Release terraform provider 0.2.0
  • [General]: Improve performance of checking notebook access
  • [General]: Improve performance of checking Airflow access
  • [IDE]: Allow creating an IDE from the project list

bugfixes

  • [CLI]: The spark-history command will now wait for the server to finish loading before opening the browser window.
  • [IDE]: Handle failures when suspending an IDE in a more robust way.
  • [General]: Fix a race condition that could return a ProjectNotFound error for a valid DeleteProject request.
  • [Airflow]: The Airflow UI will now automatically refresh in case of temporary unavailability.

1.11.9 (21-08-2023)

features

  • [IDE]: Improved the performance of proxying the IDE for the user
  • [DBT]: The ConveyorDbtTaskFactory will now filter out ephemeral models and not add them as tasks
  • [DBT]: Released our 1.6.0 dbt image: public.ecr.aws/dataminded/dbt:v1.6.0. You can find the full list of supported software in our docs
  • [Airflow]: Added support for email alerting on failures of dags, you can read more here
  • [Airflow]: Upgrade apache-airflow-providers-slack to 7.2.3
  • [CLI]: Make Docker BuildKit the default when using a Docker version of 23.0.0 or higher, just like Docker does
  • [Azure]: Upgrade AKS to 1.26.6
  • [UI]: Pinning projects in the overview page now has a dedicated action button
  • [General]: Upgrade template to 1.4.0

bugfixes

  • [AWS]: Fixed an issue where searching on logs could fail if an input like | was used.
  • [Spark]: AWS_REGION and AWS_DEFAULT_REGION should now be properly set on Spark jobs.
  • [Airflow]: Fix an issue where Airflow crashed when displaying ConveyorExternalTaskSensor links
  • [UI]: Fix an issue where the legend on the metrics page was partially cut off

1.11.8 (08-08-2023)

bugfixes

  • [AWS]: Fixed an issue when building the AMI for IDEs

1.11.7 (07-08-2023)

Release skipped

features

  • [CLI]: Upgrade template to 1.3.2

bugfixes

  • [UI]: Fix an issue where the build ID was not correctly passed when creating a new IDE.
  • [UI]: Fixed an issue where the create IDE/Notebook button is in another location for an admin vs a non-admin
  • [IDE]: Fix an issue with IDE builds on Azure
  • [IDE]: Fix an issue with deleting IDEs on Azure

1.11.6 (01-08-2023)

features

  • [AWS]: Only allow a single metadata hop on aws ec2 instances, the limit used to be 2 for when kube2iam was still used. After its removal, the hop count can be reduced to 1, improving the security of the setup.
  • [Streaming]: Streaming applications now support default roles as well.
  • [CLI]: Selecting an IDE to resume or suspend now uses the name instead of the ID of the IDE in the selector, thus making it easier to select the right one
  • [Spark]: Release Spark 3.4.1 images, you might need to make some changes to your scala spark jobs, see docs here
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.4.1-hadoop-3.3.6-v1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.4.1-hadoop-3.3.6-v1
  • [IDE]: Automatically clone the configured git repo for a Project when opening an IDE for the project
  • [IDE]: Changed the username from coder to conveyor in the IDE
  • [UI]: Support creating IDEs from the UI

1.11.5 (24-07-2023)

features

  • [IDE]: Added the ability to set a default IDE config for a project through the CLI conveyor update project --default-ide-config command. This will later be used when people start their IDE from the UI, for now it is used as a fallback when no ide.yaml is included in the project.
  • [IDE]: Prefer 6th generation instances, in our testing 6th generation instances can be twice as fast as first generation instances when running tests in a pyspark project

bugfixes

  • [IDE]: Fix using a cloud identity when using IDEs

1.11.4 (19-07-2023)

features

  • [IDE]: Installed some extra packages to make the first experience smoother: build-essentials, python3.10-venv, python3-pip, python3.10-dev

bugfixes

  • [IDE]: Make sure packages that install man pages in /usr/share/man/man1 can do so. For example, openjdk-11-jre previously failed to install
  • [IDE]: Resuming after auto-suspend sometimes suspended the IDE again automatically. This is now fixed

1.11.3 (18-07-2023)

features

  • [IDE]: Added suspend and resume functionality for IDEs. IDEs will also automatically suspend after 60 minutes of inactivity to save costs
  • [IDE]: IDEs are upgraded from Ubuntu 20.04 to Ubuntu 22.04
  • [CLI]: Update conveyor templates to 1.3.1

bugfixes

  • [IDE]: Fixed an issue with pulling the base image when building a custom IDE

1.11.2 (10-07-2023)

features

  • [Terraform]: We've released v0.1.5 of our Terraform provider.
  • [Airflow]: The Conveyor Airflow operators now support using default IAM identities configured on the project level. You can find more information on the default IAM identity in the project documentation.
  • [UI]: It is now possible to promote deployments to a different environment using the web interface.
  • [IDE]: AKS nodes will be pre-warmed to reduce the startup time of IDEs on Azure.

bugfixes

  • [RBAC]: We now assign the correct permissions to 'Contributor' users for them to view logs and metrics of streaming applications.
  • [IDE]: The git routine used by IDEs has been made more robust so that the generated git clone command is always valid.

1.11.1 (10-07-2023)

Release skipped

1.11.0 (05-07-2023)

features

  • [IDE]: This release contains a preview version of IDE support on Conveyor. To get started, check out our two how-to-guides:
  • [AWS]: Karpenter has been upgraded to version 0.28.1

bugfixes

  • [Notebooks]: Make sure that starting a notebook from the CLI works when messages are processed with a delay.
  • [Notebooks]: Fix an issue during the creation of a notebook which was not persisted in the notebooks.yaml.
  • [CLI]: Airflow images will now be pulled using the correct credentials, avoiding rate-limiting errors.

1.10.18 (12-06-2023)

bugfixes

  • [General]: Make sure there can be no overlapping names on kubernetes for the spark operator, there was a very small chance we used overlapping names for certain resources, these are now guaranteed to be unique.
  • [Airflow]: Fixed a bug in the ConveyorDbtTaskFactory where a model could have no dependency on start when a previous model was filtered out because of tag filtering.
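
To illustrate the dependency issue fixed in the ConveyorDbtTaskFactory, the sketch below builds task dependencies for a set of models after filtering on a tag. Any selected model whose upstream models were all filtered out is attached to the start task. The data layout, function name, and model names are hypothetical, and the real factory may wire dependencies differently.

```python
def build_task_dependencies(models, tag, start="start"):
    """Map each selected model to its upstream tasks.

    models: dict of model name -> {"depends_on": [...], "tags": [...]}
    A model with no selected upstream models depends on the start task.
    """
    selected = {name for name, spec in models.items() if tag in spec["tags"]}
    dependencies = {}
    for name in sorted(selected):
        upstream = [dep for dep in models[name]["depends_on"] if dep in selected]
        dependencies[name] = upstream or [start]
    return dependencies


models = {
    "staging_orders": {"depends_on": [], "tags": ["hourly"]},
    "orders": {"depends_on": ["staging_orders"], "tags": ["daily"]},
    "report": {"depends_on": ["orders"], "tags": ["daily"]},
}
print(build_task_dependencies(models, "daily"))
```

Here orders loses its only upstream model to the tag filter, so it is linked to start instead of being left without any dependency.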

1.10.17 (05-06-2023)

features

  • [AWS]: Upgraded eks to 1.26
  • [AKS]: Upgrade aks to 1.26.3
  • [General]: Improve logging of internal components to reduce the costs on Azure

1.10.16 (31-05-2023)

bugfixes

  • [Azure]: Make sure conveyor build and conveyor run work again on Azure

1.10.15 (31-05-2023)

bugfixes

  • [CLI]: Make sure conveyor update downloads the latest CLI instead of 1.10.2
  • [General]: Fix referencing the same secret from aws ssm parameter store multiple times

1.10.14 (31-05-2023)

bugfixes

  • [Notebooks]: Fix an issue with websockets used in notebooks

1.10.13 (30-05-2023)

features

  • [General]: We've updated the machine images that are used on the cluster, resulting in even faster start-up times for your jobs.

bugfixes

  • [Airflow]: Revert the changes made in 1.10.4 regarding the DataHub integration. The 'cluster' parameter will now default to "prod" again.
  • [General]: Make sure you can use the same secret/ssm parameter multiple times in environment variables.
  • [Spark]: Make sure to set the aws-region for Glue automatically such that we do not rely on ec2 instance metadata.
  • [CLI]: Make sure we take quiet flag into account during conveyor build
  • [CLI]: On a default installation of docker desktop the daemon socket is not at unix:///var/run/docker.sock but at another location in the home directory. We now check that one first
  • [CLI]: Improved the conveyor run connection, sometimes it timed out and we restarted it. We added keep alive messages on the connection to make sure it isn't closed
  • [General]: Make cleaning up old builds more robust, this should result in fewer old builds staying around
  • [Airflow]: Fix an issue with Airflow scheduler liveness probe.

1.10.12 (09-05-2023)

experimental

  • [CLI]: We offer a new way to run dbt commands on Conveyor, directly from your command line. As this is an experimental feature, all feedback is welcome! Please refer to the documentation page for more information on the feature.

bugfixes

  • [CLI]: Change location for temporary folders in conveyor project commands. In CI, we use local working directory and otherwise the default temp folder for your OS.
  • [Spark]: Our latest Spark images contain a fix for reading tables that are partitioned by date from AWS Glue. For more info, have a look here.
  • [Airflow]: Fix an issue where Airflow workers are not scheduled on on-demand nodes for Spark jobs.
  • [Airflow]: Backport PR https://github.com/apache/airflow/pull/31128 to Conveyor Airflow.
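
The choice of temporary folder location described above can be sketched like this. Detecting CI through a CI environment variable is an assumption (many CI systems set it, but this note does not say how Conveyor detects CI), and the function name is hypothetical.

```python
import os
import tempfile


def temp_root(environ=None):
    """Return the directory under which temporary folders are created.

    In CI (assumed to be signalled by a non-empty CI variable) the local
    working directory is used, otherwise the default temp folder of the OS.
    """
    environ = os.environ if environ is None else environ
    if environ.get("CI"):
        return os.getcwd()
    return tempfile.gettempdir()


print(temp_root({"CI": "true"}))
print(temp_root({}))
```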

1.10.11 (02-05-2023)

features

  • [General]: Upgraded Azure AKS to 1.25, this forces the use of cgroupsv2 on Azure. Certain JVM versions do not detect memory correctly, for Spark jobs this is no issue since we set the memory correctly. However, Java jobs using the ConveyorContainerOperatorV2 need to be upgraded to use JDK 11 (patch 11.0.16 and later) or JDK 15 and above.
  • [General]: Upgrade AWS EKS to 1.25
  • [Airflow]: Upgrade Airflow to 2.5.3
  • [General]: Release DBT image 1.5.0
  • [CLI]: Upgrade to use conveyor templates 1.3.0
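
If you want to check whether a Java job is affected by the cgroupsv2 change, the version rule from the AKS entry above can be written down as a small helper. This is only a sketch of that rule (JDK 11 from patch 11.0.16, or JDK 15 and above); the function name is hypothetical and it expects plain dotted version numbers such as 11.0.16.

```python
def jdk_detects_cgroups_v2(version: str) -> bool:
    """Apply the rule from the release note to a dotted JDK version string."""
    parts = [int(part) for part in version.split(".")]
    major = parts[0]
    if major >= 15:
        return True
    if major == 11:
        # Pad short versions such as "11" to three components before comparing.
        padded = tuple((parts + [0, 0])[:3])
        return padded >= (11, 0, 16)
    return False


for candidate in ["11.0.15", "11.0.16", "14.0.2", "17.0.2"]:
    print(candidate, jdk_detects_cgroups_v2(candidate))
```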

bugfixes

  • [General]: Fix an issue where a notebook can get stuck in pending due to a bug in the EBS CSI driver.
  • [UI]: Improve the performance when fetching application runs.
  • [UI]: Make sure that the default IAM role can be rendered for a project contributor.
  • [UI]: The behavior of the notebook creation dialog was adapted to match your expectations.
  • [UI]: Toast messages which report errors are now rendered correctly.

1.10.10 (25-04-2023)

features

  • [Azure]: Upgrade AKS to 1.24.9
  • [UI]: We've made a minor visual update to our breadcrumbs.
  • [Terraform]: release terraform provider version 0.1.4

bugfixes

  • [CLI]: Fix the root cause for the file already closed message while building a notebook

1.10.9 (24-04-2023)

bugfixes

  • [Airflow]: Added a friendly error message when passing invalid values into the cmds argument of the ConveyorContainerOperatorV2.
  • [General]: Fixed a race condition that popped up when running lots of tasks that use secrets from AWS Parameter Store or AWS Secrets Manager.

1.10.8 (19-04-2023)

bugfixes

  • [Airflow]: Fix an issue with the external task sensor
  • [Terraform]: Add examples for tags in the terraform provider

1.10.7 (19-04-2023)

features

  • [CLI]: Add support for team commands on an environment.

preview

  • [Airflow]: We've released type hints that you can use when developing your Airflow DAGs on Conveyor. You can refer to our documentation page for the installation instructions and more information.

bugfixes

  • [CLI]: We now make sure that container image pulls and runs triggered by Conveyor use the correct platform configuration.

1.10.6 (12-04-2023)

bugfixes

  • [UI]: We now show an informative message if a page is not found instead of a blank screen.
  • [CLI]: Fix an issue where conveyor build failed when using Podman.
  • [General]: Change cloudwatch metrics configuration to not use ec2 instance metadata.

1.10.5 (04-04-2023)

features

  • [Spark]: We've built a new Spark 3.3.2 image that only packages 1 version of netty. For more details you can refer to our technical-reference.
  • [Spark]: Additionally, the new Spark 3.3.2 image comes with Hadoop 3.3.5, which contains several CVE fixes as well as an important fix for using Azure blob storage. More details can be found in our technical-reference as well. Azure users that use Hadoop 3.3.2 or 3.3.4 are recommended to switch to Hadoop 3.3.5.
  • [Spark]: When using Spark on Azure with Hadoop 3.3.5, you can now use the ManifestCommitter as an alternative to the FileCommitter. More information on this committer can be found here.
  • [CLI]: Update conveyor deploy to show a message on how to rollback to the previous deployment
  • [CLI]: Add functionality to conveyor deploy for deploying a project based on a git hash as an alternative to the build id

bugfixes

  • [General]: Make sure AWS_REGION is also defined when users set AWS_DEFAULT_REGION, as the region cannot be retrieved from EC2 instance metadata when kube2iam is disabled.
  • [CLI]: Fix an issue where the validate-dags functionality does not work on GitLab CI anymore.
  • [UI]: Some visual edge-cases in the tag creation flow have been taken care of.
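
The AWS_REGION fix above amounts to mirroring one environment variable into another. A minimal sketch of that idea, assuming the behaviour is "copy AWS_DEFAULT_REGION into AWS_REGION when the latter is missing"; the helper name is hypothetical and this is not Conveyor's actual implementation.

```python
def with_aws_region(env):
    """Return a copy of env in which AWS_REGION mirrors AWS_DEFAULT_REGION if unset."""
    result = dict(env)
    default_region = result.get("AWS_DEFAULT_REGION")
    # Assumption: an explicitly set AWS_REGION is left untouched.
    if default_region and "AWS_REGION" not in result:
        result["AWS_REGION"] = default_region
    return result


if __name__ == "__main__":
    print(with_aws_region({"AWS_DEFAULT_REGION": "eu-west-1"}))
```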

1.10.4 (29-03-2023)

features

  • [CLI]: Support dependent projects in Conveyor build/run/validate-dags command. More details can be found here
  • [Notebooks]: It is now possible to define a default IAM identity for your projects. This default role can be automatically used when creating a new notebook. More information on how to configure the default identity can be found in the CLI docs.
  • [Docs]: Time for spring-cleaning! We made some modifications to our documentation to make sure everything is clear and tidy.
  • [Airflow]: Add support for XComs in the ConveyorContainerOperatorV2.

bugfixes

  • [Airflow]: When enabling the DataHub integration, the 'cluster' parameter is now set to the name of your environment, instead of defaulting to "prod".
  • [Notebooks]: The integrated notebooks view again scales properly to your page height.
  • [Notebooks]: Fixed an issue where we asked to download files when deleting notebooks in ide mode.

1.10.3 (22-03-2023)

features

  • [Airflow]: Upgraded Airflow to 2.5.2
  • [UI]: Allow filtering on sensor applications in the application runs pages for environment and project

bugfixes

  • [CLI]: Improve cancellation handling in the Conveyor run command
  • [Airflow]: Fix the Conveyor application runs button for sensors, this button used to filter out the application runs incorrectly and showed nothing as a result

1.10.2 (15-03-2023)

features

  • [Docs]: Added documentation on how to improve dbt startup latency, you can find the documentation here.
  • [Airflow]: Add support for grouping tasks into taskGroups with the ConveyorDbtTaskFactory, more details here
  • [Terraform]: Released the conveyor terraform provider 0.1.0, with extra data sources concerning users of projects, environments and teams

bugfixes

  • [General]: Added extra validation on environment variable keys. Previously the job would hang when a key was invalid; now we fail it.
  • [Airflow]: When generating Kubernetes objects from the Conveyor operator, we sometimes produced invalid names. This is fixed now.

1.10.1 (08-03-2023)

features

  • [Notebooks]: The number of driver cores will be automatically configured depending on the instance size
  • [CLI]: The behaviour of the ide-connect cli command is now consistent with the other notebook commands
  • [UI]: You can now create and attach tags to your projects for easier filtering
  • [UI]: The notebook creation dialog now hides optional configuration options by default
  • [General]: On AWS we optimised the way we run EC2 machines for all jobs. We now collocate the Airflow scheduler and web pods, which should result in less Airflow downtime and a more optimised cluster. For jobs smaller than xlarge, we now keep the VM alive for up to 5 minutes to accept new jobs. This should result in less VM churn, which lowers the EC2 startup overhead and, if AWS Config is enabled, the AWS Config costs.
  • [Templates]: Upgraded templates to 1.2.1.

bugfixes

  • [RBAC]: Fix an issue where the GetDefaultCluster call fails for non admin users
  • [General]: Improve the error handling when the IAM role cannot be assumed by your container/spark application
  • [UI]: Fix an issue where the collapsed sidebar icons were not clickable
  • [UI]: Fix an issue where the login can get stuck

1.10.0 (27-02-2023)

features

  • [Spark]: Create new spark images for Spark 3.3.2, more details can be found here
  • [General]: AWS: Improved node startup latency. By upgrading the EKS CNI to 1.12.2 and configuring it correctly, the average startup latency of a node went down from 60s to 50s.
  • [Airflow]: Added support for configuring certain parameters of Airflow. At the moment we support 2 parameters in the core section parallelism and max_active_tasks_per_dag. You can find more information on when to configure these here.
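
To get a feel for how the two Airflow settings mentioned above interact: in Airflow, parallelism caps the number of task instances running at once across the scheduler, while max_active_tasks_per_dag caps them per DAG. The sketch below is a simplified model of that interaction (it ignores pools, per-task concurrency and other limits); the function and argument names are illustrative only.

```python
def startable_tasks(queued, running_in_dag, running_total,
                    parallelism, max_active_tasks_per_dag):
    """Estimate how many queued tasks of one DAG could start right now.

    Simplified model: a task can start only if there is headroom under both
    the per-DAG limit and the global limit.
    """
    dag_headroom = max(0, max_active_tasks_per_dag - running_in_dag)
    global_headroom = max(0, parallelism - running_total)
    return min(queued, dag_headroom, global_headroom)


if __name__ == "__main__":
    # 20 queued tasks, 10 already running in this DAG, 30 running overall.
    print(startable_tasks(queued=20, running_in_dag=10, running_total=30,
                          parallelism=32, max_active_tasks_per_dag=16))
```

In this model, raising max_active_tasks_per_dag alone does not help when the global parallelism limit is the one being hit, and vice versa.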

1.9.0 (21-02-2023)

features

  • [General]: Added support for sending Airflow lineage to DataHub, read our how-to to get started!
  • [Azure]: Update the VM's used for user nodepools to the Dsv5-series instead of the Dsv3-series
  • [Terraform]: Release the Conveyor terraform provider 0.0.8 with support for configuring the Airflow DataHub integration

bugfixes

  • [CLI]: Fix an issue where creating a project from a template uses the wrong project name in some cases
  • [Notebooks]: Fix an issue where ide notebooks did not get deleted after they were idle for too long

1.8.9 (16-02-2023)

features

  • [Docs]: Small improvements to the RBAC documentation

bugfixes

  • [General]: Fix prepush of images on Azure when using images from public ecr
  • [Airflow]: Fix an issue where Airflow web restart fails due to pid file

1.8.8 (15-02-2023)

note

Skipped due to database performance issues.

1.8.7 (09-02-2023)

bugfixes

  • [Airflow]: Fixed an issue with scheduling on-demand jobs on Conveyor. After this we will reset all jobs currently stuck in scheduling

1.8.6 (09-02-2023)

features

  • [Notebooks]: Add support for updating files of a notebook running in IDE mode
  • [Airflow]: Upgrade the database used to use gp3 storage, which should give the same or better performance for a lower price
  • [Airflow]: Support attaching external storage to container pods. This is useful for container jobs that need a lot of temporary storage.

bugfixes

  • [Airflow]: Fixed an issue that made retries stop working when Spark submit fails

1.8.5 (06-02-2023)

features

  • [UI]: Display a clear error message when there are no logs because the retention period is expired

bugfixes

  • [Notebooks]: Fix an issue where deleting a notebook by name deleted the wrong notebook
  • [General]: Fixed an issue in the onboarding flow, where the role could not be properly registered

1.8.4 (02-02-2023)

bugfixes

  • [UI]: Fixed a bug where the memory metrics were not showing correctly for the Spark driver or container application
  • [Airflow]: Fixed an issue that made retries stop working with the Conveyor Airflow Operators

1.8.3 (01-02-2023)

features

  • [Airflow]: In Airflow we now show the last 100 log lines of your failed application. We also add a link that takes you straight to the logs in Conveyor, which you can copy and paste into your web browser.
  • [Airflow]: Update datahub integration to 0.9.6
  • [Airflow]: Added the ability to override the start and end task in the ConveyorDbtTaskFactory
  • [Templates]: Upgraded templates to 1.2.0.

bugfixes

  • [CLI]: Rework the conveyor completion command to work without internet connectivity, such that you can source it in a terminal.
  • [Airflow]: Fixed a bug where the Conveyor Application Runs button would not work when a user created a manual run with a self supplied run_id.
  • [UI]: show correct spark executor instance lifecycle and type in details table.
  • [Airflow]: Fix an issue where the application run button does not generate the correct link for new projects
  • [UI]: The spark executor metrics legend sometimes did not match the actual executor of which the metrics are shown. We now make sure the legend matches the correct metric
  • [UI]: Fix broken redirect when pressing the logs button on the task executions page

1.8.2 (19-01-2023)

bugfixes

  • [UI]: Fix an issue with the sorting of executors
  • [Spark]: Fix an issue where Karpenter is allowed to remove nodes with a Spark submitter pod

1.8.1 (18-01-2023)

features

  • [Docs]: Improved the documentation of the ConveyorDbtTaskFactory in regard to start and stop task

bugfixes

  • [UI]: Fix an issue where the filtered string was not taken into account when switching between pods
  • [UI]: Fix an issue where navigating from execution details to executor log is broken

1.8.0 (18-01-2023)

note

In this release we reworked the task execution pages. When clicking on a row in the task executions page, you now go to the details of that run, showing more info on the different pods of the Spark/container/sensor job. By clicking on a row in the details page, you will go to the logs of that specific pod (driver, submitter or executor). If you immediately want to go from the task executions page to the logs of the Spark driver or container pod, as was the default behavior in previous releases, you should now use the logs button in the Actions column.

features

  • [Airflow]: Do not restart Airflow web on redeploys as it is not needed anymore. We have now made sure the Conveyor Application Runs button still works
  • [UI]: Rework the task executions in the UI to add a tab that displays detailed information for a job next to the logs and metrics. This tab shows the start date, duration, instance type as well as the failure reason for all pods of a given job. This is useful for Spark to get a quick overview on why executors are failing.
  • [UI]: allow filtering when choosing an executor/driver on the task execution logs page.
  • [UI]: add tooltip with resources available when displaying the instance types.

bugfixes

  • [Spark]: A spark application used to hang when setting "spark.driver.memoryOverheadFactor" or "spark.executor.memoryOverheadFactor" for a spark version < 3.3.0, now we accept but ignore the value just like Spark would
  • [Logs]: Fixed an issue where sometimes the logs of short running jobs would not be stored. We fixed the startup issue of the log aggregation service so now logs should always be available.
  • [UI]: Show specific error message when a spark executor has no logs instead of showing generic error message.
  • [Spark]: Fix an issue with the spark history command to allow cross account access to the artifacts bucket
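
The memoryOverheadFactor fix above follows the fact that these two settings only exist from Spark 3.3.0 onwards. A minimal sketch of "accept but ignore" for older versions; the helper name is hypothetical and this is not the actual Conveyor code.

```python
OVERHEAD_FACTOR_KEYS = {
    "spark.driver.memoryOverheadFactor",
    "spark.executor.memoryOverheadFactor",
}


def effective_spark_conf(spark_version, conf):
    """Drop the overhead factor settings for Spark versions below 3.3.0."""
    major_minor = tuple(int(p) for p in spark_version.split(".")[:2])
    if major_minor >= (3, 3):
        return dict(conf)
    # Older Spark versions do not know these keys, so they are silently ignored.
    return {k: v for k, v in conf.items() if k not in OVERHEAD_FACTOR_KEYS}


if __name__ == "__main__":
    conf = {"spark.driver.memoryOverheadFactor": "0.2", "spark.executor.cores": "2"}
    print(effective_spark_conf("3.2.1", conf))
    print(effective_spark_conf("3.3.0", conf))
```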

1.7.9 (10-01-2023)

bugfixes

  • [Airflow]: Revert the change to not restart Airflow on redeploys, the change made the Conveyor Application Runs button fail

1.7.8 (10-01-2023)

features

  • [Templates]: Mount a template git repo to the cookiecutter container instead of a folder within the repo
  • [Airflow]: Do not restart Airflow web on redeploys as it is not needed anymore
  • [Airflow]: Change ConveyorExternalTaskSensor to request taskInstances from the db in order to fix 404 errors on Airflow web restarts

bugfixes

  • [RBAC]: Fix an issue where a non-admin user could not list Streaming Applications
  • [Airflow]: Fix an issue where restarting the scheduler leads to temporary missing dags in the web UI.

1.7.7 (09-01-2023)

features

  • [Airflow]: Log the node on which the Airflow worker is running, this makes it easier when needing to debug node outages

1.7.6 (04-01-2023)

bugfixes

  • [Airflow]: Fix an issue with the Airflow dag-syncer init container going OOM
  • [General]: Added a fix to logs and metric fetching on Azure

1.7.5 (03-01-2023)

bugfixes

  • [RBAC]: Fix an issue with RBAC when fetching the list of task executions

features

  • [Spark]: Add support for Iceberg in the latest images
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.1-hadoop-3.3.4-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.1-2.13-hadoop-3.3.4-v2

1.7.4 (02-01-2023)

features

  • [General]: Upgrade to eks 1.24
  • [General]: Upgrade to aks 1.24
  • [General]: On Azure run the CNI controller on the default node
  • [Spark]: Add the ability to see the spark executor and submitter logs in the UI for both batch and streaming
  • [Notebooks]: Upgrade images to use ubuntu 22.04 as a base
  • [Notebooks]: Remove support for python 3.6

1.7.3 (27-12-2022)

features

  • [Notebooks]: Use Spark 3.3.1 in notebook images

bugfixes

  • [Spark]: Fix an issue in Spark controller where eventlog directory was deleted before driver was terminated.

1.7.2 (21-12-2022)

features

  • [CLI]: Support updating the project description through the project update --description <content> command
  • [Airflow]: Allow operatorLinks to be inherited

bugfixes

  • [General]: Reduce the time and improve robustness when deploying Conveyor projects
  • [UI]: Made the log fetching in the UI more robust for AWS. This should make certain use cases where AWS returned nothing work again.
  • [Spark]: Fix an issue when using spark on-demand executors
  • [Costs]: Fix an issue in calculating costs for new g5 and g4dn instances
  • [UI]: Fix an issue with displaying top x project/environment costs

1.7.1 (14-12-2022)

features

  • [Airflow]: Upgrade Airflow to 2.4.3
  • [Airflow]: Upgrade datahub packages to 0.9.3.1

1.7.0 (05-12-2022)

info

🎉 We are happy to announce that Notebooks, one year after the initial introduction, are now out of preview 🎉 .

Over the past year we improved the Notebooks feature in terms of stability as well as user experience. From now on the API is stable and we will ensure that all changes are backwards compatible.

features

  • [General]: Remove the unused Airflow role
  • [UI]: Improved the task execution page for a spark job running in local mode. We do not try to show spark executor metrics now, and we show the mode in the overview
  • [CLI]: Add support for Podman next to Docker
  • [Spark]: Released spark 3.2.3 images:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.3-2.13-hadoop-3.3.4-v1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.3-2.13-hadoop-3.3.4-v1
  • [General]: Switch the EFS volume used for Spark event log, and Airflow logs from bursting to elastic (the new recommended mode)

bugfixes

  • [UI]: Do not allow the deletion of admin users, make sure they are removed from the administrators first.
  • [UI]: Fix an issue where inviting a user would sometimes fail because of a wrongly generated password
  • [UI]: Fix an issue where filtering on streaming applications would not work
  • [UI]: Fix an issue where the create environment modal was behind the guided tour
  • [UI]: Improved log readability. Previously, multiple consecutive whitespaces were collapsed into one, so a line printed with several spaces between foo and bar would show as: foo bar. This reduced readability when printing tables aligned with whitespace.
  • [General]: Fixed an issue where spark applications using multiple R-instance executors would not launch
  • [General]: Fixed an issue with zone limitations when using R-instances
  • [CLI]: Make sure conveyor run loads the Airflow backports
  • [Costs]: Fixed a bug in cost calculation for Spark, this has a low impact on the cost calculation

1.6.0 (22-11-2022)

note

In this release we add support for R instances on AWS. These instances use Karpenter (a new autoscaler) for the cluster which should result in faster node startup. It is a new component, with which we have less experience, so we will be monitoring it closely. If you notice any issues when using these instances, do not hesitate to contact Conveyor support.

features

  • [General]: Add support for R instances on AWS. These instances have a higher memory to cpu ratio than the M instances on AWS. They are useful when loading in large data sets with limited computation. They are currently only supported on AWS.
  • [General]: Use the new price-capacity-optimized allocation strategy. This is a new strategy that uses the spot pools which are least likely to be interrupted and selects the lowest price from these spot pools. It is the new recommended strategy by AWS. It should result in cost savings compared to the previously recommended capacity-optimized strategy. For more info you can look at the release blog by AWS
  • [Docs]: Added documentation on how the executor_disk_size setting may improve performance.
  • [General]: Optimised the log storage on AWS, this should reduce the cloudwatch logging bill
  • [General]: Release terraform provider 0.0.7
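
The idea behind the price-capacity-optimized strategy described above can be illustrated with a toy model: first narrow down to the pools least likely to be interrupted, then take the cheapest of those. This is only a sketch of the description in the note. AWS does not publish its exact algorithm, and the interruption_risk score, the risk_tolerance parameter and the pool names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotPool:
    name: str
    interruption_risk: float  # hypothetical score, lower means less likely to be interrupted
    price_per_hour: float


def pick_pool(pools, risk_tolerance=0.05):
    """Pick the cheapest pool among those closest to the lowest interruption risk."""
    if not pools:
        raise ValueError("at least one spot pool is required")
    lowest_risk = min(p.interruption_risk for p in pools)
    candidates = [p for p in pools
                  if p.interruption_risk <= lowest_risk + risk_tolerance]
    return min(candidates, key=lambda p: p.price_per_hour)


if __name__ == "__main__":
    pools = [
        SpotPool("pool-a", interruption_risk=0.05, price_per_hour=0.080),
        SpotPool("pool-b", interruption_risk=0.07, price_per_hour=0.070),
        SpotPool("pool-c", interruption_risk=0.30, price_per_hour=0.050),
    ]
    print(pick_pool(pools).name)
```

A purely price-based choice would pick pool-c here, while a purely capacity-based choice would pick pool-a; the combined approach lands on pool-b.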

bugfixes

  • [Spark]: Handle Spark applications that launch multiple spark contexts gracefully and do not upload their event logs
  • [UI]: Fix an issue with filtering the logs of streaming applications
  • [UI]: Make cancel button for Spark tasks clickable without using horizontal scrollbar
  • [Airflow]: Fixed a bug where the Airflow scheduler could get into a faulty state when scheduling dynamic tasks.

1.5.15 (16-11-2022)

features

  • [Airflow]: Remove db cleanup job now it has executed successfully

1.5.14 (16-11-2022)

features

  • [General]: Expose extra Airflow metrics to monitor Airflow environments more thoroughly

bugfixes

  • [UI]: Improve the spark history server dialog

1.5.13 (15-11-2022)

bugfixes

  • [Airflow]: ConveyorContainerOperatorV2 arguments allow None values to be passed
  • [Airflow]: ConveyorSparkSubmitOperatorV2 application_args allow None values to be passed

1.5.12 (15-11-2022)

bugfixes

  • [General]: Fix an issue with internal Airflow metrics

1.5.11 (15-11-2022)

features

  • [UI]: In the task executions list, change the default match from exact to fuzzy
  • [Airflow]: In the ConveyorContainerOperatorV2 we now raise an exception if the arguments passed are not all of type string
  • [Airflow]: In the ConveyorSparkSubmitOperatorV2 we now raise an exception if the application_args passed are not all of type string
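
The string-only check described for the arguments and application_args parameters can be pictured as follows. This is a minimal sketch of such a validation; the function name and the choice of TypeError are assumptions, since the notes do not state which exception the operators raise.

```python
def validate_string_args(args, name="arguments"):
    """Raise if any element of args is not a string, otherwise return a list copy."""
    bad = [(i, type(a).__name__) for i, a in enumerate(args)
           if not isinstance(a, str)]
    if bad:
        details = ", ".join(f"index {i} is {t}" for i, t in bad)
        raise TypeError(f"{name} must all be strings: {details}")
    return list(args)


if __name__ == "__main__":
    print(validate_string_args(["--date", "2022-11-15"]))
    try:
        validate_string_args(["--retries", 3])
    except TypeError as exc:
        print(exc)
```

In a DAG, the usual fix is to convert values explicitly, for example passing str(3) instead of 3.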

bugfixes

  • [Airflow]: Fix incorrectly stored on-demand task instances affected by bugfix

1.5.10 (14-11-2022)

features

  • [Spark]: Rework spark-history support, now you can create a spark-history server locally instead of sharing one per client, which caused many issues

bugfixes

  • [Airflow]: Load the backport of the Airflow bugfix earlier so that the scheduler picks it up

1.5.9 (09-11-2022)

bugfixes

  • [General]: Fixed an issue with deleting environments, where certain undeploys were processed multiple times. This resulted in long waiting times to delete environments
  • [Airflow]: Backported an Airflow bugfix which impacted on-demand tasks

1.5.8 (09-11-2022)

features

  • [Airflow]: Upgrade to Airflow 2.3.4
  • [Spark]: Release spark 3.3.1 images, more details in the docs
  • [UI]: Show progress when deleting/updating an environment in the environment events
  • [Airflow]: Airflow version 1 support has been removed

bugfixes

  • [General]: Fix an issue with cleaning up temporary files while deleting an environment
  • [CLI]: conveyor notebook start had an issue where it asked for your notebook twice, this was fixed
  • [CLI]: conveyor notebook commands start, stop, open now only show notebooks you own
  • [General]: Fix autoscaling from zero for our autoscaling groups, there was an issue with notebooks not properly detecting the Availability zone of the volume

1.5.7 (27-10-2022)

features

  • [General]: Deployments are now attributed to the user who triggers them
  • [Docs]: Added documentation on the needed IAM rights to use secrets in the operators
  • [UI]: The guided tour can now be stopped and resumed even after refreshing the application
  • [UI]: Admin users are now displayed in the user list page
  • [UI]: Allow multiple users to follow the guided tour without interfering with each other

bugfixes

  • [CLI]: Let the exit code of the conveyor run command mimic the exit code of the Airflow task
  • [UI]: Previously used logins are now always kept in lowercase
  • [General]: Fixed a bug where old Spark applications were not cleaned up properly. This could block the deletion of an environment.

1.5.6 (18-10-2022)

features

  • [Airflow]: Added support for dynamic tasks in conveyor run. For more information go here
  • [Spark]: Release support for Spark 3.2.2
  • [CLI]: Added support for updating the airflow instance lifecycle (on-demand, spot) via the CLI. Before you could only supply it during creation.
  • [CLI]: Added conveyor completion commands to the docs
  • [General]: AWS Upgrade EKS to version 1.23
  • [General]: Azure upgrade AKS to version 1.23.12
  • [General]: AWS and Azure decrease logging costs for fluent-bit
  • [General]: Azure decrease logging cost, we do not store the logs of certain verbose Azure managed components
  • [Notebooks]: When opening a terminal, the working directory is your conveyor project
  • [Notebooks]: Install ssh by default

bugfixes

  • [UI]: Fixed the project link not working on the task execution detail page
  • [UI]: Fix the option to partially or fully select a task in the task executions page
  • [UI]: Fix an issue where multiple people could not create a notebook with the same name
  • [CLI]: Do not validate the setup.py and requirements.txt when using your own Dockerfile for notebooks
  • [General]: Fix a bug in the airflow scheduler liveness probe
  • [Spark]: Fixed a bug where the spark history server link would point to a wrong file name in certain rare cases, this could happen when a spark job was shut down unexpectedly
  • [Notebooks]: In a certain edge case a notebook delete would not do anything, this has been fixed

1.5.5 (06-10-2022)

bugfixes

  • [UI]: Fix a bug that made the executions page fail to load

1.5.4 (06-10-2022)

features

  • [General]: Added a Conveyor tag to all aws resources
  • [General]: Increase the responsiveness of the kubernetes cluster autoscaler after a failure

bugfixes

  • [Airflow]: Improved the liveness check of Airflow to use standard airflow code, this should result in less scheduler restarts

1.5.3 (21-09-2022)

bugfixes

  • [CLI]: The previous release introduced a bug in the promotion mechanism which is now fixed.
  • [UI]: Prevent accidental clicks on project/environment pins

1.5.2 (20-09-2022)

features

  • [UI]: You can now use Conveyor in Dark Mode, including the Airflow and Notebooks UIs.
  • [UI]: Added an option to list all past deployments of a project, with their build ids and git commit hashes, not only the active one.

bugfixes

  • [Airflow]: Fixed an issue with dbt factory when models contain the word model.
  • [UI]: Sorting projects on last activity was not working correctly.
  • [Streaming/RBAC]: Fixed an issue with a missing RBAC permission to validate a Streaming application.

1.5.1 (13-09-2022)

features

  • [Airflow]: Added function to add conveyor executions URL to alerting, see documentation here
  • [Spark]: Backport hadoop 3.3.4 for older Spark images. Full details on the images are available at docs
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.4-v8
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.4-v8
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.4-v3
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.4-v7

bugfixes

  • [CLI]: fix an issue where the git repo detection was not correctly filled in on a project.
  • [Notebooks]: downloading notebooks was mistakenly overwriting the local src folder with the content of the notebooks folder.
  • [CLI]: use a different command for detecting the current git commit hash such that it also works for repos with annotated tags.
  • [General]: fix an issue with cost calculation such that it works again.
  • [UI]: recent and pinned projects/environments were not properly removed when the corresponding project/environment is deleted.

1.5.0 (05-09-2022)

features

  • [CLI]: Add commands for teams and use them in tf provider
  • [Notebooks]: The notebook storage has been migrated from EFS to EBS. The migration will be handled automatically as soon as you have paused your notebook. The reason for this is that the EBS performance is more consistent, the downside is you have to specify a size up front, the default is 10Gi.
  • [Notebooks]: We now not only persist your code and notebooks but also your venv. So if in a notebook you install new packages these will be persisted across restarts. We do this by persisting the folder /home/jovyan/work.
  • [Notebooks]: Opening a terminal in the notebook has your virtual environment automatically activated (only for newly started notebooks).
  • [Notebooks]: Added the plugin jupyterlab-system-monitor by default to show the current memory usage of your notebook
  • [Notebooks]: Added start/play button to the notebook details overview.
  • [Notebooks]: The details overview can now always be opened even if the notebook is not ready or stopped.
  • [Notebooks]: In the list view, the action open notebook UI now always opens in a new tab
  • [Notebooks]: Added documentation on how to install jupyter extensions for notebooks.
  • [CLI]: When running conveyor notebook start we now open it automatically when it has been started
  • [CLI]: Added the conveyor notebook open command to open your notebook in a browser
  • [CLI]: conveyor notebook create gives more feedback to the user on what is going on
  • [CLI]: Added icons to indicate that certain steps have finished when using conveyor notebook create and conveyor run
  • [UI]: When creating a notebook, we now open it automatically
  • [UI]: Rework how logs are shown in the UI. Make the UX simpler and allow searching logs across all pages.
  • [Spark]: Include delta 2.1.0 in spark 3.3 images
  • [UI]: Task executions can now be filtered by clicking on the corresponding environment/project/task/dag buttons from the table columns
  • [UI]: Filtering on task executions columns now matches exactly the filter, unless you prefix it with ~
  • [UI]: Added shortcut buttons for recently visited environments and projects
  • [UI]: You can now pin projects and environments so that they show up on top of the lists
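
The exact-versus-prefixed filter behaviour for task executions described above can be sketched as a small matcher. The notes only say that a filter matches exactly unless it is prefixed with ~; treating the ~ form as a case-insensitive substring match is an assumption made here for illustration, and the function name is hypothetical.

```python
def matches(filter_text, value):
    """Exact match by default; a leading '~' switches to a looser match.

    Assumption: the looser match is a case-insensitive substring test.
    """
    if filter_text.startswith("~"):
        return filter_text[1:].lower() in value.lower()
    return filter_text == value


if __name__ == "__main__":
    tasks = ["ingest", "ingest_daily", "Ingest_hourly", "export"]
    print([t for t in tasks if matches("ingest", t)])
    print([t for t in tasks if matches("~ingest", t)])
```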

bugfixes

  • [General]: Inviting a user could sometimes result in an email with a non-working url. We have fixed the encoding so inviting someone should always work.
  • [RBAC]: Users were not properly removed from RBAC when being deleted. It is now fixed.
  • [RBAC]: When RBAC is disabled show all the settings pages

1.4.2 (23-08-2022)

features

  • [Airflow]: Support running Airflow scheduler/web on on-demand nodes to ensure that they are always available, which might be useful for production environments. For more info
  • [Spark]: Added spark images for hadoop 3.3.4
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-hadoop-3.3.4-v1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-2.13-hadoop-3.3.4-v1
  • [UI]: All users can now be deleted by admins, including SSO users. If an SSO user gets deleted, he will simply be recreated on the next login. However, his previous permissions will be lost and will need to be set again.
  • [Templates]: Upgrading to the latest version of the templates 1.1.0
  • [UI]: Include chat option in UI to contact the support team
  • [UI]: Add readonly demo environment for potential users
  • [Templates]: Use strict uuid pattern matching in the resources. (@stijndehaes)
  • [Templates]: Upgrade the resource folder assume role policies to use the service account. (@stijndehaes)
  • [Templates]: Upgrade spark images to our latest releases. (@stijndehaes)
  • [Templates]: Upgrade to dbt 1.1.0 (@pascal-knapen)
  • [Templates]: Add gitpod and codespaces configuration (@pascal-knapen)

bugfixes

  • [Azure]: Fixed an issue with calculating the memory of pods on Azure
  • [General]: Fixed detection of the command or entrypoint of a container not working. Also added documentation on how to debug the issue

1.4.1 (09-08-2022)

features

1.4.0 (01-08-2022)

features

  • [General]: We improved the security of containers running on Kubernetes where possible. Our own Kubernetes management applications and Airflow run with the following settings:
    • We enable a read-only root file system where possible, which makes it harder for attackers to exploit the root file system
    • We run the containers as a non-root user, which makes privilege escalation harder
    • We disable privilege escalation, which prevents privilege escalation to the node
    • We disable the automount of service account tokens on pods that do not need it, so an attacker cannot obtain an authorization token for the Kubernetes API
    We run user containers with the following extra settings:
    • We disable privilege escalation
    • We disable the automount of service account tokens on pods that do not need it
    In the future we might look into enabling more of these settings, or allow you to define them for your containers.
  • [UI]: The embedded Airflow view now persists the navigation to its various pages in the conveyor url. This makes it easier to share the url with others, or revisit it yourself.
  • [UI]: You can now easily navigate from the Task Execution list and details pages to the Airflow DAGs and Tasks pages
  • [RBAC]: The error message that is returned when an action is not authorized is now more descriptive.
  • [General]: The reset password email was still mentioning Datafy. It is now changed to Conveyor.
  • [Docs]: Added a contact and support page
  • [UI]: We now show the guided tour to first-time users by default
  • [Airflow]: Decrease the load on NFS by Airflow by decreasing the dag processing logging to Error, and only creating the files when needed
  • [Airflow]: Added docs on assume cross account iam roles for operators.
  • [General]: Fixed an issue with the user invitation flow where an invalid invitation link was sometimes generated
  • [UI]: You can now view the AWS IAM role or the Azure Application Client Id used by task executions in the Task Execution details view.

bugfixes

  • [UI]: Fixed an issue with filtering task executions on multiple values of status, type, ...
  • [CLI]: Fix an issue where the cli does not listen to your keystrokes after logging in, e.g. when creating a notebook

1.3.12 (26-07-2022)

features

  • [General]: On Azure we upgraded kubernetes to 1.22.11

bugfixes

  • [General]: Lower the usage of EFS by airflow executors, EFS was a critical component for airflow. Because of issues from the past we are lowering our dependency on it, thus increasing availability.
  • [General]: Fixed a memory leak by upgrading one of the libraries we use

1.3.11 (19-07-2022)

features

  • [General]: On aws we enable image tag immutability on the ECR repositories we manage

bugfixes

  • [General]: On Azure we are seeing networking timeouts when pods start up. This is very problematic for the Airflow scheduler as it never recovers. We now restart it when we detect the issue. We are also in contact with Azure support to find the root cause.

1.3.10 (14-07-2022)

bugfixes

  • [UI]: Fixed an issue with the project pages not properly loading

1.3.9 (14-07-2022)

bugfixes

  • [UI]: Fixed a race condition when logging in

1.3.8 (13-07-2022)

features

  • [UI]: Added buttons to open a project in Gitpod or GitHub Codespaces

bugfixes

  • [General]: Make the algorithm to detect different memory configurations from 1.3.7 more robust. We do not accept changes that are considered too low; this way we are more tolerant of faulty configurations.
  • [Airflow]: Remove some dangling tables after airflow 2.3 migration

1.3.7 (12-07-2022)

bugfixes

  • [General]: Not all AWS instances get the same amount of memory. In one instance group we see heavy fluctuations. The difference between one m5.4xlarge and another can be several hundred MB. This can result in an issue when calculating the memory for your container. We now autocorrect these values.
  • [Notebooks]: Fixed an issue where downloading notebook files would fail if the old datafy.cloud domain was configured
  • [Docs]: Add mx.nano instance type back to our ConveyorContainerOperatorV2 documentation.

1.3.6 (11-07-2022)

caution

The notebook API has been updated in a non-backwards-compatible way. Please upgrade to the latest CLI.

features

  • [Notebooks]: Reworked our notebook API to be more consistent

bugfixes

  • [Airflow]: Fixed an issue in fetching the spark submit logs from airflow
  • [Notebooks]: Fixed an issue with notebooks where notebooks would be invisible when the SSO connection creates usernames with capitals
  • [CLI]: Fixed conveyor run execution date detection when no schedule is set on a DAG
  • [Spark]: Fixed an issue where environment variables weren't propagated in local mode
  • [UI]: Fixed an issue where the embedded Airflow view would not render properly when the access token is expired
  • [UI]: Fixed an issue where a project git repo would not be updated after modifying it

1.3.5 (05-07-2022)

features

  • [Airflow]: Reduce logging of the Airflow scheduler and file processor to EFS.
  • [Airflow]: Change how Airflow dag fetching works for the scheduler and web instance, this reduces load on EFS by 75%

bugfixes

  • [UI]: Fixed an issue where the UI would use the user email with capital letters

1.3.4 (30-06-2022)

bugfixes

  • [Airflow]: Prevent Airflow tasks from being marked as failed when an API 410 exception is thrown
  • [Airflow]: Lower the logging of Airflow to EFS

1.3.3 (30-06-2022)

features

  • [Spark]: Big Spark jobs (mx.xlarge, mx.2xlarge, mx.4xlarge and more than 1 executor) will now be scheduled in a single availability zone. When running on spot, we select the availability zone with the lowest chance of spot interrupts. This will reduce network costs and network overhead for Spark.

bugfixes

  • [Spark]: We are investigating an issue where the EFS volume used for the Spark event log is overloaded. We added a global Admin option for ourselves to disable Spark event log upload, which we can enable when we notice issues in an environment.
  • [Airflow]: Backported https://github.com/apache/airflow/pull/24478 to fix an issue with retrying old tasks in the UI
  • [Airflow]: Backported https://github.com/apache/airflow/pull/24117 to fix an issue with retrying old tasks in the UI

1.3.2 (29-06-2022)

features

  • [Docs]: Advocate the use of strict uuid pattern matching when assuming roles
  • [Airflow]: Upgrade to airflow 2.3.2
  • [Airflow]: Allow users to template num_executors in ConveyorSparkSubmitOperatorV2
  • [General]: Allow folders in dags and resources folder
  • [General]: Added warning about v1 operator deprecation into UI
  • [CLI]: Conveyor run now lets you select an environment interactively
  • [CLI]: Conveyor run now lets you select a DAG and a task interactively
  • [CLI]: Conveyor run now automatically uses the last execution date compatible with the DAG schedule if none is provided
  • [Spark]: Added support for Spark streaming on Azure
  • [Spark]: Added a spark local mode to the ConveyorSparkSubmitOperatorV2, see the docs for more info
  • [CLI]: Support passing additional build arguments to the container engine
  • [Spark]: Added section on improving performance when running Spark on Conveyor
  • [Costs]: Added a global overview per day
  • [Azure]: Initial version of Azure metrics available in UI
  • [Spark]: We now run big (more than 1 executor, and executor instance type mx.xlarge, mx.2xlarge, mx.4xlarge) batch spark applications in a single AZ by default. When using spot we use the aws spot placement score API to determine the best AZ when your spark application is launched. This improves the availability of the spark application, and reduces network costs and overhead.
  • [Spark]: Added spark 3.3.0 images:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-hadoop-3.3.1-v1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-2.13-hadoop-3.3.1-v1
  • [Azure]: Support enabling Microsoft Defender for Cloud on Azure
  • [Spark]: Released new images with reduced logging when using spark local mode:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v6
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.1-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v7
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v7
  • [Templates]: Use strict uuid pattern matching in the resources. (@stijndehaes)
  • [Templates]: Upgrade the resource folder assume role policies to use the service account. (@stijndehaes)
  • [Templates]: Upgrade spark images to our latest releases. (@stijndehaes)

bugfixes

  • [UI]: Pressing ENTER when filtering columns was not working
  • [UI]: Add executor info to spark detail page
  • [UI]: Show all states for an environment in the UI, such that users can see what is going on when it is being deleted
  • [UI]: Fixed an issue where inviting a user would result in Airflow UIs not loading until the user logged out and in again
  • [CLI]: Fixed an issue where logger wouldn't respect being set to quiet
  • [CLI]: Fixed an issue with deleting notebooks
  • [General]: Update the EBS CSI driver so it doesn't run out of memory when many pods are being scheduled. This improves reliability when using the Spark option executor_disk_size
  • [General]: Fixed a bug where the metrics would show double the CPU usage on AWS
  • [CLI]: Do not print the m2m token when logging in

1.3.1 (27-06-2022)

bugfixes

  • [Azure]: Correct daemonset overhead calculation to include the azure-cni component after switching away from calico

1.3.0 (20-06-2022)

features

1.2.5 (14-06-2022)

bugfixes

  • [Airflow]: Fixed a bug where removing tags from a dag would make it fail to load
  • [UI]: Fixed the link to the git repo in deployments, and on the project view
  • [CLI]: Do not show version out of date warning when using conveyor update command

1.2.4 (13-06-2022)

bugfixes

  • [Notebooks]: Update spark image used in notebooks to version: 3.2.1-hadoop-3.3.1-v6
  • [UI]: Fixed an issue where the tour wouldn't let you move past step 7

1.2.3 (09-06-2022)

bugfixes

  • [General]: Fixed an issue where the new single availability zone option could result in jobs being slow to launch

1.2.2 (08-06-2022)

features

  • [General]: Added more instance types to our autoscaling groups. This will help to get the most stability out of spot instances on AWS
  • [General]: Made the single availability zone option more robust for on-demand. When the preferred instance type is unavailable it will move on to the next preferred type in a list

bugfixes

  • [General]: Fixed an issue where the new single availability zone option would always use on-demand instances

1.2.1 (08-06-2022)

features

  • [UI]: Add optional tracking for analytics purposes
  • [Airflow]: Resubmit Spark application when spot termination is detected during submission
  • [Spark]: Allow users to select an availability zone for their Spark application using ConveyorSparkSubmitOperatorV2

bugfixes

  • [UI]: Fix links to docs page
  • [UI]: Fixed an issue on Azure where big logs would fail to load
  • [Spark]: Handle an extra case as spot interruption instead of a regular spark submit failure

1.2.0 (07-06-2022)

features

  • [Spark]: Added support for Spark decommissioning, which helps you avoid losing data when an executor gets a spot interrupt. Before the spot interrupt goes through, Spark will try to send all intermediate results to other executors, saving time and money for the job. The feature is only supported in our latest Spark 3.2 images:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v6
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v6

bugfixes

  • [Spark]: Created a new spark image which fixes a bug in MsalTokenprovider for using spark on Azure

1.1.6 (31-05-2022)

bugfixes

  • [General]: Use the API server address as master on Azure instead of the internal Kubernetes service.
  • [Airflow]: Fix an issue where manual airflow runs were not filtered correctly in the task executions

1.1.5 (31-05-2022)

bugfixes

  • [CLI]: Fix the conveyor update progressbar
  • [General]: Make sure the cluster-autoscaler can handle the taint set by spot termination: aws-node-termination-handler/spot-itn
  • [Spark]: Upload the spark eventlog under the correct name, this makes the spark history server work again
  • [UI]: Fixed a typo on the admin page where it said projects instead of environments

1.1.4 (30-05-2022)

bugfixes

  • [Airflow]: Lower the Airflow usage of EFS by changing min_file_process_interval from 10 to 30 and dag_dir_list_interval from 60 to 300.
  • [General]: Added some improvements to the code that will lower EFS usage
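
As a rough illustration of the two scheduler settings mentioned above: in a self-managed Airflow installation, configuration options can be overridden with environment variables named AIRFLOW__{SECTION}__{KEY}. The sketch below builds those names for the new values. It assumes both options live in the [scheduler] section (as in Airflow 2.x at the time); the helper name is hypothetical, and Conveyor manages these settings for you, so this is for illustration only.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Values from the release note above (both in seconds):
# - min_file_process_interval: minimum time between re-parsing a DAG file
# - dag_dir_list_interval: time between scans of the DAGs folder for new files
scheduler_overrides = {
    "min_file_process_interval": 30,
    "dag_dir_list_interval": 300,
}

env = {
    airflow_env_var("scheduler", key): str(value)
    for key, value in scheduler_overrides.items()
}

for name, value in env.items():
    print(f"{name}={value}")
```

Higher intervals mean DAG files are read from the shared file system less often, at the cost of new or changed DAGs taking longer to show up.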

1.1.3 (27-05-2022)

bugfixes

  • [UI]: Fix filtering on task executions started_at field
  • [General]: Allow more parallel processing in our operator. This reduces waiting time when a lot of spark/container jobs are launched

1.1.2 (27-05-2022)

bugfixes

  • [General]: We only keep failed container CRDs around for 30 minutes instead of 3 days. They piled up and took up too many resources.

1.1.1 (27-05-2022)

features

  • [UI]: Remember previously used email in login screen
  • [General]: Implement cleanup of old project builds for Azure

bugfixes

  • [Airflow]: Catch 502 and 504 errors in External Task Sensor
  • [General]: Fix an issue where project deletion on Azure failed
  • [General]: Fixed an issue where an environment that failed to create could get into an unrecoverable state
  • [Notebooks]: Cleanup unused notebook images from ACR
  • [Spark]: Fix the spark eventlog upload failing

1.1.0 (24-05-2022)

features

  • [CLI]: Change the CLI to prefer environment variables with the CONVEYOR prefix
  • [CLI]: Update the upgrade-dags command to also rename imports and classes in dags from Datafy to Conveyor
  • [CLI]: Move the ~/.datafy profiles directory to ~/.conveyor
  • [CLI]: Add warnings when running conveyor build if the dags still use Datafy instead of Conveyor.
  • [General]: Display the node id in the UI as well as in Airflow when the node got spot terminated.
  • [Templates]: The github repository for templates has been renamed to conveyor-templates
  • [Notebooks]: The working directory for notebooks has been renamed from datafy_project to conveyor_project. This might cause loss of data for existing notebooks.
  • [Projects]: Projects now get their configuration from the ./conveyor directory, with fallback to the ./datafy directory

bugfixes

  • [CLI]: Conveyor run now creates the date interval just like a scheduled Airflow run; previously it behaved like a manual Airflow run
  • [Spark]: Wait to upload the event log until the Spark application has finished; previously there were cases where an upload happened before the Spark application was shut down
  • [Airflow]: Allow primitive types as env_vars and convert them to strings
  • [Airflow]: Handle spot interrupts in ConveyorContainerSensor and ConveyorExternalTaskSensor tasks with reschedule mode by rescheduling them on another node
  • [Airflow]: Handle spot interrupts in all Sensor tasks which use mode reschedule by rescheduling them instead of crashing
  • [Airflow]: ConveyorExternalTaskSensor can now also watch for manually scheduled runs
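
The env_vars fix in the list above can be pictured with a small sketch. This is not the operator's actual code: the function name is hypothetical and the exact string rendering (for example of booleans) is an assumption based on Python's built-in str().

```python
from typing import Dict, Union

Primitive = Union[str, int, float, bool]


def stringify_env_vars(env_vars: Dict[str, Primitive]) -> Dict[str, str]:
    """Return a copy of env_vars with every value converted to a string.

    Container environment variables are always strings, so primitive
    values need to be converted before they are passed on.
    """
    return {name: str(value) for name, value in env_vars.items()}


converted = stringify_env_vars(
    {"RETRIES": 3, "RATIO": 0.5, "DEBUG": True, "ENV": "dev"}
)
print(converted)
```

With a conversion like this, a DAG author can write RETRIES=3 rather than RETRIES="3" when defining a task.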

1.0.2 (20-05-2022)

features

  • [Airflow]: Increase parallelism from 64 to 128
  • [UI]: Improve the navigation breadcrumbs and the page icons

1.0.1 (19-05-2022)

bugfixes

  • [CLI]: Fix issue with datafy update renaming the cli binary to conveyor

1.0.0 (19-05-2022)

features

  • [General]: Rename Datafy to Conveyor
  • [CLI]: Cleanup command to delete managed docker images
  • [CLI]: The datafy CLI executable will be automatically renamed to conveyor when using datafy update

bugfixes

  • [UI]: Fix a bug with embedded Airflow view auto-resizing

0.63.3 (10-05-2022)

features

  • [Spark]: Integrate azure libraries in our standard spark image.
  • [Templates]: Use new spark image which supports both azure and aws

bugfixes

  • [Notebooks]: Configure the notebook to work on azure
  • [Notebooks]: Fix notebook configuration to include azure specific properties and jars
  • [Notebooks]: Fixed a bug where the memory of the spark context was not changed according to the instance size
  • [Notebooks]: Fixed a bug where files were not persisted when notebook was created from the UI

0.63.2 (05-05-2022)

bugfixes

  • [UI]: Fix an issue to get current user roles when using SSO

0.63.1 (05-05-2022)

features

  • [Airflow]: Removed the possibility to create Airflow v1 environments. Airflow v1 environments have been deprecated for a long time and will be phased out in 2022. Airflow 1.x no longer receives community support and relies on very old libraries that are becoming less and less secure to use.

bugfixes

  • [UI]: Use username to get user roles in UI

0.63.0 (05-05-2022)

features

  • [CLI]: Pass environment variables to airflow for dag validation or using the run command
  • [UI]: Added the m2m tokens in the Conveyor UI Settings page
  • [General]: Add support for detecting spot termination on Azure
  • [Templates]: Make the templates work for both azure and aws
  • [Templates]: Use newest spark version: 3.2.1 in the templates
  • [Templates]: Update python versions such that they work with Apple Clang 13+

bugfixes

  • [UI]: Fixed an issue with SSO users showing up with weird names in the list, this is only relevant for installations starting from 2022
  • [Airflow]: Fixed an issue with cleaning up old airflow logs
  • [Airflow]: Make airflow workers more robust against connection reset errors while watching kubernetes pods
  • [UI]: Show the azure application client id for notebooks.
  • [General]: Increased timeout on Azure when deleting container repositories.

0.62.7 (27-04-2022)

features

  • [Spark]: Added a new failure mode for Spark batch applications: if the application has lost more than 5 times the number of executors requested, the application will fail. The chances of such an application ever finishing are very low, and it would continue to take up resources in the cluster otherwise.
  • [Airflow]: When an Airflow executor shuts down unexpectedly we check if this was because of a spot interrupt. If that is the case we put a message in the logs. This should make debugging issues easier
  • [General]: Add a new failure mode for batch applications: the kubelet does not have enough resources (CPU, memory, pods) even though the scheduler assigned the pod to the node, causing the pod to fail.
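
The Spark executor-loss rule in the list above comes down to a simple comparison. A minimal sketch, assuming the counts of lost and requested executors are available as integers; the function and parameter names are hypothetical, not the operator's real API.

```python
def exceeds_executor_loss_limit(
    lost_executors: int, requested_executors: int, factor: int = 5
) -> bool:
    """Return True when more than `factor` times the requested executors were lost."""
    return lost_executors > factor * requested_executors


# An application that requested 2 executors tolerates up to 10 lost executors.
print(exceeds_executor_loss_limit(lost_executors=10, requested_executors=2))
print(exceeds_executor_loss_limit(lost_executors=11, requested_executors=2))
```

The comparison is strict ("more than 5 times"), so losing exactly 5 times the requested number does not yet fail the application in this sketch.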

bugfixes

  • [UI]: Fixed an issue with SSO users not showing up in the user list, this is only relevant for installations in 2022
  • [General]: Fixed an issue where updates of spark application with more than 600 executors would not be updated in our UI
  • [CLI]: Fixed datafy notebook download being broken
  • [Docs]: Fixed broken link to connect-ide docs

0.62.6 (21-04-2022)

features

  • [CLI]: Add support for Azure DevOps git when using templates to create a project
  • [CLI]: Made the datafy project stop-run command more useful: it can now handle multiple matches and allows you to stop runs in batch
  • [CLI]: Conveyor run now checks before starting if there are other manual runs with the same properties in the environment. If there are it will ask if you want to clean these up first. This will stop manual runs from piling up.

bugfixes

  • [UI]: Fixed an issue where the cancel run button wouldn't work, but just redirect to the job logs
  • [Airflow]: Fixed an issue where the datafy application runs button would not filter on the environment
  • [Airflow]: The ConveyorExternalTaskSensor would fail if the Airflow instance was unavailable (for example because of a spot interrupt). Now we gracefully retry the sensor on the next poke
  • [General]: Fix a bug where new users wouldn't be able to register when using SSO
  • [CLI]: Fix a bug for datafy notebook commands delete, start, stop, download where filling in only the name or environment flag would not filter properly on these flags

0.62.5 (15-04-2022)

bugfixes

  • [Airflow]: Make airflow workers more robust against glitches in kubernetes instead of failing immediately
  • [UI]: Return a 404 instead of a 500 when requesting non-existing logs, so that the UI does not handle it as an error.
  • [Notebook]: Fix errors in our notebook operator when using cross account clusters

features

  • [Airflow]: Support instance_life_cycle option for dbt tasks
  • [Airflow]: Support instance_life_cycle option for airflow sensors

0.62.4 (13-04-2022)

features

  • [General]: Upgrade AWS EKS version to 1.22

0.62.3 (08-04-2022)

features

  • [General]: Upgrade AWS for Fluent Bit to version 2.23.3
  • [General]: Upgrade the AWS load balancer controller in preparation for the EKS 1.22 upgrade

0.62.2 (07-04-2022)

features

  • [Airflow]: Upgrade the following dependencies for Airflow 2: Airflow to 2.2.5, apache-airflow-providers-apache-spark to 2.1.3, apache-airflow-providers-cncf-kubernetes to 2.2.0, apache-airflow-providers-slack to 4.2.3, acryl-datahub to 0.8.31.6, boto3 to 1.21.32
  • [General]: When scheduling an application, we no longer fail at the first ImagePullBackOff in Kubernetes; three failure events are now required. This makes the operator more robust to temporary network failures.
  • [UI]: On the costs page, add the selected cost range to the URL. This makes it easier to share URLs with other people
  • [UI]: On the streaming application pages, add the selected filter to the URL. This makes it easier to share URLs with other people
  • [CLI]: Added the command datafy project generate-config, which will generate the .datafy/project.yaml file for a project. This is useful when you forgot to check it into git or when you use the terraform provider.
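
The ImagePullBackOff tolerance described in the list above can be sketched as a counter over pod event reasons. This is an illustration under assumptions: the function name is hypothetical, the events are represented as plain reason strings, and the count is cumulative (the release note does not say whether the real operator ever resets it).

```python
from typing import Iterable


def should_fail_on_image_pull(events: Iterable[str], threshold: int = 3) -> bool:
    """Return True once `threshold` ImagePullBackOff events have been seen."""
    failures = 0
    for reason in events:
        if reason == "ImagePullBackOff":
            failures += 1
            if failures >= threshold:
                return True
    return False


# One transient failure followed by a successful pull: keep waiting.
print(should_fail_on_image_pull(["ImagePullBackOff", "Pulled", "Started"]))
# Three failures in total: give up and mark the application as failed.
print(should_fail_on_image_pull(["ImagePullBackOff"] * 3))
```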

0.62.1 (04-04-2022)

features

  • [CLI]: Add a login fallback for when the automatic CLI login does not work or is not supported.
  • [UI]: If your login is expired, and you go to your Airflow URL, we now redirect you to your Airflow page again after logging in.

bugfixes

  • [UI]: Go to the correct landing page after logging in from an invitation link
  • [Docs]: Small cleanup in the pyspark and spark tutorial

0.62.0 (29-03-2022)

features

  • [General]: Run the on-demand instances autoscaling group as a mixed instance fleet. That way we can handle a certain instance type not being available on AWS
  • [Spark]: Released new Spark images that add a new log4j-executor.properties file that reduces logging for Spark executors. This results in CloudWatch cost savings. The new images are:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v3
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v3
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v5
  • [UI]: Added type email to our login email field, so that browsers and password managers recognize it better
  • [UI]: Simplify create environment modal when there is only 1 cluster

bugfixes

  • [General]: Fixed an issue when migrating our datafy config file
  • [CLI]: Fix an issue where deleting notebooks failed
  • [UI]: Fix login flow from CLI

0.61.7 (18-03-2022)

bugfixes

  • [UI]: Revert runtime-config

0.61.6 (18-03-2022)

bugfixes

  • [UI]: Correct runtime-config for production

0.61.5 (17-03-2022)

bugfixes

  • [General]: Add cluster endpoints to management API

0.61.4 (16-03-2022)

bugfixes

  • [CLI]: Fix cicd flow when passing environment variables

0.61.3 (16-03-2022)

bugfixes

  • [General]: Fix listing users
  • [CLI]: Fix cicd flow

0.61.2 (16-03-2022)

bugfixes

  • [General]: Make sure the Conveyor team can access the tenants

0.61.1 (16-03-2022)

bugfixes

  • [Docs]: Regenerated docs for template release 0.15.5
  • [UI]: Fix IDP login for dataminded users

0.61.0 (16-03-2022)

features

  • [UI]: Revamped login flow
  • [Templates]: Update the spark settings so the aws glue integration works again
  • [Templates]: Use dots instead of underscores for specifying the conveyor_instance_type. (@nclaeys)

0.60.1 (10-03-2022)

bugfixes

  • [Spark Streaming]: Fix missing applications in the UI.

0.60.0 (08-03-2022)

features

  • [UI]: Add button for administrators to invite new users
  • [Airflow]: Make conveyor_instance_type specification consistent by using dots everywhere
  • [Spark Streaming]: Added an alerting option to spark streaming support, for more info see here
  • [General]: Upgrade aws ebs csi driver to 2.6.3
  • [Airflow]: Upgrade Airflow 2 to 2.2.4

bugfixes

  • [CLI]: Remove debug message when copy image fails due to access denied
  • [UI]: Fix filtering on environment and schedule in task executions page
  • [UI]: Refresh page of streaming application that is in state pending every 10 seconds
  • [CLI]: Refactor the result of datafy project list-users and datafy environment list-users to not include /User/ string

0.59.4 (18-02-2022)

bugfixes

  • [Airflow]: Do not set tcp_keepalive when using airflow v1 as it does not exist.
  • [CLI]: Fix datafy update for Apple Silicon Macs
  • [General]: Change the spark version used by the spark history server to not have issues with verifying the S3 ssl certificates

0.59.3 (16-02-2022)

bugfixes

  • [Notebooks]: Use the correct images when launching a notebook from the UI

0.59.2 (16-02-2022)

features

  • [UI]: Added instance type and lifecycle to task execution details page
  • [UI]: Show deletion protection status in environments page
  • [UI]: Added delete button in environments page
  • [UI]: Added button to create new environments
  • [UI]: Added button to create notebooks
  • [General]: We run our agent on the cluster instead of on ECS
  • [Spark]: Release spark 3.2.1 images
  • [Docs]: Added documentation about spark hive integration issues.
  • [Docs]: Migrated the documentation to Docusaurus, this should allow us to make the documentation more user-friendly
  • [Notebooks]: Do not copy the virtual environment to nfs but only the project related files in order to speed up notebook creation

0.59.1 (03-02-2022)

bugfixes

  • [CLI]: Make sure the docker client used by Conveyor also uses the typical Docker environment variables
  • [CLI]: Allow uploading DAG files of up to 16MB, up from 1MB. Also fail if a larger file is detected instead of printing a warning
  • [UI]: Fix an error that was appearing on the first page load
  • [UI]: Fix a problem in the admin user panel where a project could not be added to a user
  • [UI]: Fix a scrolling issue in the embedded Airflow page
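
The DAG file size limit mentioned in the list above can be sketched as a check that rejects oversized files instead of warning about them. This is a hypothetical illustration, not the CLI's code: the function names are made up, file sizes are passed in as a mapping, and 16MB is assumed to mean 16 * 1024 * 1024 bytes.

```python
from typing import Dict, List

# Assumption: "16MB" is interpreted as 16 * 1024 * 1024 bytes.
MAX_DAG_FILE_BYTES = 16 * 1024 * 1024


def oversized_dag_files(
    sizes: Dict[str, int], limit: int = MAX_DAG_FILE_BYTES
) -> List[str]:
    """Return the names of all files whose size in bytes exceeds the limit."""
    return sorted(name for name, size in sizes.items() if size > limit)


def validate_dag_files(sizes: Dict[str, int]) -> None:
    """Raise an error, rather than print a warning, when a file is too large."""
    too_large = oversized_dag_files(sizes)
    if too_large:
        raise ValueError(f"DAG files larger than 16MB: {', '.join(too_large)}")


sizes = {"small_dag.py": 4_096, "huge_dag.py": 17 * 1024 * 1024}
print(oversized_dag_files(sizes))
```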

0.59.0 (31-01-2022)

features

  • [Notebooks]: Added support for notebooks persistence. This means notebooks can now be stopped and started using the CLI and the UI.
  • [General]: Cleanup unused aws secrets manager secrets
  • [General]: Improved the pg-bouncer SSL setup to the RDS server by validating the RDS CA. The RDS now only accepts encrypted connections
  • [CLI]: Added the same docker build flags for notebook create that are supported in project build
  • [CLI]: Support for podman as container manager

bugfixes

  • [General]: Fix 2 small issues with the configuration of notebook properties
  • [Airflow]: Update the connections used by Airflow 2 to be sourced from the environment. This should fix the issues where a temporarily unavailable connection led to a job failure
  • [RBAC]: Fixed an issue where a project admin could not manage users on a project

0.58.1 (10-01-2022)

bugfixes

  • [CLI]: Fixed an issue where building a project would not work
  • [PySpark]: Update pyspark images such that setuptools (>=60.0.0) also installs global python packages in the correct directory for debian.
  • [Template]: Upgrade templates to 0.15.4

0.58.0 (07-01-2022)

features

  • [General]: Release the preview of the costs feature.
  • [General]: Pre cache notebook and project base images to speed up uploading images the first time.
  • [General]: Use m6i instances instead of m5 when launching new nodes as they are more cost efficient

bugfixes

  • [UI]: Fixed a bug where a job duration wasn't updated, and it also refused to show the metrics because of that.
  • [General]: Fixed an issue where trying to use a secret from AWS without an IAM role would take 30m to fail.
  • [CLI]: Fixed an issue where starting a new notebook wouldn't work

0.57.1 (06-01-2022)

bugfixes

  • [CLI]: Fixed a bug where starting a notebook could scramble the order in your notebooks.yaml if you had multiple definitions
  • [Spark]: Released new Spark images that fix a vulnerability with log4j 1.x. For more information, see our documentation.

0.57.0 (03-01-2022)

features

  • [CLI]: Rename all "new" CLI commands to "create" to consistently use verbs for CLI commands (the "new" commands still work as aliases of the "create" commands for backwards compatibility)
  • [UI]: Added Git hash and repo link to task execution detail view
  • [UI]: Added "Trigger" column to the task execution page to distinguish tasks triggered by Airflow or via datafy run
  • [UI]: Display the task operator version in the task execution page
  • [UI]: Add a button to cancel a task execution
  • [UI]: Add option to wrap long lines in task logs view

0.56.5 (20-12-2021)

features

  • [General]: Add a command to stop a project run, for when your terminal gets detached.
  • [General]: Add gcc and g++ libraries to the notebook base image and extend the notebook documentation to describe how to install pyodbc.

bugfixes

  • [General]: Make sure a failed applicationEvent is sent when cancelling a manual project run for both Spark and container tasks.
  • [General]: Fix a crash-looping notebook when mounting secrets from SSM or Secrets Manager.

0.56.4 (13-12-2021)

features

  • [CLI]: The CLI binary is now available for Apple Silicon.
  • [UI]: All page headers are now collapsible

bugfixes

  • [UI]: Fix redirect after login and logout

0.56.3 (09-12-2021)

features

  • [Airflow]: Indicate whether airflow workers are killed due to spot termination (container, spark, container_sensor)

bugfixes

  • [General]: When using V2 Operators we now enable the STS regional endpoints by default. This removes the dependency on the us-east-1 region when using your IAM roles, and is recommended by AWS

0.56.2 (07-12-2021)

bugfixes

  • [Streaming]: Fixed an issue where the spark application could not be created if the name of the application was too long
  • [Notebooks]: Use the same service account pattern as projects running from airflow and streaming
  • [Airflow]: ConveyorSparkSubmitOperatorV2 tasks with a . in the name could not be started properly
  • [General]: Performance improvements when processing task executions

0.56.1 (02-12-2021)

features

  • [Notebooks]: Added possibility to work on datafy notebooks from your IDE
  • [Docs]: Updated dbt tutorial using latest template version
  • [Airflow]: test tasks can be turned on/off when using the dbt task factory
  • [General]: Detect spot node interruptions and handle it as a specific failure for a container/spark job

bugfixes

  • [CLI]: Removed some excess logging statements when doing a login

0.56.0 (30-11-2021)

features

  • [CLI]: Automatically add the remote repo url in the project info at project build time when it was empty
  • [General]: Release the beta version of the notebooks feature
  • [General]: Map container start error to a pod failure state such that it can be shown in application runs
  • [UI]: Add route to root from Conveyor logo
  • [Template]: Upgrade templates to 0.15.3

bugfixes

  • [Airflow]: Remove trailing dot from conveyor_ui_domain variable. This ensures that the airflow variables: base_url, jwt_audience are correctly set.
  • [Airflow]: Support passing None values in env variables to conveyor_container_operator_v2
  • [General]: Fix deletion of streaming applications for projects that have an underscore in the name
  • [General]: Improve error message with serviceAccountName when assumeRole fails when fetching secrets
  • [UI]: Correctly show start time and finished time in application runs details when the container is still pending
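
To illustrate the trailing-dot fix for conveyor_ui_domain in the list above: a fully qualified DNS name may end in a dot, which is harmless for DNS but produces a different string when used to build URLs or token audiences. A minimal sketch; the domain value and the way base_url is composed are assumptions made for illustration.

```python
def strip_trailing_dot(domain: str) -> str:
    """Remove a single trailing dot (the DNS root label) from a domain name."""
    return domain[:-1] if domain.endswith(".") else domain


# Hypothetical value; the real conveyor_ui_domain depends on your installation.
conveyor_ui_domain = strip_trailing_dot("app.example.com.")

# Assumption: settings such as base_url are derived from the domain like this.
base_url = f"https://{conveyor_ui_domain}"
print(base_url)
```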

0.55.2 (22-11-2021)

bugfixes

  • [General]: Fixed deleting of users with the operator role from an environment

0.55.1 (19-11-2021)

bugfixes

  • [General]: Revert cleanup of project versions in SSM as it fails when there are more than 10 projects in the request

0.55.0 (19-11-2021)

features

  • [Airflow]: Upgrade Airflow 2 to version 2.2.2
  • [General]: Remove the use of terraform when creating/deleting environments. This makes the creation/deletion of environments faster
  • [CLI]: Remove irrelevant warning about encryption when using airflow validate dags

bugfixes

  • [General]: Fixed an issue with spark executor metrics not showing up
  • [General]: Fixed a bug with mounting SSM parameters or AWS Secrets Manager secrets as environment variables. If you mounted the same secret twice but with a different path, the application would never start
  • [Airflow]: Open Application Runs Airflow button in new tab again, the behaviour got changed by upgrading to Airflow 2.2.x, but it makes more sense to open in a new tab by default
  • [CLI]: Remove warning about validation being run with Airflow 2
  • [CLI]: When checking if the Dockerfile exists during datafy build we now take the project config docker path into account
  • [Templates]: Correct resources S3 template to use like instead of equals in trust relationship condition

0.54.12 (15-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.12/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.12/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Fixed an issue with the migration from RDS proxy to pg-bouncer not going smoothly

0.54.11 (14-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.11/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.11/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Temporary rollback in the way we capture events of running applications

0.54.10 (11-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.10/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.10/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Fixed an issue processing events of running applications

0.54.9 (10-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.9/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.9/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Finished the migration away from the deprecated EFS provisioner. The new EFS file system we use is encrypted as well.
  • [Airflow]: Make Airflow database migrations more robust by giving them more time to finish.
  • [UI]: Show the reason a job fails in the UI. We used to detect only out-of-memory issues; we have now expanded this to cover secrets issues, image pull issues, and more.

bugfixes

  • [General]: Fixed an issue with Spark event uploading where Spark sometimes did not clean up the in-progress file properly, resulting in two eventlog files on the system.
  • [Airflow]: Merged an upstream Airflow patch to fix an issue with get_next_data_interval so that it does not fail when there is no next_run defined yet. This will be fixed in future Airflow releases as well.

0.54.8 (05-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.8/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.8/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Enable KMS encryption for our SQS queues
  • [Airflow]: Upgrade to Airflow 2.2.1
  • [Spark]: Added Spark 3.2.0 image with support for Scala 2.13
  • [Template]: Upgrade templates to 0.15.1

bugfixes

  • [Airflow]: Fixed an issue where a spot interrupt could result in a false task success in Airflow for the ConveyorSparkSubmitOperatorV2
  • [General]: Fixed an issue with cleaning up Spark applications where the driver node gets interrupted
  • [General]: When using datafy run and printing big log lines, datafy run would crash. We now split these lines into multiple chunks fixing the issue.
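
The line-splitting fix above can be illustrated with a minimal sketch. This is not the CLI's actual implementation: the function name and the chunk size are assumptions chosen for illustration, and the real limit is not stated in these notes.

```python
from typing import Iterator

# Hypothetical limit: the chunk size actually used by the CLI is not documented here.
MAX_CHUNK_CHARS = 8192


def split_log_line(line: str, max_chars: int = MAX_CHUNK_CHARS) -> Iterator[str]:
    """Yield consecutive slices of `line`, each at most `max_chars` characters long."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not line:
        # Preserve empty lines instead of dropping them.
        yield line
        return
    for start in range(0, len(line), max_chars):
        yield line[start:start + max_chars]
```

Joining the chunks in order gives back the original line, so no log content is lost by splitting.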

0.54.7 (28-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.7/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.7/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Remove RDS proxy as we switched to using pg-bouncer instead
  • [General]: Migrate the dags volume to the encrypted EFS drive
  • [CLI]: Update used templates to 0.15.0

bugfixes

  • [General]: Fix slow deletion of environments. We were trying to delete files while they were still in use; we now make sure they are no longer in use before deleting them.

0.54.6 (27-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.6/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.6/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Upgrade node local dns to 1.21.1
  • [General]: Use 1.0.0 of secrets csi driver on the kubernetes cluster
  • [Airflow]: Added the opsgenie provider to airflow 2
  • [General]: Added support for spark 3.2.0
  • [CLI]: Added explanation to datafy run about the default execution-date used to start your job

0.54.5 (26-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.5/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [Airflow]: Migrate EFS logs storage to an encrypted volume
  • [General]: Make datafy run more robust, and add extra logging when something goes wrong

0.54.4 (22-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.4/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Upgrade to aws eks cni 1.9.3
  • [General]: Migrate to new encrypted EFS volume for spark events

0.54.3 (20-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.3/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [Airflow]: Support for environment variables when using the dbt task factory
  • [General]: Upgraded components of the K8s cluster to newer versions that use IMDSv2
  • [General]: Enable deletion protection on the RDS database instance used by Airflow
  • [General]: Enabled deletion protection and drop invalid headers on the ALB used by Conveyor

bugfixes

  • [General]: Fix bug where Spark applications were never cleaned from Kubernetes when the Spark event log directory was never created
  • [CLI]: Fixed an issue where the Conveyor yaml migration would update old projects to automatically use Airflow 2 validation. This now defaults to Airflow 1.10 validation again.

0.54.2 (15-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.2/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Make sure the necessary components for secrets are installed on all nodes
  • [General]: Give a bit more memory to the components managing the metrics and the logs to keep them from running out of memory
  • [General]: Make Airflow 2 validation the default for new projects, bringing it in line with Airflow 2 being the default for a new environment

0.54.1 (14-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.1/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Remove the old (and unused) NLB. We migrated to an ALB in our previous release but kept the NLB around to be able to roll back.

bugfixes

  • [CLI]: Fix datafy run and dag validation not working because of IAM credential issues.

0.54.0 (13-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.0/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [Airflow]: ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 added support for mounting secrets from SSM and Secrets Manager as environment variables.
  • [General]: Removed unneeded access to S3 from Conveyor clusters on other accounts
  • [General]: Increased S3 security by not allowing non-SSL requests on Conveyor artifacts bucket
  • [General]: Encrypt the root EBS volume of Kubernetes worker nodes
  • [CLI]: Update templates to 0.14.0

0.53.3 (28-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.3/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [UI]: Enable streaming UI by default, it was hidden behind a feature flag before
  • [UI]: Enable RBAC UI by default, it was hidden behind a feature flag before

0.53.2 (28-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.2/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [UI]: Fix issues with the logs page

0.53.1 (27-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.1/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [CLI]: Fix broken table output

0.53.0 (27-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.0/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Release spark streaming support
  • [General]: Upgrade to PostgreSQL 13
  • [Airflow]: Upgrade to Airflow 2.1.4
  • [General]: Make Airflow 2 the default version and deprecate Airflow 1 support. This means that Airflow 1 will be removed in a future release.
  • [General]: Upgrade the aws k8s cni to version 1.9.1
  • [General]: Adding support for mounting external environment variables

bugfixes

  • [Airflow]: Handle connection issues with kubernetes gracefully for V2 operators.

0.52.10 (16-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.10/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.10/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [Airflow]: Fixed an issue where a spark submit could be scheduled a second time making the job fail in Airflow. This only happened very sporadically.

0.52.9 (16-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.9/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.9/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [RBAC]: The Airflow and Spark UIs of a given environment are now only accessible for users who have the Operator/Contributor/Administrator role for this environment.

bugfixes

  • [General]: Several small bug fixes in the code used by the Airflow v2 operator.
  • [Airflow]: Fixing the dbt factory with dependencies on sources items.

0.52.8 (09-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.8/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.8/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [RBAC]: Added the operator role to an environment. This role allows users to view and operate the airflow of an environment, but does not allow them to deploy new releases to an environment.

bugfixes

  • [General]: Several small bug/performance fixes in the code used by the Airflow v2 operator.

0.52.7 (07-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.7/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.7/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Fixed a bug where Airflow thought a job had failed while the datafy UI showed that the application had finished correctly. While looking for this bug we enhanced our code to make it easier to figure out the root cause next time.

0.52.6 (20-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.6/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.6/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [Airflow]: Slack providers available in Airflow
  • [Airflow]: Tag filtering on dbt task factory
  • [CLI]: Selecting the airflow version for validation and run

0.52.5 (18-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.5/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [General]: Upgrade the eks cni used to 1.9.0

bugfixes

  • [Airflow]: Make sure the datafy runs button goes to the correct page
  • [Airflow]: Airflow 2.1 introduced a new kubernetes setting, worker_pods_pending_timeout, which by default kills pods that have been pending for more than 300s. This sometimes resulted in jobs being killed. We now set it to 600s by default and will look into making scheduling faster.

0.52.4 (17-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.4/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [CLI]: Asking for the task when no task is passed to the run command
  • [General]: Preparations for future releases

0.52.3 (13-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.3/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Fixed an issue where our operator managing airflow, spark-, and container-runs was crashing from a nil pointer. We fixed the issue and made sure it can't crash because of a single nil pointer.

0.52.2 (12-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.2/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [CLI]: Fixed a bug that broke the CLI

0.52.1 (12-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.1/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [CLI]: When CLI is out of date recommend using datafy update to update the CLI
  • [CLI]: Allow configuration files smaller than 1MB to be uploaded as part of the DAG folder

bugfixes

  • [CLI]: Added retry logic when fetching logs with datafy run fails.
  • [Airflow]: Fixed an issue where the Airflow 1 graph view would crash when using the ConveyorSparkSubmitOperator. Old task executions might still have issues, but new ones will work again.
  • [RBAC]: Fixed non-thread safe code in RBAC checking that resulted in a 500.
  • [CLI]: Update templates to 0.12.1

0.52.0 (02-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.0/conveyor_darwin_amd64.tar.gz

Release notes

features

  • [Airflow]: Added on-demand options for the Airflow Conveyor V2 Operators. Look at the docs of ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 to find out more.
  • [General]: We installed node local dns into the cluster, this should improve dns responsiveness and reduce errors. We also improved the robustness of the DNS setup
  • [Airflow]: Conveyor application runs button on operators now goes to the project's page instead of the environment's.
  • [Airflow]: Added s3 committer option for the Airflow Conveyor V2 Spark Operator, allowing the usage of the S3 magic committer. Look at the docs of ConveyorSparkSubmitOperatorV2 to find out more.
  • [CLI]: Added datafy update command, this will replace your current executable with the newest version available.
  • [CLI]: Fixed homebrew installation of zsh autocomplete, and added fish autocompletion via homebrew
  • [Doc]: Added documentation on how to install datafy completion scripts on linux.

0.51.0 (27-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.51.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.51.0/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: Added ConveyorExternalTaskSensor
  • [Airflow]: Upgrade to airflow 2.1.2
  • [General]: Memory optimisations on the tools running on the cluster
  • [General]: Upgrade the eks cni to the latest 1.8 version as recommended by aws, and corrected its settings
  • [General]: Upgrade eks cluster to version 1.21
  • [General]: Use containerd as a container runtime instead of docker
  • [Spark]: Released the following images with hadoop cloud support, and a python 3.8 installation with lzma support.
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1-python-3.8-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-python-3.8-v2
  • [Airflow]: Added acryl-datahub python package to the airflow images

bugfixes

  • [Spark History Server]: We give it a gigabyte more memory so it can keep up with a lot of spark jobs being scheduled
  • [General]: Fixed an issue when calculating memory for the datafy instance types

0.50.2 (13-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.2/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: Upgrade airflow 2 version to 2.1.1
  • [UI]: Added environment and project links to the task executions page

bugfixes

  • [General]: Fixed a bug where spark applications sometimes weren't cleaned up properly
  • [CLI]: The token was automatically refreshed 10 minutes after expiry, instead of 10 minutes before
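
The token refresh fix above comes down to the direction of the safety margin. The following is a minimal sketch of the intended check, not the CLI's actual code; the function name and signature are assumptions, and only the 10-minute margin comes from the note above.

```python
from datetime import datetime, timedelta

# The 10-minute margin comes from the release note; everything else is illustrative.
REFRESH_MARGIN = timedelta(minutes=10)


def needs_refresh(expires_at: datetime, now: datetime,
                  margin: timedelta = REFRESH_MARGIN) -> bool:
    """Return True once `now` is within `margin` of the expiry time, or past it."""
    # Subtracting the margin refreshes *before* expiry.
    # Adding it instead would reproduce the bug described above.
    return now >= expires_at - margin
```

With this check a token that expires at 12:00 is refreshed from 11:50 onwards, so a request never goes out with an expired token.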

0.50.1 (07-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [CLI]: Print the airflow version in environment list correctly
  • [CLI]: Print logs correctly for the datafy run command

0.50.0 (07-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: Added task executions to the project page
  • [UI]: The task executions page now shows the task type as an icon
  • [UI]: Allow task executions links to be opened in new tab
  • [General]: Enable image scanning on push on project ECR images
  • [General]: Upgrade eks to version 1.20
  • [General]: Upgraded to terraform 1.0.1
  • [Airflow]: The ConveyorContainerSensor has been released
  • [Spark]: released the following images with improved logging output, and support for hadoop 3.3.1 and spark 3.0.3. For migration to hadoop 3.3.1 see here.
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.2-hadoop-3.3.0-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.0-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1
  • [Airflow]: ConveyorSparkSubmitOperatorV2 now sets spark.hadoop.fs.s3a.aws.credentials.provider to com.amazonaws.auth.DefaultAWSCredentialsProviderChain by default
  • [CLI]: Cleaned up the output of all commands
  • [Template]: Update template to 0.11.0

bugfixes

  • [Spark History Server]: Improvements to make the spark history server more stable
  • [CLI]: Text was sometimes printing wrong when using a spinner resulting in words like: imagege, resourceses...

0.49.3 (22-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.3/conveyor_darwin_amd64.tar.gz

bugfixes

  • [General]: Fixed a small issue in Airflow authentication

0.49.2 (21-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [General]: Fixed an issue with cleaning up old builds

0.49.1 (18-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Fixed issues with rendering Airflow inside of our UI
  • [General]: Fixed an issue with cleaning up old builds

0.49.0 (17-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: Embedded airflow into the Conveyor UI. This means on the environment page you can look at Airflow without leaving the Conveyor UI. You can still open the Airflow UI full screen if you want to.

bugfixes

  • [UI]: Fixed issues with pagination in the UI

0.48.1 (16-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [RBAC]: Fixed datafy project run resulting in an unauthorized error with RBAC enabled.
  • [UI]: Fixed the project/environment role selection in the users panel being empty after assigning a role to a user

0.48.0 (11-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.0/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Adding often used project commands to the root command.
  • [General]: Improved the cleaning up of old builds, this should reduce the bill on your cloud account.
  • [Airflow]: Upgraded to Airflow 2.1.0. A known issue related to the auto-refresh on the dag tree view exists, and will be fixed in a next release of Airflow: https://github.com/apache/airflow/pull/16018/files
  • [General]: Upgraded the version of Terraform used by the Conveyor agent to 1.0.0.

bugfixes

  • [Airflow]: Fix CSRF issues after Airflow Web restart.
  • [Airflow]: ConveyorSparkSubmitOperatorV2 no longer results in an error when inputting an integer as value in the Spark config.
  • [CLI]: DAG validation no longer warns about Conveyor plugins in Airflow.
  • [Airflow]: An edge case is fixed in the V2 Operators where failed applications were not properly detected.
  • [CLI]: The description in datafy environment new and datafy environment update no longer says to enable experimental mode for Airflow 2 (as this is not the case anymore).
  • [General]: Fixed an issue where deleting a project could result in an update to an Airflow 2 environment failing.
  • [General]: Allow CI/CD token access to the Airflow 2 API.

0.47.2 (03-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.2/conveyor_darwin_amd64.tar.gz

features

  • [General]: The Airflow V2 operators and datafy project run now detect applications being evicted because of disk pressure and will warn you when this happens.
  • [CLI]: Added execution date support to datafy project run.

bugfixes

  • [UI]: The logs UI was fetching finished application logs in the wrong order, this is now fixed.

0.47.1 (02-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Spark]: Fix Spark history server upload when spark application names are very long.

0.47.0 (01-6-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: The UI received some new paint. We migrated to a new framework which will allow us to make a UI that is more uniform and easier to maintain.
  • [CLI]: Restructured the output of datafy project run to make it more focussed
  • [General]: Using datafy Airflow operator V2 or datafy project run will now warn you if you use a container image that can't be pulled.

bugfixes

  • [CLI]: Make datafy project run robust against print statements and logs in dags.
  • [Spark]: Fixed a bug where setting spark.executor.cores and spark.driver.cores resulted in an unauthorized error with the ConveyorSparkSubmitOperatorV2.

0.46.2 (27-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.2/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Turn on quiet flag with env variable QUIET=true.
  • [Airflow]: Changed our recommendation warning in the ConveyorContainerOperator when setting cpu/memory limits. In short you should not set these yourself.
  • [General]: Make spark.driver.cores and spark.executor.cores user editable in the ConveyorSparkSubmitOperatorV2. This can be useful for IO/CPU bound jobs.

bugfixes

  • [Airflow]: Fixed the ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 not sending application run events.

0.46.1 (26-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [General]: Fixed a bug where terraform resources wouldn't be destroyed when using datafy project undeploy
  • [Spark]: Fixed the executors page in the spark history server.
  • [UI]: Spark UI button was not working.

0.46.0 (26-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.0/conveyor_darwin_amd64.tar.gz

features

  • [Spark]: Added spark history server support when using the ConveyorSparkSubmitOperatorV2.
  • [CLI]: Clarify datafy project run documentation. This does not support deploying resources, this is made clear in the docs.

bugfixes

  • [General]: Long project, dag or task names could result in jobs not being scheduled. This is now fixed.
  • [General]: When pending applications were canceled they would stay in pending forever. They are now set to failed.
  • [Airflow]: Fixed an issue when running an extra datafy cluster in another region/account.

0.45.3 (20-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.3/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Airflow]: Fixed some caching issues with forwarding the Airflow UI

0.45.2 (20-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.2/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: Added ui colors to v2 operators
  • [Airflow]: ConveyorSparkSubmitOperatorV2 add support for env_vars
  • [Airflow]: ConveyorContainerOperatorV2 added support for legacy kube2iam way of assuming roles.

bugfixes

  • [Airflow]: ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 would fail when the project name contained an underscore. The service account that contains the project name now replaces underscores _ with dots .
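
The underscore fix above exists because Kubernetes service account names must be valid DNS subdomain names (lowercase alphanumerics, "-" and "."), which rules out underscores. A minimal sketch of the replacement and the validity check follows. Both function names are hypothetical, and how the rest of the service account name is built is not covered in these notes.

```python
import re

# DNS-1123 subdomain pattern, the rule Kubernetes applies to service account names.
DNS1123_SUBDOMAIN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_valid_subdomain(name: str) -> bool:
    """Check a name against the DNS-1123 subdomain rules (max 253 characters)."""
    return len(name) <= 253 and DNS1123_SUBDOMAIN.fullmatch(name) is not None


def project_name_for_service_account(project_name: str) -> str:
    """Hypothetical helper: replace underscores with dots, as described above."""
    return project_name.replace("_", ".")
```

For example, a project called my_project is not a valid name component as-is, while my.project is.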

0.45.1 (19-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.1/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: datafy project upgrade-dags also replaces the import for ExternalTaskSensor

bugfixes

  • [Airflow]: Fix issues with long airflow task names and V2 operators
  • [Airflow]: Fix issues with characters in task names that are not allowed on kubernetes for V2 operators
  • [UI]: Empty log pages will be skipped in cloudwatch to return the first non empty page that can be found
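
The two task name fixes above both stem from Kubernetes naming rules: many object names must be DNS-1123 labels, meaning at most 63 characters drawn from lowercase alphanumerics and "-", starting and ending with an alphanumeric. The sketch below shows one way to normalise an Airflow task name under those rules. It is an assumption for illustration only; these notes do not describe how the V2 operators actually derive names.

```python
import re

MAX_LABEL_LEN = 63  # DNS-1123 label length limit enforced by Kubernetes


def to_dns_label(task_name: str, max_len: int = MAX_LABEL_LEN) -> str:
    """Hypothetical helper: turn a task name into a valid DNS-1123 label."""
    # Lowercase, then replace every run of disallowed characters with a dash.
    cleaned = re.sub(r"[^a-z0-9-]+", "-", task_name.lower())
    # Truncate first, then strip dashes so the result starts and ends alphanumeric.
    cleaned = cleaned[:max_len].strip("-")
    if not cleaned:
        raise ValueError(f"task name {task_name!r} has no usable characters")
    return cleaned
```

Truncation can map two long task names to the same label, so a real implementation would likely append a short hash or otherwise guarantee uniqueness.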

0.45.0 (18-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.0/conveyor_darwin_amd64.tar.gz

The main feature of this release is a new version of our Airflow operators with datafy project run support. This allows you to start a job on the remote cluster from your local machine without having to build, deploy and clear the task in Airflow.

features

  • [Airflow]: Airflow 2.0 has been upgraded to 2.0.2
  • [Airflow]: Release ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2
  • [CLI]: datafy project run for airflow tasks that use the new V2 Operators
  • [CLI]: Dag validation also runs the NoAdditionalArgsInOperatorsRule from airflow, as undefined args give an error in Airflow 2.0.
  • [Documentation]: Updated documentation structure
  • [Template]: Set default templates version to 0.9.0.

bugfixes

  • [CLI]: Fixing support for resource templates
  • [General]: When deleting a project environments weren't properly updated.
  • [General]: Make our kubernetes setup more tolerant to spot interruption failures.

0.44.5 (03-5-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.5/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Airflow]: Extending the job_heartbeat_sec from 5s to 10s and configuring a db connection timeout of 30 seconds to avoid jobs failing when missing a heartbeat.

0.44.4 (29-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.4/conveyor_darwin_amd64.tar.gz

features

  • [Agent]: Upgrade our agent to use terraform 0.14.11

0.44.3 (28-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.3/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Internal]: Reduce CPU load on airflow manager

0.44.2 (26-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Some CSS changes made the logs font too big, some buttons overflow etc. We fixed these changes.

0.44.1 (26-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [General]: Airflow RDS proxy is not available in all regions, so disable its use in regions that don't have it available

0.44.0 (26-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.0/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: Airflow 2.0 has been made generally available.
  • [Airflow]: We use a new way of deploying to Airflow, further speeding up deployments.
  • [Airflow]: The ConveyorContainerOperator does not require you to fill in the name parameter anymore.
  • [Airflow]: For Airflow 2.0, enabled access to the API using a datafy token. You can get the token with datafy auth get; see the docs for more info.
  • [Airflow]: Set default parallelism to 64 up from 32, and default dag concurrency to 32 up from 16.
  • [CLI]: Remove experimental flag from cluster commands as this is GA.
  • [Template]: Set default templates version to 0.8.1.

bugfixes

  • [General]: Do not allow to deploy, undeploy, delete an environment that is being deleted.
  • [UI]: Show a friendly message when logs aren't available yet instead of a generic error.
  • [Airflow]: Fixed a bug that stopped dags from being synced.
  • [Airflow]: Added a proxy to the Airflow database so we can handle more connections to the database than before.

0.43.1 (12-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [General]: Update agent terraform to 0.14.10, and update the terraform locked providers

0.43.0 (12-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.0/conveyor_darwin_amd64.tar.gz

features

  • [DOCS]: Added Airflow 2.0 migration path to the docs.
  • [Airflow]: Support for Airflow 2 has been made available for environments with the experimental flag enabled.
  • [General]: Experimental environments can enjoy an even faster upgrade experience.
  • [General]: The experimental flag on an environment can no longer be disabled. This way we only have to make sure migrations work in one direction.
  • [CLI]: The command datafy project upgrade-dags, will update your dags for usage with Airflow 2. Your dags will still work on Airflow 1.
  • [UI]: Revamped the logging UI. It will now show the latest logs when a job is finished. You can also choose to see the latest logs when the job is running.
  • [Templates]: Upgraded the templates to 0.8.0.

0.42.0 (7-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.42.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.42.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: The experimental dag syncer has now become the default, making deploys at least twice as fast!
  • [General]: The terraform version used by our agent has been upgraded to 0.14. To enable this transition, we added an automatic upgrade process from 0.12 to 0.13 to 0.14 to the agent.
  • [General]: Added an experimental flag to environments to enable experimental features. This flag will be used in the future to test new features. At the moment, it is used to test a new way of deploying airflow.
  • [Airflow]: Upgraded airflow to version 1.10.15
  • [Airflow]: Added Conveyor macros validation to Conveyor dag validation. Macros should be changed from macros.env to macros.datafy.env
  • [Docs]: We added docs about airflow alerting using the ConveyorContainerOperator
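
The macro rename mentioned above (macros.env to macros.datafy.env) can be applied mechanically to DAG source text. The following is a minimal sketch, not part of the Conveyor tooling: the function name migrate_macros and the sample template string are hypothetical, and the way the macro appears in your own DAGs may differ.

```python
import re

# Matches the old macro path "macros.env" as a whole token, so that
# "macros.datafy.env" and names such as "macros.environment" are left alone.
OLD_MACRO = re.compile(r"\bmacros\.env\b")


def migrate_macros(dag_source: str) -> str:
    """Return the DAG source with macros.env rewritten to macros.datafy.env.

    Hypothetical helper for illustration only; review the result before
    committing it.
    """
    return OLD_MACRO.sub("macros.datafy.env", dag_source)


# Hypothetical snippet of a DAG file that still uses the old macro path.
sample = 'arguments=["--env", "{{ macros.env }}"]'
print(migrate_macros(sample))
```

The rewrite is idempotent: running it on source that already uses macros.datafy.env leaves it unchanged.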

bugfixes

  • [Airflow]: Fixed a bug when using the dag syncer where dag files were unavailable for a short time.

0.41.0 (24-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.41.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.41.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Upgraded to EKS 1.19. If you have the following anywhere in your DAG, you can safely remove it:
security_context={
"fsGroup": 185,
}
  • [General]: Updated the documentation url to https://docs.datafy.cloud
  • [Airflow]: Airflow 2.0 warnings during dag validation
  • [Airflow]: Made the Airflow UI more stable by running the http proxy component on fargate.
  • [General]: Send less information to cloudwatch to reduce costs.

bugfixes

  • [CLI]: Ignore the possible __pycache__ folder in the dags folder when building a project
  • [UI]: Fixed an issue where we failed to parse the logs from cloudwatch for display in the UI.

0.40.2 (19-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Fixed an issue where certain logs couldn't be shown in the UI

0.40.1 (16-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Fixed an issue where the newer and older logs buttons were not working anymore
  • [Airflow]: Fixed an issue with experimental dag sync

0.40.0 (12-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.0/conveyor_darwin_amd64.tar.gz
caution

If you are using resources with datafy, we now add a default provider "aws" when deploying your code. If you have a provider "aws" without an alias in your code, this can break things. To fix it, rename your provider to use an alias or remove it.

features

  • [CLI]: Added build arg support to the CLI
  • [UI]: The application runs page now has a spark UI button, and the logs page is now the default instead of the metrics page
  • [UI]: Application runs are now clickable on the environment page
  • [Airflow]: Added an experimental dag sync way of deploying dags to airflow. This should make deploys considerably faster (in the range of 20 to 40 seconds per deploy). You can test this in your development environment by updating the environment with the flag --experimental-dag-sync=true: datafy environment update --name YOURENVIRONMENT --experimental-dag-sync=true. It is not recommended for production, however.
  • [Spark]: Released spark 3.1.1 images: public.ecr.aws/dataminded/spark-k8s-glue:v3.1.1-hadoop-3.3.0, datamindedbe/spark-k8s-glue:v3.1.1-hadoop-3.3.0

bugfixes

  • [CLI]: When creation of a project fails we now clean up the project that was generated if you used a template
  • [UI]: When pressing the refresh logs button on the main logs page, the URL path would get into an undefined state, so you couldn't share the URL. This is now fixed.

0.39.1 (01-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Airflow]: Fixed an issue rolling out or creating a new airflow environment

0.39.0 (01-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: Added a refresh button to the application run log view
  • [General]: The image field does not need to be specified to use Conveyor Airflow operators anymore, it uses the name of the project by default.
  • [General]: There is no restriction anymore on the IAM role names that the Conveyor operators can use. There used to be a constraint where the role needed to have the prefix datafy-dp-{env} but not anymore.
  • [General]: Conveyor instance types now support all spark memory options
  • [General]: We updated spark to version 3.0.2. A new docker image is available, see here.
  • [Templates]: Updated the templates to use the spark 3.0.2 image.
  • [Documentation]: Updated the CI/CD authentication documentation, see here
  • [Documentation]: Added a documentation page about setting up Conveyor using WSL2 on Windows, see here

0.38.0 (15-2-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.38.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.38.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Allow the terraform resources to attach existing AWS policies
  • [General]: Support datafy instance types - see here and here
  • [General]: Optimise IP usage by the Kubernetes cluster
  • [Templates]: Update templates to the latest version - See here for release notes

0.37.0 (29-1-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.37.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.37.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: Added logs to the application runs UI. From now on you can view the logs of your application directly in the Conveyor UI instead of being redirected to AWS cloudwatch.
  • [CLI]: Added the --no-browser flag to the CLI. This prevents the Conveyor CLI from opening a browser automatically and instead prints the URL for logging in.

bugfixes

  • [CLI]: Fixed wrong upload message when deploying a project.
  • [CLI]: Print an error when files needed can't be read during project build instead of silently crashing
  • [General]: Fixed one last small instance where we used a docker hub image instead of public ECR
  • [General]: If you used project resources and still had a state.tf file, things would break because of a new terraform kubernetes provider release. This is now fixed.

0.36.0 (15-1-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.36.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.36.0/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: We are migrating certain pieces to be Airflow 2.0 compatible. This should have no impact on your environment:
    • Migrated our spark submit operator to use the backported spark submit operator from Airflow 2.0.
    • Migrated the kubernetes executor config to be Airflow 2.0 compatible.
  • [CLI]: Upgraded to datafy-template 0.4.0 which uses the new import paths for operators used in Airflow 2.0
  • [General]: Migrated to use public ECR where possible

bugfixes

  • [UI]: In the metrics page certain charts were not properly unloaded, resulting in a memory leak and a slow page
  • [UI]: Fixed a bug where the metrics page could crash when no spark executors were started up.
  • [UI]: Fixed a small bug where we were showing a wrong message in the metrics page when your application was not running for long enough
  • [General]: removed m5zn type aws instances from autoscaling groups as they gave issues with kubernetes.
  • [CLI]: Skip airflow dag validation in datafy project build when the project does not contain dags
  • [CLI]: Fixed the wrong error message when trying to apply a non-existing template in datafy template apply

0.35.1 (30-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.1/conveyor_darwin_amd64.tar.gz

features

  • [General]: Migrated to use the new GP3 volumes instead of GP2. These are 20% cheaper and provide more IOPS at the same throughput. This will make IO intensive jobs on the platform faster.
  • [General]: The kubernetes autoscaler will now downscale nodes after 5 minutes instead of 10 minutes.
  • [General]: Made our spark images available on ECR public: https://gallery.ecr.aws/dataminded/spark-k8s-glue
  • [CLI]: Updated to the latest datafy templates release

bugfixes

  • [UI]: Changed the airflow logo to the new flat version
  • [UI]: Fixed the Conveyor logo to work on all platforms (unix, mac and windows)
  • [UI]: Spark jobs with a lot of executors (40+) could not show all of their executors in the legend of the metrics page. This has been fixed.
  • [UI]: Metrics page, failure reason did not show properly when hovering over the information icon.
  • [General]: Better cleanup of resources in the kubernetes cluster after building airflow images
  • [General]: Installation of the Conveyor cluster would fail in an existing VPC network. This has been fixed
  • [General]: Migrated the installation of the kube2iam Helm chart to the new version

0.35.0 (22-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.0/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: Upgraded airflow to 1.10.14. This should fix some scheduling issues users experienced with depends_on_past or task_concurrency. For more information see the airflow release notes.
  • [General]: Made environment rollout 10 to 20s faster by caching the terraform providers used when rolling out an environment.

bugfix

  • [CLI]: Dag validation was failing when airflow variables were used in the dags. This has been fixed in this release.

0.34.1 (09-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.1/conveyor_darwin_amd64.tar.gz

features

  • [General]: Added an experimental feature that allows you to restrict which aws roles a project can use. This feature is hidden behind an experimental flag and can be enabled in the CLI by setting the environment variable CONVEYOR_EXPERIMENTAL=1. More documentation on how to use this feature will follow later.
  • [General]: We added the new m5zn instance to our spot pools for kubernetes. That way we have even more instance types to choose from, which should result in fewer spot interrupts.
  • [Templates]: Update to templates version 0.3.2

bugfix

  • [General]: Because of the rewrite of our control plane we had a regression. When you delete a project we normally trigger it to be removed from all environments. This behaviour has been reinstated.
  • [CLI]: Airflow dag validation failed if you imported, for example, a utils file from your dags folder into your DAGs. We now do validation correctly so that this isn't mistakenly flagged as an issue.
  • [CLI]: Airflow dag validation failed when you were using it outside the aws eu-west-1 region. This has been fixed.
  • [UI]: When you opened an application run of a spark application without executors, the UI would still try to fetch the metrics for these executors, resulting in a crash.

0.34.0 (04-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.0/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Added dag validation to the CLI. When you do a datafy project build a dag validation phase is done first. This just checks if your DAG can run on airflow. You can skip this phase if you want to. You can also use the datafy project validate-dags to validate the dags of your project without doing a build.
  • [Airflow]: Upgraded Airflow to 1.10.13
  • [Doc]: Added documentation on how to run a container as non-root with the ConveyorContainerOperator.
  • [General]: Upgraded the k8s environment to EKS 1.18
  • [General]: We now run the k8s autoscaler as a high priority pod, so that it always gets priority. This makes sure the cluster can always autoscale when needed.
  • [General]: We enabled the free autoscaling metrics for our autoscaling groups. That way you can see when the limit is reached, and we can more easily recommend when to scale it up.

bugfix

  • [CLI]: The Conveyor CLI returned exit code 0 when a build failed. This has now been fixed to return an exit code 1. This makes it easier to chain multiple commands or to find out in CI/CD that a build has failed.
  • [Airflow]: When you manually trigger a job and then use the Conveyor application runs button you would find nothing. This has now been fixed.
  • [General]: Updating a project description failed when there were more than 256 characters used. We updated this field to take a bigger amount of text.
  • [General]: Fixed the Conveyor logo in the UI
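
Because a failed build now returns a non-zero exit code, a CI/CD script can stop at the first failing step. The following is a minimal sketch of that pattern, assuming only that each step is a command whose exit code is 0 on success. The helper name run_until_failure is hypothetical, and the commands used here are stand-ins so the example runs without the CLI installed.

```python
import subprocess
import sys
from typing import Sequence


def run_until_failure(steps: Sequence[Sequence[str]]) -> int:
    """Run each command in order and stop at the first non-zero exit code.

    Returns 0 when every step succeeded, otherwise the failing exit code.
    """
    for step in steps:
        completed = subprocess.run(list(step))
        if completed.returncode != 0:
            return completed.returncode
    return 0


# Stand-in commands (assumptions for illustration). In a real pipeline the
# first step would be something like ["datafy", "project", "build"].
failing_build = [sys.executable, "-c", "import sys; sys.exit(1)"]
next_step = [sys.executable, "-c", "print('deploying')"]

exit_code = run_until_failure([failing_build, next_step])
print(f"pipeline exit code: {exit_code}")
```

Since the first stand-in command fails, the second one is never started, which is the behaviour you want when a deploy should only follow a successful build.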

0.33.1 (20-11-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.1/conveyor_darwin_amd64.tar.gz

bugfix

  • [UI]: Fixed a bug where we showed wrong finished date in application run detail page
  • [UI]: We showed a NaN duration when the application was not yet finished in the application run detail page. Now we show the current duration.
  • [UI]: We showed a message that no metrics were available yet while they were visible. This is now fixed.
  • [General]: By accident we used an image from docker hub that ran into rate limiting. We now also copied that one to ECR.

0.33.0 (20-11-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: We now offer CPU/Memory metrics of your jobs running on Conveyor. If you go to the application runs page, you can open the metrics for every run.
  • [CLI]: We now show a warning when you are trying to deploy new dags or resources for your project, but first forgot to do a build.
  • [CLI]: When an undeploy or promote fails we now show you the last event on that environment, similar to how deploys work.
  • [General]: We have updated our API to a new technology to be sure we can continue growing. This means that old versions of the CLI (< 0.30.0) might no longer fully work, so please upgrade to the latest CLI version.

0.32.0 (16-11-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.32.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.32.0/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Conveyor project build now uses the same credentials as the docker CLI, allowing you to pull images from private registries.

  • [CLI]: The datafy project undeploy and datafy project promote commands now also show the latest event when they fail, like the datafy project deploy command.

  • [General]: Send CPU and Memory metrics of jobs to cloudwatch. Later we will make these metrics available in the UI.

  • [General]: Mirrored the spark images from docker hub to ECR and shared them with our customers. This is a temporary fix for docker hub rate limiting until aws releases their solution. For more info read the following article: https://aws.amazon.com/blogs/containers/advice-for-customers-dealing-with-docker-hub-rate-limits-and-a-coming-soon-announcement/ The repository is:

    776682305951.dkr.ecr.eu-west-1.amazonaws.com/datafy/data-plane/mirroring/datamindedbe/spark-k8s-glue

0.31.2 (30-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Application runs]: The application runs app was crashing because of a value that overflowed a 32bit integer

0.31.1 (30-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Agent]: When manually deleting a role created with project resources, the agent would fail to apply the terraform update because of insufficient AWS IAM rights
  • [Airflow]: There was an error popping up in airflow that did not fail your job but was confusing. We have fixed this, so you should not see this error anymore.
  • [Templates]: Bumped templates to the next version, this fixed problems with the dbt template: https://github.com/datamindedbe/datafy-templates/releases/tag/0.3.1

0.31.0 (30-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Added description field to environment.
  • [UI]: Added a markdown editor for the description field for project and environment. This allows you to give users more context to your projects and environment.
  • [UI]: Added a git repository field to projects. By filling in this field we can provide links to the actual git hashes that deployments were built with.
  • [Doc]: Added more info on the available parameters for the Conveyor spark and container operators
  • [General]: We mirrored all images used in the kubernetes cluster on ECR, since docker hub has started rate limiting users.

bugfixes

  • [General]: When deleting a project, something went wrong in our backend, causing the project to remain stuck in deleting mode
  • [CLI]: When something went wrong with generating projects we sometimes produced a very cryptic long error message. The unnecessary parts were cleaned up and the error message is smaller now.

0.30.0 (23-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.30.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.30.0/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Improved date printing to be less verbose
  • [General]: Added created at field to all objects. These can be seen in the UI and the CLI.
  • [General]: Added last activity field to projects. This field shows the last build/deployment done for a project and can be used to see how inactive a project is. This can help you decide if a project is actively developed or not.

bugfixes

  • [CLI]: Creating a new project from a template in a git repository was broken. This is now fixed.

0.29.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Internal]: This is a release with only internal changes in the way we capture metrics

0.29.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [CLI]: When you created a build and had untracked files, you would get a non-dirty git hash. This has been fixed.
  • [General]: Fixed a bug where if an application got in a certain state the airflow ui would stop working and the application runs would stop updating.

0.29.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: We now add git information to a build. When you do datafy project build we store the git hash that has been used to create this build. If you have uncommitted changes we add the .dirty suffix. You can see this information when you use datafy project deployments or datafy environment deployments or in the UI. If the build is done outside a git repo no information is added. You need to upgrade to the CLI 0.29.0 to take advantage of this feature.
  • [General]: Better project and environment cleanup. When removing a project we now delete the associated files on s3. We also clean up the associated ssm parameters we are using to track deployed versions on environments.
  • [Documentation]: The spark 3 docker images don't run under root anymore, which makes pip install fail. We documented the way to do this properly here.
  • [Documentation]: Added documentation on the changes in the spark 3.0.1 image here.
  • [Templates]: Upgraded to the latest version of the templates, see release note here.
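
The git information described above can be approximated with two git commands. The following is a minimal sketch, not the CLI's actual implementation: the function names build_label and describe_working_tree are hypothetical, and the exact label format (hash followed by .dirty) is an assumption based on the description in this release.

```python
import subprocess
from typing import Optional


def build_label(commit_hash: str, has_uncommitted_changes: bool) -> str:
    """Append the .dirty suffix when the working tree has local changes."""
    return f"{commit_hash}.dirty" if has_uncommitted_changes else commit_hash


def describe_working_tree(repo_dir: str = ".") -> Optional[str]:
    """Return a build label for repo_dir, or None outside a git repository."""
    try:
        commit = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        ).stdout.strip()
        # --porcelain prints one line per modified or untracked file and
        # nothing at all for a clean working tree.
        status = subprocess.run(
            ["git", "status", "--porcelain"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        ).stdout
    except (subprocess.CalledProcessError, OSError):
        # Not a git repository, or git is not installed: add no information.
        return None
    return build_label(commit, bool(status.strip()))


print(build_label("abc1234", True))
```

Using git status --porcelain means untracked files also mark the tree as dirty.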

bugfixes

  • [CLI]: Fixed a typo in delete environment description.

0.28.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.28.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.28.0/conveyor_darwin_amd64.tar.gz

features

  • [Documentation]: We got a bug report that in certain use cases logs for a ConveyorContainerOperator did not show up in both Airflow and Cloudwatch. The reason for this is that python print statements don't behave properly in a production environment. It's best to use the python logging framework. For more information see the FAQ.
  • [Airflow]: We only keep the last 14 days in airflow logs. This is the same retention we have in cloudwatch now. This should make the logs volume smaller and result in a cost reduction.
  • [Airflow]: We added a liveness probe to the airflow scheduler to check if it's still running and restart it if it isn't.
  • [UI]: When your ConveyorContainerOperator application dies with an out of memory error we show this in the UI. When you use the ConveyorSparkSubmitOperator we show you when your driver has died with an out of memory error.
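
As an illustration of the logging recommendation above, here is a minimal sketch of a job that uses the standard python logging module instead of print. The logger name my_job and the helper get_job_logger are hypothetical. One common reason print output goes missing is that stdout is block-buffered when it is not attached to a terminal, whereas logging.StreamHandler flushes after every record; whether that is the exact cause in your case is an assumption.

```python
import logging
import sys


def get_job_logger(name: str = "my_job", level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes formatted records to stdout.

    The handler is attached only once, so calling this function repeatedly
    does not produce duplicate log lines.
    """
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    # Avoid printing each record a second time through the root logger.
    logger.propagate = False
    return logger


logger = get_job_logger()
logger.info("Job started")
logger.warning("Input table is empty, nothing to process")
```

Records carry a timestamp and level, which also makes them easier to filter once they arrive in the log store.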

bugfixes

  • [Airflow]: There was a bug in the airflow UI that could show you were logged in under another user. This is now fixed.
  • [Airflow]: When you have a DAG in airflow with many tasks, the frontend code in airflow makes it very slow to open the modal when you click a task. We set the standard number of runs in the tree view to 5, down from 25, as this helps the javascript code to be faster. See this ticket in airflow Jira.
  • [CLI]: When building a docker image that takes longer than 15 minutes you got a timeout error. This limit has been raised to an hour.
  • [General]: When deleting an environment we now also clean up the database associated with it.

0.27.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.27.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.27.0/conveyor_darwin_amd64.tar.gz

features

  • [Spark]: Release spark image with spark 3.0.1 and hadoop 3.0.0 support

bugfixes

  • [Airflow]: The official airflow image runs under user 50000, which is more secure than running under root. However, all airflow logs were still owned by user root, so we added a chown to the scheduler to update their ownership. This can take a very long time for production airflow instances. We now do this separately from the scheduler and only once, so the scheduler will start up fast again once we have upgraded.
  • [Airflow]: The list task instances page had log buttons that redirected to the wrong page. This is now fixed.
  • [Airflow]: We got a bug notification that editing a dag run showed a forbidden page. However editing dag runs has been deprecated in airflow. The button will be removed in a later release. For more info see here and here.
  • [Templates]: We upgraded to version 0.2.2 of the templates that contains 3 bugfixes. For more information look here.
    • The pyspark template spark 2.4 support was fixed.
    • When using the python image when you don't want role management we clean up the terraform files
    • We upgraded to spark 3.0.1 and hadoop 3.0.0
  • [General]: Creating projects with names that start with a dash (-) or underscore (_) resulted in a failure. Such names are no longer allowed.

0.26.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.26.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.26.0/conveyor_darwin_amd64.tar.gz

features

  • [Agent]: We reworked the agent in the background to be more future proof. Normally this should have no impact for you but it allows us to make more/faster progress in the future. We also created more end-to-end tests to have more quality checks before we release.
  • [Airflow]: We upgraded to the official airflow image, see here. Again this should have no impact for users but allows us to make fast progress in the future.
  • [General]: The bastion used for datafy forward has been removed since it is not needed anymore. This will result in a small cost saving.

bugfixes

  • [Airflow]: When going to view a rendered task instance you got an error. This has now been fixed and rendered task instance can be shown in the UI.
  • [CLI]: On linux, generating templates resulted in files being created with root ownership. This has been fixed.
  • [CLI]: The unlock, get and events commands did not work on environments in a failed state. This has been fixed.

0.25.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [CLI]: Fixed a bug where datafy template apply did not work with resource templates

0.25.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

features

  • [Airflow]: Upgraded airflow to 1.10.12
  • [Airflow]: Upgraded to use the RBAC UI of airflow. The RBAC UI of airflow is the new UI. Only this UI will receive updates and new features. In airflow 2.0 the previous UI will be deprecated.
  • [Airflow]: Added links to the datafy application runs dashboard from airflow when you select a task in airflow. See picture below. It automatically filters to show the runs of this dag and task.

bugfixes

  • [General]: Show better error message when deleting an environment that has deletion protection turned on

Link to Conveyor application runs from a task run

0.24.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.2/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugfixes

0.24.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.1/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugfixes

  • [CLI]: Small bugfix where deletion protection wasn't applied when updating an environment

0.24.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

features

  • [Templates]: The templates used by Conveyor have been open sourced: https://github.com/datamindedbe/datafy-templates. You can use these as an example of how to develop your own templates or just have a look at the templates. Suggestions for improvements can be made through the issues on GitHub, or you can even make a pull request! 🥳
  • [UI]: Filtering the application runs page on the execution date and the started-at date of jobs is now supported.
  • [Airflow]: Conveyor spark submit operator now passes the following configuration by default: "spark.hadoop.fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem" . That way users can use s3 instead of s3a without any problems.
  • [Airflow]: Airflow has been upgraded to 1.10.11
  • [CLI]: If something went wrong while deploying your project resources or rolling out your environment, you can now unlock the environment with datafy environment unlock --name ENV
  • [CLI]: We added deletion protection to environments. If deletion protection is enabled, users cannot delete that environment. This is useful to protect environments like production and development. You can enable it when creating an environment with datafy environment new --name ENV --deletion-protection=true or on an existing environment with datafy environment update --name ENV --deletion-protection=true
  • [General]: Kubernetes has been upgraded to 1.17
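
The default Spark configuration mentioned above can be pictured as a dictionary of defaults that is merged with whatever the user passes. The sketch below is only an illustration: the helper name merge_spark_conf is hypothetical, and the assumption that user-supplied values override the default is ours, not something the release note states.

```python
# The default key and value come from the release note; the rest is illustrative.
DEFAULT_SPARK_CONF = {
    "spark.hadoop.fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
}


def merge_spark_conf(user_conf=None):
    """Return the defaults overlaid with the user's settings.

    Assumption: a value supplied by the user wins over the default.
    """
    merged = dict(DEFAULT_SPARK_CONF)
    merged.update(user_conf or {})
    return merged


# A user who only sets executor instances still gets the s3 -> s3a mapping.
conf = merge_spark_conf({"spark.executor.instances": "2"})
```

With this mapping in place, Hadoop resolves s3:// paths with the S3A filesystem implementation, so the same code can read s3:// and s3a:// URLs.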

0.23.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.2/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugfixes

  • [UI]: Fixed a bug where, in some cases, we connected to the wrong region in the UI when opening CloudWatch logs

0.23.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.1/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugfixes

  • [CLI]: datafy environment deployments returned the wrong output
  • [CLI]: datafy environments events returned the wrong output
  • [CLI]: By accident, datafy project get also had ls as an alias, resulting in a conflict with datafy project list

0.23.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

features

  • [CLI]: It is now possible to use custom templates with the CLI. We support local templates in a directory and templates from a git repository. When using git you can provide a tag or branch and a directory in your git repo. To find out more, check datafy template apply --help or datafy project new --help. We hope this feature will allow our customers to make their own templates and bootstrap internal projects more quickly. This is also done in preparation for open sourcing the Conveyor templates.
  • [CLI]: The dependency on cookiecutter from the CLI has been removed. This is one thing less people have to install to get started with Conveyor. Instead we use docker to run a container with cookiecutter installed. We use the following image by default: https://hub.docker.com/r/datamindedbe/cookiecutter. But you can always specify your own if your templates need more to be installed.
  • [CLI]: We went over all commands to provide more descriptive explanations and examples when doing --help
  • [DOC]: Documented a new pattern for using common airflow code.
  • [General]: We made the deploy to an environment roughly 15 to 20 seconds quicker by optimizing some steps.
  • [Airflow]: Set min_file_process_interval to 30s to lessen the load on the Airflow database. This results in jobs triggering a bit slower but makes usage of Variable.get("VARIABLES") less of a problem for the database.
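
To see why a larger min_file_process_interval reduces database load, here is a rough back-of-the-envelope sketch. It assumes every parse of a DAG file runs its top-level Variable lookups once, and it treats the interval as an upper bound on how often a file is parsed (the scheduler may parse less often). The function names and the example numbers are illustrative only.

```python
def max_parses_per_hour(min_file_process_interval_s):
    """Upper bound on how often a single DAG file is parsed per hour."""
    return 3600 // min_file_process_interval_s


def max_variable_lookups_per_hour(dag_files, lookups_per_file,
                                  min_file_process_interval_s=30):
    """Upper bound on database lookups caused by top-level Variable access."""
    return (dag_files * lookups_per_file
            * max_parses_per_hour(min_file_process_interval_s))


# Example: 20 DAG files with 3 top-level lookups each.
at_30s = max_variable_lookups_per_hour(20, 3, 30)
at_5s = max_variable_lookups_per_hour(20, 3, 5)
```

In this example the 30 second interval caps the lookups at a sixth of what a 5 second interval would allow.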

bugfixes

  • [General]: Because of the way we handled events in our API, your project could remain stuck in the creating state while it was actually created. We have fixed this on our side.
  • [UI]: When switching tabs on an environment page, the application runs wouldn't reload with the latest changes. This has been fixed.

0.22.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.2/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugfixes

  • [CLI]: We changed the authentication so you don't have to configure it anymore as a first time user. This broke the flow where you set a key and a secret for authentication (for example in CI/CD). This has been fixed in this release.

0.22.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.1/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugfixes

  • [UI]: Fixed an issue where changing page resulted in filters being reset in the application runs page.
  • [UI]: Fixed a pagination issue on the application runs page where we started at the wrong page.

0.22.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

features

  • [General]: New application logs page where you can see runs and filter on them. Runs are stored up to 2 weeks and you can see the logs of a failed run for up to 2 weeks.
  • [General]: Made our Kubernetes cluster a bit less heavy by trimming off some excess deployments. This means we need fewer nodes to do the same work.
  • [CLI2]: Added a datafy completion command that can generate completions for the shell of your choice. Bash, Zsh and Fish are supported. To find out how to configure it please use datafy completion --help
  • [CLI2]: Migrated the CLI to a new authentication configuration. This means that as a first-time user you don't need to configure authentication anymore. If federated login is set up for your company you can use that; otherwise you have to use the account created for you.
  • [CLI2]: The CLI no longer needs AWS rights to do a build. This means you can be logged in to another AWS account while still being able to do a datafy project build

bugfixes

  • [General]: Deleting environments failed. We have fixed this in this release.

0.21.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.21.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.21.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.21.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Upgrade k8s to 1.16
  • [General]: Switch log aggregation from fluentd to fluent-bit. Fluent-bit is more memory efficient than fluentd, so this means we have less overhead per node.
  • [CLI]: The old CLI 1, which is installed as a Python package, is deprecated in favour of CLI2. For now we will keep it around and fix bugs if needed. It will be unsupported in the future.
  • [CLI2]: The Homebrew install of CLI2 now installs it as the datafy command instead of the godatafy command.
  • [CLI2]: Removed Conveyor forward
  • [UI]: Removed the old UI in favour of https://app.conveyordata.com

0.20.2

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.20.2.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [General]: Small bugfix in a new application that gave trouble when it ran in a US region.

0.20.1

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.20.1.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Fixed a bug where task executions did not show up

0.20.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.20.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Spark 3 support. We have updated our templates to support both Spark 2.4 and 3.0. There are some small migration steps needed as described here. You can find out what is new in Spark 3.0 here.
  • [Templates]: Added a description to the generated terraform variables.
  • [Docs]: Added documentation on max memory supported on Conveyor.

bugfixes

  • [CLI2]: Made create project call more robust when using a template. We first generate the template and then create the project.
  • [CLI2]: Fixed an issue where users were getting the wrong tokens.

0.19.1

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.19.1.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [CLI2]: Changed the way templates ask for input as the previous version proved to be unstable
  • [UI]: The Spark UI did not proxy the executors page correctly. This has been fixed and the page should now work correctly.
  • [Airflow]: The Conveyor environment link was not working anymore and is fixed in this release.

0.19.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.19.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: New UI hosted on https://app.conveyordata.com. If your authentication has been correctly configured this can be used once your data plane has been upgraded. After that you never have to use the forward command again! For now we keep it working as a fallback but it will be removed in a future upgrade. For the forwarding of airflow to work correctly you should also trigger an airflow deployment to upgrade airflow to the new configuration needed.
  • [Airflow]: Changed the configuration to be able to host the new UI

0.18.1

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.18.1.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [CLI2]: Bugfix for the forward command. AWS does not accept public keys with a newline at the end, so we now send the key without the trailing newline.
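
Key files on disk usually end with a newline, which is how this kind of bug tends to appear. A minimal sketch of the fix described above, assuming the key is available as a string; the function name is hypothetical and this is not the CLI's actual code (CLI2 is not written in Python):

```python
def strip_trailing_newline(public_key: str) -> str:
    """Remove trailing newline characters from a public key string.

    Only line endings at the end are removed; the key material and the
    comment field are left untouched.
    """
    return public_key.rstrip("\r\n")


# Illustrative, shortened key; real keys are much longer.
cleaned = strip_trailing_newline("ssh-rsa AAAAB3Nza user@host\n")
```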

0.18.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.18.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.0/conveyor_darwin_amd64.tar.gz

features

  • [infra]: Reduced costs by disabling certain unused private endpoints.
  • [Agent]: Changed the terraform output to be more readable.

bugfixes

  • [CLI2]: Found multiple changed files when applying a template. We changed the way we are applying templates, so this nasty bug should not happen again.
  • [CLI]: Deleting a project deleted the .datafy home folder. When deleting a project, the CLI tried to clean up the project folder but could accidentally delete the datafy home folder instead.
  • [API]: We now properly validate project names to make sure you cannot create a project with an unsupported name

0.17.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.17.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.17.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.17.0/conveyor_darwin_amd64.tar.gz

features

  • [Documentation]: Document the list of spark images we have available
  • [CLI2]: when a deploy fails automatically print the latest event on the environment
  • [Templates]: Added scala 2.12 option for the scala template
  • [UI]: show the airflow execution timestamp in the application logs overview
  • [CLI][CLI2]: Change default project workflow_start_date to yesterday
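
The new default for workflow_start_date can be sketched as "today minus one day". This is only an illustration: the function name is hypothetical, and whether the CLI computes "today" in UTC or in local time is not stated in these notes.

```python
from datetime import date, timedelta


def default_workflow_start_date(today=None):
    """Return yesterday's date, relative to `today` (defaults to the current date)."""
    today = today or date.today()
    return today - timedelta(days=1)


# Passing a fixed date makes the result reproducible.
start = default_workflow_start_date(date(2020, 3, 1))
```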

bugfixes

  • [CLI2]: Deploy a build id to an environment failed, now it works again
  • [Templates]: The pyspark template contained a bug. This has been fixed.

0.16.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.16.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.16.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.16.0/conveyor_darwin_amd64.tar.gz

features

  • [Projects]: Support for Spark 3
  • [CLI]: Graduation of the experimental CLI to CLI2
  • [CLI2]: Support for JSON as output format
  • [API]: Improved event output
  • [Agent]: Configurable agent rights
  • [Documentation]: Documentation on the various authentication options
  • [Documentation]: Brand new tutorial

bugfixes

  • [CLI2]: create .datafy folder if it does not exist
  • [CLI2]: fix path bug when applying templates
  • [CLI2]: return meaningful error messages
  • [CLI2]: CI/CD support

0.15.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.15.0.tar.gz

features

  • [UI]: New read only Conveyor UI, get an overview of environment and projects on this new release of the UI
  • [UI]: Put a link to the documentation in the UI
  • [Experimental CLI]: Added the undeploy and promote commands to the experimental CLI
  • [Experimental CLI]: Added the new authentication flow to the experimental CLI
  • [API]: Always return projects and environments sorted, this way you get a consistent view every time you make a call on the CLI or open the UI

0.14.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.14.0.tar.gz

features

  • [Airflow]: Added a link to the Conveyor environment that the airflow instance belongs to.
  • [Templates]: Updated the python template to include resources.
  • [General]: We clean up old builds of airflow automatically.
  • [Documentation]: Documented how you can use a service account with the Conveyor Container Operator

bugfixes

  • [Airflow]: Made the spark submit operator a bit more robust. We had reports of users seeing it try to create certain resources twice. This should be fixed now; if not, please let us know.

0.13.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.13.0.tar.gz

features

  • [CLI]: added support to use federated authentication for the CLI. This means we can use the preferred login method of your company (e.g. OneLogin or Okta). This still needs manual configuration from our side so we will contact our customers one by one to set this up and test this out.
  • [CLI]: added support for promotion-based deployments using the datafy project promote command
  • [Experimental CLI]: our experimental cli did not support all flags for project deploy. It is now supported to deploy a build to an environment with the experimental cli.
  • [Experimental CLI]: Installation through Homebrew is now possible for Mac users: brew install datamindedbe/datafy-formulas/datafy. This installs it as the executable godatafy so it can peacefully coexist with the old Conveyor CLI
  • [General]: We upgraded our runtime to kubernetes 1.15.
  • [Airflow]: Upgrade to 1.10.10

bugfixes

  • [Experimental CLI]: The experimental CLI worked differently with regards to the region configuration. It has been fixed to automatically use your aws region config now.
  • [Experimental CLI]: The experimental CLI did not contain all templates or the latest versions of them.

0.12.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.12.0.tar.gz

features

  • [CLI] undeploy a project from an environment, use datafy project undeploy
  • [CLI] A new experimental version of the CLI written in Go is available. We noticed that all the dependencies needed by our Conveyor CLI got people into trouble in their Python environment. The Go version ships as a single binary. You can read the installation instructions here.
  • [UI] Show logs of container spawned by ConveyorContainerOperator in the Conveyor UI
  • [GENERAL] Added support for using kubernetes service account for ConveyorContainerOperator, you can use the new resource template for this: resource/aws/container-iam-role-s3
  • [AIRFLOW] Added a Conveyor menu with links to the Docs and Conveyor UI
  • [DOCS] Documented the airflow behaviour with pools and old tasks in the FAQ
  • [DOCS] Documented how to use resources to deploy something other than IAM roles. For security reasons, we only give the agent rights to create IAM roles, but you can specify your own roles if you want more rights.

bugfixes

  • [GENERAL] Fixed a bug when deleting a project with non existing ECR repo
  • [GENERAL] Deploying a project without dags now proceeds as normal without crashing

0.11.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.11.0.tar.gz

features

  • [GENERAL] Parallel deployments: We now support deploying, creating, deleting multiple projects at the same time. At the moment we allow up to 4 in parallel.

bugfixes

  • [CLI] datafy project shell now passes the aws region correctly
  • [CLI] We use temp files to cache tokens, but since datafy forward must be run as root user this sometimes resulted in file permission errors. We now give broader permissions when creating the file.
  • [CLI] printing empty lists in the CLI resulted in errors. We now print a nothing found message.
  • [GENERAL] When deleting a project, if it is the only project in an environment updating the environment failed. This is fixed now
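
The token cache permission fix can be illustrated as follows. This is a sketch under assumptions: the exact permission bits the CLI uses are not stated in these notes, so 0o666 below is only an example, and the function name is hypothetical.

```python
import os
import stat


def write_token_cache(path, token, mode=0o666):
    """Write a token to `path` and set explicit permission bits on the file.

    os.chmod is applied after writing so the result does not depend on the
    umask of the user (for example root) that created the file.
    Returns the permission bits actually set on the file.
    """
    with open(path, "w") as handle:
        handle.write(token)
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)
```

Broader permissions let a cache file created as root be reused by a regular user afterwards, at the cost of making the file readable by other local users.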

0.10.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.10.0.tar.gz

features

  • [UI] Let failed jobs live longer in the Conveyor UI, this way it's easier to follow them up
  • [UI] Added the Conveyor logo to the Conveyor UI
  • [AIRFLOW] Expose the Airflow operators as an installable package. This allows you to use the Conveyor operators when, for example, unit testing your Airflow DAGs
  • [CLI] Open a local shell into a built project container. This allows you to debug this container locally. To try this use: datafy project shell
  • [CLI] You can now see the currently deployed builds on an environment with datafy environment deployment, or the environments on which your project is deployed with datafy project deployments
  • [CLI] We improved the error messages for datafy template apply . They should help guide you towards the correct syntax
  • [CLI] Added project events to the CLI

bugfixes

  • [UI] correct Spark ui to Spark UI
  • [UI] Use t3.nano as bastion host for Conveyor forward instead of t2.nano. This should increase network throughput
  • [CLI] Conveyor configure has been fixed to work again. It tried to check if your CLI was up to date but without a configuration this didn't work.
  • [CLI] We now update the status of a build to Created or Failed depending on the result of the building
  • [AIRFLOW] Conveyor Spark Submit Operator by accident set the option spark.executor.pyspark.memory to be equal to the requested memory. This resulted in a doubling of the memory requested. This flag has been removed in the operator. You can still set it yourself though.
  • [AIRFLOW] Airflow workers sometimes weren't able to get AWS credentials. This was made more robust and should not be a problem anymore.
  • [AGENT] The project delete process has been made more robust. We won't try to delete ECR repositories anymore that do not exist.
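
The memory doubling in the Spark submit operator fix follows from how Spark sizes executors: on Kubernetes, spark.executor.pyspark.memory is added to the executor's resource request. A simplified sketch of the arithmetic, ignoring memory overhead; the function name is illustrative:

```python
def executor_memory_request_gb(executor_memory_gb, pyspark_memory_gb=0.0):
    """Approximate memory requested per executor, ignoring overhead.

    The PySpark memory is added on top of the executor memory rather than
    carved out of it.
    """
    return executor_memory_gb + pyspark_memory_gb


# Old behaviour: pyspark memory was set equal to the requested memory.
old_request = executor_memory_request_gb(4, 4)
# New behaviour: the option is no longer set by the operator.
new_request = executor_memory_request_gb(4)
```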

0.9.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.9.0.tar.gz

features

  • CRUD support of project resources
  • Example project resource template
  • Expose project events in the cli
  • Upgrade to airflow 1.10.9
  • Warn CLI users when to upgrade

bugfixes

  • Support for large logs in the datafy UI

0.7.1

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.7.1.tar.gz

bugfixes

  • Could not find eks worker instances in all cases
  • Project delete by name now works correctly

0.7.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.7.0.tar.gz

features

  • Default arguments for the ConveyorSparkSubmitOperator
  • Updated documentation including templates
  • Added documentation for tenant hosted templates

0.6.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.6.0.tar.gz

features

  • Template for the ConveyorDockerOperator
  • Template for scala spark
  • Documentation for the ConveyorDockerOperator
  • Added support for tenant hosted templates

bugfixes

  • Allow users to access the UI without access to k8s

0.5.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.5.0.tar.gz

features

  • data-plane - validate the existence of the referenced docker image in the ConveyorSparkSubmitOperator and the ConveyorContainerOperator. This should make sure you do not have pending pods because of non-existent images
  • data-plane - cache fetched tokens on the agent for EKS, to avoid rate limiting by AWS
  • data-plane - Increase default wait time out from 2 minutes to 5 minutes in the ConveyorContainerOperator. 2 minutes could be too low when new nodes need to be scheduled, 5 should always be enough.
  • data-plane - shaved another 10 to 15 seconds on environment updates, releases should take 1 min 40ish seconds
  • cli - add possibility to delete project by name
  • cli - added aliases del for delete and ls for list

0.4.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.4.0.tar.gz

bugfixes

  • ui - crash when refreshing
  • ui - forward command times out frequently (beta)
  • ui - can not find bastion

features

  • cli - pyspark project template
  • cli - listing available project templates
  • cli - show version
  • cli - wait flag for deployments
  • cli - support both dtf and datafy
  • cli - consistent naming of delete
  • cli - improve table layout
  • data-plane - new operator ConveyorContainerOperator
  • data-plane - remove aws code build dependency
  • data-plane - faster environment updates
  • data-plane - cleanup old spark jobs