Release notes

1.3.5 (05-07-2022)

features:

  • [Airflow]: Reduce logging of the Airflow scheduler and file processor to EFS.
  • [Airflow]: Change how Airflow DAG fetching works for the scheduler and web instance. This reduces load on EFS by 75%.

bugfixes:

  • [UI]: Fixed an issue where the UI would use the user email with capital letters

1.3.4 (30-06-2022)

bugfixes

  • [Airflow]: Prevent Airflow tasks from being marked as failed when an API 410 exception is thrown
  • [Airflow]: Lower the logging of Airflow to EFS

1.3.3 (30-06-2022)

features:

  • [Spark]: Big Spark jobs (mx.xlarge, mx.2xlarge, mx.4xlarge, and more than 1 executor) will now be scheduled in a single availability zone. When running on spot, we select the availability zone with the lowest chance of a spot interrupt. This reduces network costs and network overhead for Spark.
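
The zone selection described above can be pictured with a minimal sketch. It assumes a hypothetical mapping from availability zone to a placement score, where a higher score means a lower chance of a spot interrupt. The function name, zone names, and scores are illustrative and are not part of Conveyor.

```python
from typing import Dict


def pick_availability_zone(scores: Dict[str, int]) -> str:
    """Return the zone with the highest (hypothetical) placement score.

    Ties are broken alphabetically so the result is deterministic.
    """
    if not scores:
        raise ValueError("at least one availability zone is required")
    # Order by descending score first, then by zone name.
    return min(scores, key=lambda zone: (-scores[zone], zone))


# Illustrative scores only; real scores would come from the cloud provider.
example_scores = {"eu-west-1a": 7, "eu-west-1b": 9, "eu-west-1c": 9}
chosen = pick_availability_zone(example_scores)
```

With the example scores, two zones share the best score and the alphabetical tie-break picks one of them consistently.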

bugfixes:

  • [Spark]: We are investigating an issue where the EFS volume used for the Spark event log is overloaded. We added a global admin option for ourselves to disable the Spark event log upload, which we can enable when we notice issues in an environment.
  • [Airflow]: Backported https://github.com/apache/airflow/pull/24478 to fix an issue with retrying old tasks in the UI
  • [Airflow]: Backported https://github.com/apache/airflow/pull/24117 to fix an issue with retrying old tasks in the UI

1.3.2 (29-06-2022)

features:

  • [docs]: Advocate the use of strict uuid pattern matching when assuming roles
  • [Airflow]: Upgrade to airflow 2.3.2
  • [Airflow]: Allow users to template num_executors in ConveyorSparkSubmitOperatorV2
  • [General]: Allow folders in dags and resources folder
  • [General]: Added warning about v1 operator deprecation into UI
  • [CLI]: Conveyor run now lets you select an environment interactively
  • [CLI]: Conveyor run now lets you select a DAG and a task interactively
  • [CLI]: Conveyor run now automatically uses the last execution date compatible with the DAG schedule if none is provided
  • [Spark]: Added support for Spark streaming on Azure
  • [Spark]: Added a spark local mode to the ConveyorSparkSubmitOperatorV2, see the docs for more info
  • [CLI]: Support passing additional build arguments to the container engine
  • [Spark]: Added a section on improving performance when running Spark on Conveyor
  • [Costs]: Added a global overview per day
  • [Azure]: Initial version of Azure metrics available in UI
  • [Spark]: We now run big (more than 1 executor, and executor instance type mx.xlarge, mx.2xlarge, mx.4xlarge) batch spark applications in a single AZ by default. When using spot we use the aws spot placement score API to determine the best AZ when your spark application is launched. This improves the availability of the spark application, and reduces network costs and overhead.
  • [Spark]: Added spark 3.3.0 images:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-hadoop-3.3.1-v1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.3.0-2.13-hadoop-3.3.1-v1
  • [Azure]: Support enabling microsoft defender for cloud on Azure
  • [Spark]: Released new images with reduced logging when using spark local mode:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v6
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.1-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v7
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v7
  • [Templates]: Use strict uuid pattern matching in the resources. (@stijndehaes)
  • [Templates]: Upgrade the resource folder assume role policies to use the service account. (@stijndehaes)
  • [Templates]: Upgrade spark images to our latest releases. (@stijndehaes)

bugfixes:

  • [UI]: Pressing ENTER when filtering columns was not working
  • [UI]: Add executor info to spark detail page
  • [UI]: Show all states for an environment in the UI, so that users can see what is going on while it is being deleted
  • [UI]: Fixed an issue where inviting a user would result in Airflow UIs not loading until the user logged out and in again
  • [CLI]: Fixed an issue where logger wouldn't respect being set to quiet
  • [CLI]: Fixed an issue with deleting notebooks
  • [General]: Update ebs csi driver so it doesn't go out of memory when many pods are being scheduled. This improves reliability when using the spark option executor_disk_size
  • [General]: Fixed a bug where the metrics would show double the CPU usage on AWS
  • [CLI]: Do not print the m2m token when logging in

1.3.1 (27-06-2022)

bugfix

  • [Azure]: Correct daemonset overhead calculation to include the azure-cni component after switching away from calico

1.3.0 (20-06-2022)

features

1.2.5 (14-06-2022)

bugfixes

  • [Airflow]: Fixed a bug where removing tags from a dag would make it fail to load
  • [UI]: Fixed the link to the git repo in deployments, and on the project view
  • [CLI]: Do not show version out of date warning when using conveyor update command

1.2.4 (13-06-2022)

bugfixes

  • [Notebooks]: Update spark image used in notebooks to version: 3.2.1-hadoop-3.3.1-v6
  • [UI]: Tour won't let you move past step 7

1.2.3 (09-06-2022)

bugfixes

  • [General]: Fixed an issue where the new single availability zone option could result in jobs being slow to launch

1.2.2 (08-06-2022)

features

  • [General]: Added more instance types to our autoscaling groups. This helps to get the most stability out of spot instances on AWS
  • [General]: Made the single availability zone option more robust for on-demand. When the preferred instance type is unavailable it will move on to the next preferred type in a list

bugfixes

  • [General]: Fixed an issue where the new single availability zone option would always use on-demand instances

1.2.1 (08-06-2022)

features

  • [UI]: Add optional tracking for analytics purposes
  • [Airflow]: Resubmit Spark application when spot termination is detected during submission
  • [Spark]: Allow users to select an availability zone for your spark application using ConveyorSparkSubmitOperatorV2

bugfixes

  • [UI]: fix links to docs page
  • [UI]: Fixed an issue on Azure where big logs would fail to load
  • [Spark]: Handle an extra case as spot interruption instead of a regular spark submit failure

1.2.0 (07-06-2022)

features

  • [Spark]: Added support for Spark decommissioning. Decommissioning helps you avoid losing data when an executor gets a spot interrupt: before the interrupt goes through, Spark tries to send all intermediate results to other executors, which saves time and money for the job. The feature is only supported in our latest Spark 3.2 images:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v6
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v6
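
For readers who want to see what decommissioning looks like at the Spark level, here is a minimal sketch. The keys are standard Apache Spark (3.1+) configuration properties. Whether Conveyor sets exactly these values, or sets them for you automatically, is an assumption and is not stated in these notes. The helper `to_submit_args` is hypothetical.

```python
from typing import Dict, List

# Standard Apache Spark settings related to executor decommissioning.
# Assumption: these are the relevant switches; Conveyor may configure them differently.
decommission_conf: Dict[str, str] = {
    "spark.decommission.enabled": "true",
    "spark.storage.decommission.enabled": "true",
    "spark.storage.decommission.shuffleBlocks.enabled": "true",
    "spark.storage.decommission.rddBlocks.enabled": "true",
}


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Turn a configuration mapping into spark-submit style '--conf key=value' arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


submit_args = to_submit_args(decommission_conf)
```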

bugfixes

  • [Spark]: Created a new spark image which fixes a bug in MsalTokenprovider for using spark on Azure

1.1.6 (31-05-2022)

bugfixes

  • [General]: use the api server address as master on Azure instead of internal kubernetes service.
  • [Airflow]: Fix an issue where manual airflow runs were not filtered correctly in the task executions

1.1.5 (31-05-2022)

bugfixes

  • [CLI]: Fix the conveyor update progressbar
  • [General]: Make sure the cluster-autoscaler can handle the taint set by spot termination: aws-node-termination-handler/spot-itn
  • [Spark]: Upload the spark eventlog under the correct name, this makes the spark history server work again
  • [UI]: Fixed a typo on the admin page where it said projects instead of environments

1.1.4 (30-05-2022)

bugfixes

  • [Airflow]: Lower the Airflow usage of EFS by changing min_file_process_interval from 10 to 30 and dag_dir_list_interval from 60 to 300.
  • [General]: Added some improvements to the code that will lower EFS usage
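
As an illustration of the two scheduler settings mentioned above, the sketch below builds the environment variable names that Airflow reads for them, using Airflow's `AIRFLOW__{SECTION}__{KEY}` convention. How Conveyor applies these values internally is not stated here, so treat the helper `airflow_env_var` as a hypothetical convenience.

```python
from typing import Dict


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow uses for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# New values from this release; both options live in the [scheduler] section.
scheduler_settings: Dict[str, int] = {
    "min_file_process_interval": 30,  # seconds, previously 10
    "dag_dir_list_interval": 300,     # seconds, previously 60
}

env: Dict[str, str] = {
    airflow_env_var("scheduler", key): str(value)
    for key, value in scheduler_settings.items()
}
```

Larger intervals mean the scheduler re-parses and re-lists DAG files less often, which is where the reduced file system load comes from.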

1.1.3 (27-05-2022)

bugfixes

  • [UI]: Fix filtering on task executions started_at field
  • [General]: Allow more parallel processing in our operator. This reduces waiting time when a lot of spark/container jobs are launched

1.1.2 (27-05-2022)

bugfixes

  • [General]: We only keep failed container CRDs around for 30 minutes instead of 3 days. They piled up and took too many resources.

1.1.1 (27-05-2022)

features

  • [UI]: Remember previously used email in login screen
  • [General]: Implement cleanup of old project builds for Azure

bugfixes

  • [Airflow]: Catch 502 and 504 errors in External Task Sensor
  • [General]: Fix an issue where project deletion on Azure failed
  • [General]: Fixed an issue where an environment that failed to create could get into an unrecoverable state
  • [Notebooks]: Cleanup unused notebook images from ACR
  • [Spark]: Fix the spark eventlog upload failing

1.1.0 (24-05-2022)

features

  • [CLI]: Change the CLI to prefer environment variables with the CONVEYOR prefix
  • [CLI]: update the upgrade-dags command to also rename imports and classes in dags from Datafy to Conveyor
  • [CLI]: Move the ~/.datafy profiles directory to ~/.conveyor
  • [CLI]: Add warnings when running conveyor build if the dags still use Datafy instead of Conveyor.
  • [General]: Display the node id in the UI as well as in Airflow when the node got spot terminated.
  • [Templates]: The github repository for templates has been renamed to conveyor-templates
  • [Notebooks]: The working directory for notebooks has been renamed from datafy_project to conveyor_project. This might cause loss of data for existing notebooks.
  • [Projects]: Projects now get their configuration from the ./conveyor directory, with fallback to the ./datafy directory

bugfixes

  • [CLI]: Conveyor run now creates the date interval just like a scheduled Airflow run, before it behaved like a manual Airflow run
  • [Spark]: Wait with uploading the event log until the spark application has finished, before there were instances where an upload happened before the spark application was shut down
  • [Airflow]: Allow primitive types as env_vars and convert them to strings
  • [Airflow]: Handle spot interrupts in ConveyorContainerSensor and ConveyorExternalTaskSensor tasks with reschedule mode by rescheduling them on another node
  • [Airflow]: Handle spot interrupts in all Sensor tasks which use mode reschedule by rescheduling them instead of crashing
  • [Airflow]: ConveyorExternalTaskSensor can now also watch for manually scheduled runs
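
The `env_vars` conversion mentioned in the bugfixes above can be sketched as follows. This is a minimal illustration, assuming a plain `str()` conversion. The exact string form Conveyor produces (for booleans, for example) is not stated in these notes, and `stringify_env_vars` is a hypothetical name.

```python
from typing import Dict, Union

Primitive = Union[str, int, float, bool]


def stringify_env_vars(env_vars: Dict[str, Primitive]) -> Dict[str, str]:
    """Convert primitive values to strings, since environment variables are always text."""
    return {name: str(value) for name, value in env_vars.items()}


converted = stringify_env_vars({"RETRIES": 3, "RATIO": 0.5, "DEBUG": True, "NAME": "etl"})
```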

1.0.2 (20-05-2022)

features

  • [Airflow]: Increase parallelism from 64 to 128
  • [UI]: Improve the navigation breadcrumbs and the page icons

1.0.1 (19-05-2022)

bugfixes

  • [CLI]: Fix issue with datafy update renaming the cli binary to conveyor

1.0.0 (19-05-2022)

features

  • [General]: Rename Datafy to Conveyor
  • [CLI]: Cleanup command to delete managed docker images
  • [CLI]: The datafy CLI executable will be automatically renamed to conveyor when using datafy update
  • [UI]: Fix a bug with embedded Airflow view auto-resizing

bugfixes

0.63.3 (10-05-2022)

features

  • [Spark]: Integrate azure libraries in our standard spark image.
  • [Templates]: Use new spark image which supports both azure and aws

bugfixes

  • [Notebooks]: Configure the notebook to work on azure
  • [Notebooks]: Fix notebook configuration to include azure specific properties and jars
  • [Notebooks]: Fixed a bug where the memory of the spark context was not changed according to the instance size
  • [Notebooks]: Fixed a bug where files were not persisted when notebook was created from the UI

0.63.2 (05-05-2022)

bugfixes

  • [UI]: Fix an issue to get current user roles when using SSO

0.63.1 (05-05-2022)

features

  • [Airflow]: Removed the possibility to create Airflow v1 environments. Airflow v1 environments have been deprecated for a long time and will be phased out in 2022. Airflow 1.x no longer receives community support and relies on very old libraries that are becoming less and less secure to use.

bugfixes

  • [UI]: Use username to get user roles in UI

0.63.0 (05-05-2022)

features

  • [CLI]: Pass environment variables to airflow for dag validation or using the run command
  • [UI]: Added the m2m tokens in the Conveyor UI Settings page
  • [General]: Add support for detecting spot termination on Azure
  • [Templates]: Make the templates work for both azure and aws
  • [Templates]: Use newest spark version: 3.2.1 in the templates
  • [Templates]: Update python versions such that they work with Apple Clang 13+

bugfixes

  • [UI]: Fixed an issue with SSO users showing up with weird names in the list, this is only relevant for installations starting from 2022
  • [Airflow]: Fixed an issue with cleaning up old airflow logs
  • [Airflow]: Make airflow workers more robust against connection reset errors while watching kubernetes pods
  • [UI]: Show the azure application client id for notebooks.
  • [General]: Increased timeout on Azure when deleting container repositories.

0.62.7 (27-04-2022)

features

  • [Spark]: Added a new failure mode for spark batch applications, if the application has lost more than 5 times the amount of executors requested the application will fail. The chances of such an application ever finishing are very low, and it would continue to take up resources in the cluster otherwise.
  • [Airflow]: When an airflow executor shuts down unexpectedly we check if this was because of a spot interrupt. If that is the case we put a message in the logs. This should make debugging issues easier
  • [General]: Added a new failure mode for batch applications: the scheduler assigned the pod to a node, but the kubelet did not have enough resources (CPU, memory, pods), which caused the pod to fail.
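
The executor-loss rule for Spark batch applications described above comes down to a single comparison. The sketch below is only an illustration of that rule. The function and parameter names are hypothetical, and how Conveyor counts lost executors is not specified here.

```python
def exceeds_executor_loss_limit(
    lost_executors: int, requested_executors: int, factor: int = 5
) -> bool:
    """Return True when more than `factor` times the requested executors were lost."""
    return lost_executors > factor * requested_executors


# An application that requested 2 executors fails once it has lost more than 10.
at_limit = exceeds_executor_loss_limit(lost_executors=10, requested_executors=2)
over_limit = exceeds_executor_loss_limit(lost_executors=11, requested_executors=2)
```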

bugfixes

  • [UI]: Fixed an issue with SSO users not showing up in the user list, this is only relevant for installations in 2022
  • [General]: Fixed an issue where updates of spark application with more than 600 executors would not be updated in our UI
  • [CLI]: Fixed datafy notebook download being broken
  • [Docs]: Fixed broken link to connect-ide docs

0.62.6 (21-04-2022)

Features

  • [CLI]: Add support for Azure DevOps git when using templates to create a project
  • [CLI]: Made the datafy project stop-run command more useful: it can now handle multiple matches and allows you to stop runs in batch
  • [CLI]: Conveyor run now checks before starting if there are other manual runs with the same properties in the environment. If there are it will ask if you want to clean these up first. This will stop manual runs from piling up.

bugfixes

  • [UI]: Fixed an issue where the cancel run button wouldn't work, but just redirect to the job logs
  • [Airflow]: Fixed an issue where the datafy application runs button would not filter on the environment
  • [Airflow]: The ConveyorExternalTaskSensor would fail if the Airflow instance was unavailable (for example because of a spot interrupt). Now we gracefully retry the sensor on the next poke
  • [General]: Fix a bug where new users wouldn't be able to register when using SSO
  • [CLI]: Fix a bug for datafy notebook commands delete, start, stop, download where filling in only the name or environment flag would not filter properly on these flags
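
The "retry on the next poke" behaviour of the sensor mentioned above follows a common pattern: treat a temporary connection problem as "not ready yet" rather than as a failure. The sketch below shows that pattern in isolation. It is not the actual ConveyorExternalTaskSensor code, and `check_remote_task` is a hypothetical callable that stands in for the call to the other Airflow instance.

```python
from typing import Callable


def poke(check_remote_task: Callable[[], bool]) -> bool:
    """Return True when the watched task is done, False to try again on the next poke."""
    try:
        return check_remote_task()
    except ConnectionError:
        # The remote instance is temporarily unavailable; do not fail the sensor.
        return False
```

Returning False keeps the sensor alive, so the scheduler simply calls it again after the poke interval.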

0.62.5 (15-04-2022)

bugfixes

  • [Airflow]: Make airflow workers more robust against glitches in kubernetes instead of failing immediately
  • [UI]: Return a 404 instead of a 500 when requesting nonexistent logs, so that the UI does not handle it as an error.
  • [Notebook]: Fix errors in our notebook operator when using cross account clusters

Features

  • [Airflow]: Support instance_life_cycle option for dbt tasks
  • [Airflow]: Support instance_life_cycle option for airflow sensors

0.62.4 (13-04-2022)

Features

  • [General]: Upgrade Aws EKS version to 1.22

0.62.3 (08-04-2022)

Features

  • [General]: Upgrade Aws for fluent bit to version: 2.23.3
  • [General]: Upgrade aws load balancer controller in preparation of eks 1.22 upgrade

0.62.2 (07-04-2022)

Features

  • [Airflow]: Upgrade the following dependencies for Airflow 2: Airflow to 2.2.5, apache-airflow-providers-apache-spark to 2.1.3, apache-airflow-providers-cncf-kubernetes to 2.2.0, apache-airflow-providers-slack to 4.2.3, acryl-datahub to 0.8.31.6, boto3 to 1.21.32
  • [General]: When scheduling an application we now don't fail at the first ImagePullBackOff happening in kubernetes, we need three failure events. This makes the operator more robust to temporary network failures.
  • [UI]: On the costs page, add the selected cost range to the URL. This makes it easier to share URLs with other people
  • [UI]: On the streaming application pages, added the selected filter to the URL. This makes it easier to share URLs with other people
  • [CLI]: Added the command datafy project generate-config, which will generate the .datafy/project.yaml file for a project. This is useful when you forgot to check it into git or when you use the terraform provider.
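
The ImagePullBackOff handling described in the features above can be sketched as a simple threshold on failure events. This is an illustration under assumptions: the notes do not say whether the three events must be consecutive or how the operator observes them, and the function name and event list are hypothetical.

```python
from typing import Iterable


def image_pull_failed(event_reasons: Iterable[str], threshold: int = 3) -> bool:
    """Return True once at least `threshold` ImagePullBackOff events were seen."""
    failures = sum(1 for reason in event_reasons if reason == "ImagePullBackOff")
    return failures >= threshold


# A single back-off, for example caused by a short network glitch, is tolerated.
tolerated = image_pull_failed(["Scheduled", "Pulling", "ImagePullBackOff", "Pulling"])
failed = image_pull_failed(["ImagePullBackOff"] * 3)
```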

0.62.1 (04-04-2022)

Features

  • [CLI]: add login fallback when the automatic cli login does not work or is not supported.
  • [UI]: If your login is expired, and you go to your Airflow URL, we now redirect you to your Airflow page again after logging in.

bugfixes

  • [UI]: Go to the correct landing page after logging in from an invitation link
  • [DOCS]: Small cleanup in the pyspark and spark tutorial

0.62.0 (29-03-2022)

Features

  • [General]: Run the on-demand instances autoscaling group as a mixed instance fleet, so that we can handle a certain instance type not being available on AWS
  • [Spark]: Released new spark images that add a new log4j-executor.properties file that reduces logging for spark executors, this results in cloudwatch cost savings. The new images are:
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-hadoop-3.3.1-v3
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.2.1-2.13-hadoop-3.3.1-v3
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.3-hadoop-3.3.1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v5
  • [UI]: Added type email to our login email field, this way browsers and password managers will recognize it better
  • [UI]: Simplify create environment modal when there is only 1 cluster

bugfixes

  • [General]: Fixed an issue when migrating our datafy config file
  • [CLI]: Fix an issue where deleting notebooks fails
  • [UI]: Fix login flow from CLI

0.61.7 (18-03-2022)

bugfixes

  • [UI]: Revert runtime-config

0.61.6 (18-03-2022)

bugfixes

  • [UI]: Correct runtime-config for production

0.61.5 (17-03-2022)

bugfixes

  • [General]: Add cluster endpoints to management API

0.61.4 (16-03-2022)

bugfixes

  • [CLI]: Fix cicd flow when passing environment variables

0.61.3 (16-03-2022)

bugfixes

  • [General]: Fix listing users
  • [CLI]: Fix cicd flow

0.61.2 (16-03-2022)

bugfixes

  • [General]: Make sure the Conveyor team can access the tenants

0.61.1 (16-03-2022)

bugfixes

  • [Docs]: Regenerated docs for template release 0.15.5
  • [UI]: Fix IDP login for dataminded users

0.61.0 (16-03-2022)

features

  • [UI]: Revamped login flow
  • [Templates]: Update the spark settings so the aws glue integration works again
  • [Templates]: Use dots instead of underscores for specifying the conveyor_instance_type. (@nclaeys)

0.60.1 (10-03-2022)

bugfixes

  • [Spark Streaming]: Fix missing applications in the UI.

0.60.0 (08-03-2022)

features

  • [UI]: Add button for administrators to invite new users
  • [Airflow]: Make conveyor_instance_type specification consistent by using dots everywhere
  • [Spark Streaming]: Added an alerting option to spark streaming support, for more info see here
  • [General]: Upgrade aws ebs csi driver to 2.6.3
  • [Airflow]: Upgrade Airflow 2 to 2.2.4

bugfixes

  • [CLI]: Remove debug message when copy image fails due to access denied
  • [UI]: Fix filtering on environment and schedule in task executions page
  • [UI]: Refresh page of streaming application that is in state pending every 10 seconds
  • [CLI]: Refactor the result of datafy project list-users and datafy environment list-users to not include /User/ string

0.59.4 (18-02-2022)

bugfixes

  • [Airflow]: Do not set tcp_keepalive when using airflow v1 as it does not exist.
  • [CLI]: Fix datafy update for Apple Silicon Macs
  • [General]: Change the spark version used by the spark history server to not have issues with verifying the s3 ssl certificates

0.59.3 (16-02-2022)

bugfixes

  • [Notebooks]: Use the correct images when launching a notebook from the UI

0.59.2 (16-02-2022)

features

  • [UI]: Added instance type and lifecycle to task execution details page
  • [UI]: Show deletion protection status in environments page
  • [UI]: Added delete button in environments page
  • [UI]: Added button to create new environments
  • [UI]: Added button to create notebooks
  • [General]: We run our agent on the cluster instead of on ECS
  • [Spark]: Release spark 3.2.1 images
  • [Docs]: Added documentation about spark hive integration issues.
  • [Docs]: Migrated the documentation to Docusaurus, this should allow us to make the documentation more user-friendly
  • [Notebooks]: Do not copy the virtual environment to nfs but only the project related files in order to speed up notebook creation

0.59.1 (03-02-2022)

bugfixes

  • [CLI]: Make sure the docker client used by Conveyor also uses the typical Docker environment variables
  • [CLI]: Allow uploading DAG files of up to 16MB, up from 1MB. The upload now also fails when a larger file is detected instead of printing a warning
  • [UI]: Fix an error that was appearing on the first page load
  • [UI]: Fix a problem in the admin user panel where a project could not be added to a user
  • [UI]: Fix a scrolling issue in the embedded Airflow page
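
The DAG file size check mentioned in the bugfixes above can be sketched as follows. This is a minimal illustration, assuming that 16MB means 16 * 1024 * 1024 bytes. The constant, the function name, and the error type are hypothetical and do not reflect the CLI's actual implementation.

```python
# Assumption: the limit is interpreted as 16 MiB.
MAX_DAG_FILE_BYTES = 16 * 1024 * 1024


def validate_dag_file_size(file_name: str, size_in_bytes: int) -> None:
    """Raise an error for files above the limit instead of only warning about them."""
    if size_in_bytes > MAX_DAG_FILE_BYTES:
        raise ValueError(
            f"{file_name} is {size_in_bytes} bytes, above the {MAX_DAG_FILE_BYTES} byte limit"
        )


# A file of exactly the limit is still accepted.
validate_dag_file_size("my_dag.py", MAX_DAG_FILE_BYTES)
```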

0.59.0 (31-01-2022)

features

  • [Notebooks]: Added support for notebooks persistence. This means notebooks can now be stopped and started using the CLI and the UI.
  • [General]: Cleanup unused aws secrets manager secrets
  • [General]: Improved the pg-bouncer SSL setup to the RDS server by validating the RDS CA, the RDS only accepts encrypted connections now
  • [CLI]: Added the same docker build flags for notebook create that are supported in project build
  • [CLI]: Support for podman as container manager

bugfixes

  • [General]: fix 2 small issues with the configuration of notebook properties
  • [Airflow]: Update the connections used by Airflow 2 to be sourced from the environment. That way we should have fixed the issues with a connection being temporarily unavailable and that leading to a job failure
  • [RBAC]: Fixed an issue where a project admin could not manage users on a project

0.58.1 (10-01-2022)

bugfixes

  • [CLI]: Fixed an issue when building a project would not work
  • [PySpark]: Update pyspark images such that setuptools (>=60.0.0) also installs global python packages in the correct directory for debian.
  • [Template]: Upgrade templates to 0.15.4

0.58.0 (7-01-2022)

features

  • [General]: Release the preview of the costs feature.
  • [General]: Pre cache notebook and project base images to speed up uploading images the first time.
  • [General]: use m6i instances instead of m5 when launching new nodes as they are more cost efficient

bugfixes

  • [UI]: Fixed a bug where a job duration wasn't updated, and it also refused to show the metrics because of that.
  • [General]: Fixed an issue where trying to use a secret from AWS without an IAM role would take 30 minutes to fail.
  • [CLI]: Fixed an issue where starting a new notebook wouldn't work

0.57.1 (6-01-2022)

bugfixes

  • [CLI]: Fixed a bug where starting a notebook could scramble the order in your notebooks.yaml if you had multiple definitions
  • [Spark]: Released new Spark images that fix a vulnerability with log4j 1.x, for more information see our documentation.

0.57.0 (3-01-2022)

features

  • [CLI]: Rename all new CLI commands to create to consistently use verbs for CLI commands (the new commands still work as aliases of the create commands for backwards compatibility)
  • [UI]: Added Git hash and repo link to task execution detail view
  • [UI]: Added "Trigger" column to the task execution page to distinguish tasks triggered by Airflow or via datafy run
  • [UI]: Display the task operator version in the task execution page
  • [UI]: Add a button to cancel a task execution
  • [UI]: Add option to wrap long lines in task logs view

0.56.5 (20-12-2021)

features

  • [General]: add command to stop a project run, when your terminal gets detached.
  • [General]: add gcc and g++ libraries to the notebook base image and extend the notebook documentation to describe how to install pyodbc.

bugfixes

  • [General]: make sure a failed applicationEvent is sent when cancelling a manual project run for both spark and container tasks.
  • [General]: fix crashlooping notebook when mounting secrets from SSM or Secretsmanager.

0.56.4 (13-12-2021)

features

  • [CLI]: The CLI binary is now available for Apple Silicon.
  • [UI]: All page headers are now collapsible

bugfixes

  • [UI]: Fix redirect after login and logout

0.56.3 (9-12-2021)

features

  • [Airflow]: Indicate whether airflow workers are killed due to spot termination (container, spark, container_sensor)

bugfixes

  • [General]: When using V2 Operators we now enable the sts regional endpoints by default. This removes the dependency for using your IAM roles on the us-east-1 region and is recommended by AWS

0.56.2 (7-12-2021)

bugfixes

  • [Streaming]: Fixed an issue where the spark application could not be created if the name of the application was too long
  • [Notebooks]: Use the same service account pattern as projects running from airflow and streaming
  • [Airflow]: ConveyorSparkSubmitOperatorV2 tasks with a . in the name could not be started properly
  • [General]: Performance improvements when processing task executions

0.56.1 (2-12-2021)

features

  • [Notebooks]: Added possibility to work on datafy notebooks from your IDE
  • [Docs]: Updated dbt tutorial using latest template version
  • [Airflow]: test tasks can be turned on/off when using the dbt task factory
  • [General]: Detect spot node interruptions and handle it as a specific failure for a container/spark job

bugfixes

  • [CLI]: Removed some excess logging statements when doing a login

0.56.0 (30-11-2021)

features

  • [CLI]: Automatically add the remote repo url in the project info at project build time when it was empty
  • [General]: Release the beta version of the notebooks feature
  • [General]: Map container start error to a pod failure state such that it can be shown in application runs
  • [UI]: Add route to root from Conveyor logo
  • [Template]: Upgrade templates to 0.15.3

bugfixes

  • [Airflow]: Remove trailing dot from conveyor_ui_domain variable. This ensures that the airflow variables: base_url, jwt_audience are correctly set.
  • [Airflow]: Support passing None values in env variables to conveyor_container_operator_v2
  • [General]: Fix deletion of streaming applications for projects that have an underscore in the name
  • [General]: Improve error message with serviceAccountName when assumeRole fails when fetching secrets
  • [UI]: Correctly show start time and finished time in application runs details when the container is still pending

0.55.2 (22-11-2021)

bugfixes

  • [General]: Fixed deleting of users with the operator role from an environment

0.55.1 (19-11-2021)

Bugfixes

  • [General]: Revert cleanup of project versions in SSM as it fails when there are more than 10 projects in the request

0.55.0 (19-11-2021)

Features

  • [Airflow]: Upgrade Airflow 2 to version 2.2.2
  • [General]: Remove the use of terraform when creating/deleting environments. This makes the creation/deletion of environments faster
  • [CLI]: Remove irrelevant warning about encryption when using airflow validate dags

Bugfixes

  • [General]: Fixed an issue with spark executor metrics not showing up
  • [General]: Fixed a bug with mounting SSM parameters or AWS Secrets Manager secrets as environment variables. If you mounted the same secret twice but with a different path, the application would never start
  • [Airflow]: Open Application Runs Airflow button in new tab again, the behaviour got changed by upgrading to Airflow 2.2.x, but it makes more sense to open in a new tab by default
  • [CLI]: Remove warning about validation being run with Airflow 2
  • [CLI]: When checking if the Dockerfile exists during datafy build, we now take the project config docker path into account
  • [Templates]: Correct resources s3 template to use like instead of equals in trust relationship condition

0.54.12 (15-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.12/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.12/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Fixed an issue with the migration from RDS proxy to pg-bouncer not going smoothly

0.54.11 (14-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.11/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.11/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Temporary rollback in the way we capture events of running applications

0.54.10 (11-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.10/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.10/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes

  • [General]: Fixed an issue processing events of running applications

0.54.9 (10-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.9/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.9/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Finished the migration away from the deprecated EFS provisioner. The new EFS file system we use is encrypted as well.
  • [Airflow]: Make Airflow database migration more robust by giving them longer to finish.
  • [UI]: Show the reason a job fails in the UI. We used to only detect out of memory issues; we have now expanded this with secrets issues, image pull issues, etc.

bugfixes

  • [General]: Fixed an issue with Spark event uploading where Spark sometimes did not clean up the in-progress file properly, resulting in two event log files on the system.
  • [Airflow]: Merged an upstream Airflow patch to fix an issue with get_next_data_interval so that it does not fail when there is no next_run defined yet. This will be fixed in future Airflow releases as well.

0.54.8 (05-11-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.8/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.8/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Enable KMS encryption for our SQS queues
  • [Airflow]: Upgrade to Airflow 2.2.1
  • [Spark]: Added Spark 3.2.0 image with support for Scala 2.13
  • [Template]: Upgrade templates to 0.15.1

bugfixes

  • [Airflow]: Fixed an issue where a spot interrupt could result in a false task success in Airflow for the ConveyorSparkSubmitOperatorV2
  • [General]: Fixed an issue with cleaning up Spark applications where the driver node gets interrupted
  • [General]: When using datafy run and printing big log lines, datafy run would crash. We now split these lines into multiple chunks fixing the issue.
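
The log line splitting mentioned in the last bugfix above can be sketched as a simple chunking function. This is an illustration only: the chunk size the CLI uses is not stated, so `max_length` is a hypothetical parameter, and splitting by characters rather than bytes is an assumption.

```python
from typing import List


def split_log_line(line: str, max_length: int) -> List[str]:
    """Split a long log line into consecutive chunks of at most `max_length` characters."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    chunks = [line[start:start + max_length] for start in range(0, len(line), max_length)]
    # Keep empty lines as a single empty chunk so no log line disappears.
    return chunks or [""]


parts = split_log_line("abcdefg", 3)
```

Joining the chunks back together yields the original line, so no log content is lost by splitting.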

0.54.7 (28-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.7/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.7/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Remove RDS proxy as we switched to using pg-bouncer instead
  • [General]: Migrate the dags volume to the encrypted EFS drive
  • [CLI]: Update used templates to 0.15.0

bugfixes:

  • [General]: Fix slow deletion of environments. We were trying to delete files while they were still in use; we now make sure they are no longer in use before deleting them.

0.54.6 (27-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.6/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.6/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Upgrade node local dns to 1.21.1
  • [General]: Use 1.0.0 of secrets csi driver on the kubernetes cluster
  • [Airflow]: Added the opsgenie provider to airflow 2
  • [General]: Added support for spark 3.2.0
  • [CLI]: Added explanation to datafy run about the default execution-date used to start your job

0.54.5 (26-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.5/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [Airflow]: Migrate EFS logs storage to an encrypted volume
  • [General]: Make datafy run more robust, and add extra logging when something goes wrong

0.54.4 (22-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.4/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Upgrade to aws eks cni 1.9.3
  • [General]: Migrate to new encrypted EFS volume for spark events

0.54.3 (20-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.3/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [Airflow]: Support for environment variables when using the dbt task factory
  • [General]: Upgraded components of the K8s cluster to newer versions that use IMDSv2
  • [General]: Enable deletion protection on the RDS database instance used by Airflow
  • [General]: Enabled deletion protection and drop invalid headers on the ALB used by Conveyor

bugfixes:

  • [General]: Fix bug where Spark applications were never cleaned from Kubernetes when the Spark event log directory was never created
  • [CLI]: Fixed an issue where the Conveyor yaml migration would update old projects to automatically use Airflow 2 validation. This now defaults to using Airflow 1.10 again.

0.54.2 (15-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.2/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes:

  • [General]: Make sure the necessary components for secrets are installed on all nodes
  • [General]: Give a bit more memory to the components managing the metrics and the logs to keep them from running out of memory
  • [General]: Make Airflow 2 validation the default for new projects, bringing it in line with Airflow 2 being the default for a new environment

0.54.1 (14-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.1/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Remove the old (and unused) NLB, we migrated to an ALB in our previous release but kept this around to be able to reverse

bugfixes:

  • [CLI]: Fix datafy run and dag validation not working because of IAM credential issues.

0.54.0 (13-10-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.54.0/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [Airflow]: ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 added support for mounting secrets from SSM and Secrets Manager as environment variables
  • [General]: Removed unneeded access to S3 from Conveyor clusters on other accounts
  • [General]: Increased S3 security by not allowing non-SSL requests on Conveyor artifacts bucket
  • [General]: Encrypt the root EBS volume of Kubernetes worker nodes
  • [CLI]: Update templates to 0.14.0

0.53.3 (28-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.3/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [UI]: Enable streaming UI by default, it was hidden behind a feature flag before
  • [UI]: Enable RBAC UI by default, it was hidden behind a feature flag before

0.53.2 (28-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.2/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes:

  • [UI]: Fix issues with the logs page

0.53.1 (27-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.1/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes:

  • [CLI]: Fix broken table output

0.53.0 (27-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.53.0/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Release spark streaming support
  • [General]: Upgrade to PostgreSQL 13
  • [Airflow]: Upgrade to Airflow 2.1.4
  • [General]: Make Airflow 2 the default version and deprecate Airflow 1 support. This means Airflow 1 will be removed in a future release.
  • [General]: Upgrade the aws k8s cni to version 1.9.1
  • [General]: Adding support for mounting external environment variables

bugfixes:

  • [Airflow]: Handle connection issues with kubernetes gracefully for V2 operators.

0.52.10 (16-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.10/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.10/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes:

  • [Airflow]: Fixed an issue where a spark submit could be scheduled a second time making the job fail in Airflow. This only happened very sporadically.

0.52.9 (16-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.9/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.9/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [RBAC]: The Airflow and Spark UIs of a given environment are now only accessible for users who have the Operator/Contributor/Administrator role for this environment.

bugfixes:

  • [General]: Several small bug fixes in the code used by the Airflow v2 operator.
  • [Airflow]: Fixing the dbt factory with dependencies on sources items.

0.52.8 (09-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.8/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.8/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [RBAC]: Added the operator role to an environment. This role allows users to view and operate the airflow of an environment, but does not allow them to deploy new releases to an environment.

bugfixes:

  • [General]: Several small bug/performance fixes in the code used by the Airflow v2 operator.

0.52.7 (07-09-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.7/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.7/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes:

  • [General]: Fixed a bug where Airflow thought a job had failed but the datafy UI showed the application had finished correctly. While looking for this bug we enhanced our code to make it easier to figure out the root cause next time.

0.52.6 (20-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.6/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.6/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [Airflow]: Slack providers available in Airflow
  • [Airflow]: Tag filtering on dbt task factory
  • [CLI]: Selecting the airflow version for validation and run

0.52.5 (18-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.5/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [General]: Upgrade the eks cni used to 1.9.0

bugfixes:

  • [Airflow]: Make sure the datafy runs button goes to the correct page
  • [Airflow]: Airflow 2.1 introduced a new Kubernetes setting, worker_pods_pending_timeout, which by default kills pods that have been pending for more than 300s. This sometimes resulted in jobs being killed. We now set it to 600s by default and will look into making scheduling faster.
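
For reference, Airflow reads configuration overrides from environment variables named AIRFLOW__{SECTION}__{KEY}. The sketch below builds that name for the setting mentioned above. It assumes the option lives in the kubernetes section, as it did in Airflow 2.1; how Conveyor actually applies the value is not described in these notes, and the helper name airflow_env_var is hypothetical.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Assumption: the option sits in the [kubernetes] section (Airflow 2.1 layout).
name = airflow_env_var("kubernetes", "worker_pods_pending_timeout")
print(f"{name}=600")  # prints AIRFLOW__KUBERNETES__WORKER_PODS_PENDING_TIMEOUT=600
```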

0.52.4 (17-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.4/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [CLI]: Asking for the task when no task is passed to the run command
  • [General]: Preparations for future releases

0.52.3 (13-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.3/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes:

  • [General]: Fixed an issue where our operator managing Airflow, Spark, and container runs was crashing because of a nil pointer. We fixed the issue and made sure a single nil pointer can no longer crash it.

0.52.2 (12-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.2/conveyor_darwin_amd64.tar.gz

Release notes

bugfixes:

  • [CLI]: Fixed a bug that broke the CLI

0.52.1 (12-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.1/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [CLI]: When CLI is out of date recommend using datafy update to update the CLI
  • [CLI]: Allow configuration files smaller than 1MB to be uploaded as part of the DAG folder

bugfixes:

  • [CLI]: Added retry logic when fetching logs with datafy run fails.
  • [Airflow]: Fixed an issue where the Airflow 1 graph view would crash when using the ConveyorSparkSubmitOperator. Old task executions might still have issues, but new ones will work again.
  • [RBAC]: Fixed non-thread safe code in RBAC checking that resulted in a 500.
  • [CLI]: Update templates to 0.12.1

0.52.0 (02-08-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.52.0/conveyor_darwin_amd64.tar.gz

Release notes

features:

  • [Airflow]: Added on-demand options for the Airflow Conveyor V2 Operators. Look at the docs of ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 to find out more.
  • [General]: We installed node local dns into the cluster, this should improve dns responsiveness and reduce errors. We also improved the robustness of the DNS setup
  • [Airflow]: The Conveyor application runs button on operators now goes to the project's page instead of the environment's.
  • [Airflow]: Added s3 committer option for the Airflow Conveyor V2 Spark Operator, allowing the usage of the S3 magic committer. Look at the docs of ConveyorSparkSubmitOperatorV2 to find out more.
  • [CLI]: Added datafy update command, this will replace your current executable with the newest version available.
  • [CLI]: Fixed homebrew installation of zsh autocomplete, and added fish autocompletion via homebrew
  • [Doc]: Added documentation on how to install datafy completion scripts on linux.

0.51.0 (27-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.51.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.51.0/conveyor_darwin_amd64.tar.gz

features:

  • [Airflow]: Added ConveyorExternalTaskSensor
  • [Airflow]: Upgrade to airflow 2.1.2
  • [General]: Memory optimisations on the tools running on the cluster
  • [General]: Upgrade the eks cni to the latest 1.8 version as recommended by aws, and corrected its settings
  • [General]: Upgrade eks cluster to version 1.21
  • [General]: Use containerd as a container runtime instead of docker
  • [Spark]: Released the following images with hadoop cloud support, and a python 3.8 installation with lzma support.
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1-python-3.8-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1-python-3.8-v2
  • [Airflow]: Added acryl-datahub python package to the airflow images

bugfixes:

  • [Spark History Server]: We give it a gigabyte more memory so it can keep up with a lot of spark jobs being scheduled
  • [General]: Fixed an issue when calculating memory for the datafy instance types

0.50.2 (13-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.2/conveyor_darwin_amd64.tar.gz

features:

  • [Airflow]: Upgrade airflow 2 version to 2.1.1
  • [UI]: Added environment and project links to the task executions page

bugfixes:

  • [General]: Fixed a bug where spark applications sometimes weren't cleaned up properly
  • [CLI]: The token was automatically refreshed 10 minutes after expiry, instead of 10 minutes before
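
The token refresh fix comes down to the sign of the margin in the comparison. The following is a minimal sketch of the corrected check, not the CLI's actual code; the names should_refresh and REFRESH_MARGIN are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constant: refresh this long before the token expires.
REFRESH_MARGIN = timedelta(minutes=10)


def should_refresh(expires_at: datetime, now: datetime,
                   margin: timedelta = REFRESH_MARGIN) -> bool:
    """Return True once `now` is within `margin` of the expiry time, or past it.

    The buggy variant would compare against `expires_at + margin`, which only
    triggers after the token has already expired.
    """
    return now >= expires_at - margin


expiry = datetime(2021, 7, 13, 12, 0, tzinfo=timezone.utc)
print(should_refresh(expiry, expiry - timedelta(minutes=11)))  # prints False
print(should_refresh(expiry, expiry - timedelta(minutes=9)))   # prints True
```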

0.50.1 (07-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.1/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [CLI]: Print the airflow version in environment list correctly
  • [CLI]: Print logs correctly for the datafy run command

0.50.0 (07-07-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.50.0/conveyor_darwin_amd64.tar.gz

features:

  • [UI]: Added task executions to the project page
  • [UI]: Task executions page: show the task type as an icon
  • [UI]: Allow task executions links to be opened in new tab
  • [General]: Enable image scanning on push on project ECR images
  • [General]: Upgrade eks to version 1.20
  • [General]: Upgraded to terraform 1.0.1
  • [Airflow]: The ConveyorContainerSensor has been released
  • [Spark]: released the following images with improved logging output, and support for hadoop 3.3.1 and spark 3.0.3. For migration to hadoop 3.3.1 see here.
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.2-hadoop-3.3.0-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.0-v2
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.0.3-hadoop-3.3.1
    • public.ecr.aws/dataminded/spark-k8s-glue:v3.1.2-hadoop-3.3.1
  • [Airflow]: ConveyorSparkSubmitOperatorV2 now sets spark.hadoop.fs.s3a.aws.credentials.provider to com.amazonaws.auth.DefaultAWSCredentialsProviderChain by default
  • [CLI]: Cleaned up the output of all commands
  • [Template]: Update template to 0.11.0
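
As an illustration of the credentials provider default in the feature list above, the sketch below shows one common way to layer user-supplied Spark settings over defaults. It assumes that a value passed by the user takes precedence over the default; the notes do not confirm this, and the names DEFAULT_SPARK_CONF and with_defaults are hypothetical rather than part of the operator.

```python
from typing import Dict, Optional

# The default named in the release note above.
DEFAULT_SPARK_CONF: Dict[str, str] = {
    "spark.hadoop.fs.s3a.aws.credentials.provider":
        "com.amazonaws.auth.DefaultAWSCredentialsProviderChain",
}


def with_defaults(user_conf: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge user settings over the defaults; user-provided keys win (assumption)."""
    return {**DEFAULT_SPARK_CONF, **(user_conf or {})}


conf = with_defaults({"spark.executor.memory": "4g"})
print(conf["spark.hadoop.fs.s3a.aws.credentials.provider"])
```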

bugfixes:

  • [Spark History Server]: Improvements to make the spark history server more stable
  • [CLI]: Fixed text sometimes printing incorrectly when using a spinner, resulting in words like imagege and resourceses.

0.49.3 (22-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.3/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [General]: Fixed a small issue in Airflow authentication

0.49.2 (21-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.2/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [General]: Fixed an issue with cleaning up old builds

0.49.1 (18-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.1/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [UI]: Fixed issues with rendering Airflow inside of our UI
  • [General]: Fixed an issue with cleaning up old builds

0.49.0 (17-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.49.0/conveyor_darwin_amd64.tar.gz

features:

  • [UI]: Embedded airflow into the Conveyor UI. This means on the environment page you can look at Airflow without leaving the Conveyor UI. You can still open the Airflow UI full screen if you want to.

bugfixes:

  • [UI]: Fixed issues with pagination in the UI

0.48.1 (16-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.1/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [RBAC]: datafy project run resulted in unauthorized with RBAC enabled.
  • [UI]: Users panel project/environment role selection is empty after assigning to a user

0.48.0 (11-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.48.0/conveyor_darwin_amd64.tar.gz

features:

  • [CLI]: Adding often used project commands to the root command.
  • [General]: Improved the cleaning up of old builds, this should reduce the bill on your cloud account.
  • [Airflow]: Upgraded to Airflow 2.1.0. A known issue related to the auto-refresh on the dag tree view exists, and will be fixed in a next release of Airflow: https://github.com/apache/airflow/pull/16018/files
  • [General]: Upgraded the version of Terraform used by the Conveyor agent to 1.0.0.

bugfixes:

  • [Airflow]: Fix CSRF issues after Airflow Web restart.
  • [Airflow]: ConveyorSparkSubmitOperatorV2 no longer results in an error when inputting an integer as value in the Spark config.
  • [CLI]: DAG validation no longer warns about Conveyor plugins in Airflow.
  • [Airflow]: An edge case is fixed in the V2 Operators where failed applications were not properly detected.
  • [CLI]: The description in datafy environment new and datafy environment update no longer says to enable experimental mode for Airflow 2 (as this is not the case anymore).
  • [General]: Fixed an issue where deleting a project could result in an update to an Airflow 2 environment failing.
  • [General]: Allow CI/CD token access to the Airflow 2 API.

0.47.2 (03-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.2/conveyor_darwin_amd64.tar.gz

features:

  • [General]: The Airflow V2 operators and datafy project run now detect applications being evicted because of disk pressure and will warn you when this happens.
  • [CLI]: Added execution date support to datafy project run.

bugfixes:

  • [UI]: The logs UI was fetching finished application logs in the wrong order, this is now fixed.

0.47.1 (02-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.1/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [Spark]: Fix Spark history server upload when spark application names are very long.

0.47.0 (01-06-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.47.0/conveyor_darwin_amd64.tar.gz

features:

  • [UI]: The UI received some new paint. We migrated to a new framework which will allow us to make a UI that is more uniform and easier to maintain.
  • [CLI]: Restructured the output of datafy project run to make it more focussed
  • [General]: Using datafy Airflow operator V2 or datafy project run will now warn you if you use a container image that can't be pulled.

bugfixes:

  • [CLI]: Make datafy project run robust against print statements and logs in dags.
  • [Spark]: Fixed a bug where setting spark.executor.cores and spark.driver.cores resulted in an unauthorized error with the ConveyorSparkSubmitOperatorV2.

0.46.2 (27-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.2/conveyor_darwin_amd64.tar.gz

features:

  • [CLI]: Turn on quiet flag with env variable QUIET=true.
  • [Airflow]: Changed our recommendation warning in the ConveyorContainerOperator when setting cpu/memory limits. In short you should not set these yourself.
  • [General]: Make spark.driver.cores and spark.executor.cores user editable in the ConveyorSparkSubmitOperatorV2. This can be useful for IO/CPU bound jobs.

bugfixes:

  • [Airflow]: Fixed the ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 not sending application run events.

0.46.1 (26-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.1/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [General]: Bugfix where terraform wouldn't be destroyed when using datafy project undeploy
  • [Spark]: Fixed the executors page in the spark history server.
  • [UI]: Spark UI button was not working.

0.46.0 (26-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.46.0/conveyor_darwin_amd64.tar.gz

features:

  • [Spark]: Added spark history server support when using the ConveyorSparkSubmitOperatorV2.
  • [CLI]: Clarify datafy project run documentation. This does not support deploying resources, this is made clear in the docs.

bugfixes:

  • [General]: Long project, dag or task names could result in jobs not being scheduled. This is now fixed.
  • [General]: When pending applications were canceled they would stay in pending forever. They are now set to failed.
  • [Airflow]: Fixed an issue when running an extra datafy cluster in another region/account.

0.45.3 (20-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.3/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [Airflow]: Fixed some caching issues with forwarding the Airflow UI

0.45.2 (20-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.2/conveyor_darwin_amd64.tar.gz

features:

  • [Airflow]: Added ui colors to v2 operators
  • [Airflow]: ConveyorSparkSubmitOperatorV2 add support for env_vars
  • [Airflow]: ConveyorContainerOperatorV2 added support for legacy kube2iam way of assuming roles.

bugfixes:

  • [Airflow]: ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2 would fail when the project name contained an underscore. The service account name, which contains the project name, now replaces underscores (_) with dots (.).

0.45.1 (19-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.1/conveyor_darwin_amd64.tar.gz

features:

  • [CLI]: datafy project upgrade-dags also replaces the import for ExternalTaskSensor

bugfixes:

  • [Airflow]: Fix issues with long airflow task names and V2 operators
  • [Airflow]: Fix issues with characters in task names that are not allowed on kubernetes for V2 operators
  • [UI]: Empty log pages will be skipped in cloudwatch to return the first non empty page that can be found
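
The two task name fixes above relate to Kubernetes naming rules: a DNS-1123 label may contain only lowercase alphanumerics and '-', must start and end with an alphanumeric, and is limited to 63 characters. The sketch below shows one possible way to map an arbitrary task name onto such a label. It is not the operators' actual logic; the helper to_k8s_name, the "task" fallback, and the hash suffix are assumptions for illustration.

```python
import hashlib
import re

MAX_LABEL_LEN = 63  # DNS-1123 label length limit


def to_k8s_name(task_name: str) -> str:
    """Map an arbitrary task name to a valid DNS-1123 label (hypothetical helper)."""
    # Replace every run of disallowed characters with a single dash.
    name = re.sub(r"[^a-z0-9-]+", "-", task_name.lower()).strip("-")
    if not name:
        name = "task"
    if len(name) > MAX_LABEL_LEN:
        # Keep a short hash of the original so different long names stay distinct.
        digest = hashlib.sha1(task_name.encode("utf-8")).hexdigest()[:8]
        name = name[:MAX_LABEL_LEN - len(digest) - 1].rstrip("-") + "-" + digest
    return name


print(to_k8s_name("Load_Customers.daily"))  # prints load-customers-daily
```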

0.45.0 (18-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.45.0/conveyor_darwin_amd64.tar.gz

The main feature of this release is a new version of our Airflow operators with datafy project run support. This allows you to locally start a job on the remote cluster without having to build, deploy and clear the task in Airflow.

features:

  • [Airflow]: Airflow 2.0 has been upgraded to 2.0.2
  • [Airflow]: Release ConveyorSparkSubmitOperatorV2 and ConveyorContainerOperatorV2
  • [CLI]: datafy project run for airflow tasks that use the new V2 Operators
  • [CLI]: Dag validation also runs the NoAdditionalArgsInOperatorsRule from airflow, as undefined args give an error in Airflow 2.0.
  • [Documentation]: Updated documentation structure
  • [Template]: Set default templates version to 0.9.0.

bugfixes:

  • [CLI]: Fixing support for resource templates
  • [General]: When deleting a project environments weren't properly updated.
  • [General]: Make our kubernetes setup more tolerant to spot interruption failures.

0.44.5 (03-05-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.5/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.5/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [Airflow]: Extending the job_heartbeat_sec from 5s to 10s and configuring a db connection timeout of 30 seconds to avoid jobs failing when missing a heartbeat.

0.44.4 (29-04-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.4/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.4/conveyor_darwin_amd64.tar.gz

features:

  • [Agent]: Upgrade our agent to use terraform 0.14.11

0.44.3 (28-04-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.3/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.3/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [Internal]: Reduce CPU load on airflow manager

0.44.2 (26-04-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.2/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [UI]: Some CSS changes made the logs font too big, some buttons overflow etc. We fixed these changes.

0.44.1 (26-04-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.1/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [General]: The Airflow RDS proxy is not available in all regions, so we disable its use in regions that don't have it available

0.44.0 (26-04-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.44.0/conveyor_darwin_amd64.tar.gz

features:

  • [Airflow]: Airflow 2.0 has been made generally available.
  • [Airflow]: We use a new way of deploying to Airflow, further speeding up deployments.
  • [Airflow]: The ConveyorContainerOperator does not require you to fill in the name parameter anymore.
  • [Airflow]: For Airflow 2.0, enable access to the API using a datafy token. You can get the token with datafy auth get; see the docs for more info.
  • [Airflow]: Set default parallelism to 64 up from 32, and default dag concurrency to 32 up from 16.
  • [CLI]: Remove experimental flag from cluster commands as this is GA.
  • [Template]: Set default templates version to 0.8.1.

bugfixes:

  • [General]: Do not allow to deploy, undeploy, delete an environment that is being deleted.
  • [UI]: Show a friendly message when logs aren't available yet instead of a generic error
  • [Airflow]: Fixed a bug that stopped dags from being synced.
  • [Airflow]: Added a proxy to the Airflow database so we can handle more connections to the database than before.

0.43.1 (12-04-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.1/conveyor_darwin_amd64.tar.gz

bugfixes:

  • [General]: Update agent terraform to 0.14.10, and update the terraform locked providers

0.43.0 (12-04-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.43.0/conveyor_darwin_amd64.tar.gz

features:

  • [DOCS]: Added Airflow 2.0 migration path to the docs.
  • [Airflow]: Support for Airflow 2 has been made available for environments with the experimental flag enabled.
  • [General]: Experimental environments can enjoy an even faster upgrade experience.
  • [General]: The experimental flag on an environment cannot be disabled anymore. This way we only have to make sure migrations work in one direction.
  • [CLI]: The command datafy project upgrade-dags will change your dags according to our airflow 2.0 upgrade documentation. Your dags will still work on Airflow 1.
  • [UI]: Revamped the logging UI. It will now show the latest logs when a job is finished. You can also choose to see the latest logs when the job is running.
  • [Templates]: Upgraded to 0.8.0: https://github.com/datamindedbe/datafy-templates/blob/master/CHANGELOG.md#080---2020-04-12

0.42.0 (7-4-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.42.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.42.0/conveyor_darwin_amd64.tar.gz

features:

  • [General]: The experimental dag syncer has now become the default making deploys at least twice as fast!
  • [General]: The terraform version used by our agent has been upgraded to 0.14, to enable this transition we added an automatic upgrade process from 0.12 to 0.13 to 0.14 to the agent.
  • [General]: Added experimental flag to environment to enable experimental features. This flag will be used in the future to test new features. At the moment behind this flag we test a new way of deploying airflow.
  • [Airflow]: Upgraded airflow to version 1.10.15
  • [Airflow]: Added Conveyor macros validation to Conveyor dag validation. Macros should be changed from macros.env to macros.datafy.env
  • [Docs]: We added docs about airflow alerting using the ConveyorContainerOperator
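
The macro rename in the [Airflow] validation item above can be applied to existing DAG files with a small script. This is a minimal sketch and not part of the CLI: the helper name migrate_macros and the sample DAG line are hypothetical, and it assumes the old macro appears literally as macros.env in your DAG source.

```python
import re

# Matches the old macro path. The word boundaries avoid touching longer
# names such as "macros.environment".
OLD_MACRO = re.compile(r"\bmacros\.env\b")


def migrate_macros(dag_source: str) -> str:
    """Rewrite macros.env references to macros.datafy.env in DAG source text."""
    return OLD_MACRO.sub("macros.datafy.env", dag_source)


# Hypothetical DAG line, used only to illustrate the rewrite.
example = 'bucket = "s3://data-{{ macros.env() }}/raw"'
print(migrate_macros(example))
```

Running the function twice gives the same result, because the new path macros.datafy.env no longer contains the old pattern.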

bugfixes:

  • [Airflow]: Fixed a bug when using dag syncer where dag files were unavailable for a short time.

0.41.0 (24-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.41.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.41.0/conveyor_darwin_amd64.tar.gz

features:

  • [General]: Eks 1.19 upgrade, if you have the following in your DAG somewhere you can safely remove it:
security_context={
"fsGroup": 185,
}
  • [General]: Updated the documentation url to https://docs.datafy.cloud
  • [Airflow]: Airflow 2.0 warnings during dag validation
  • [Airflow]: Made the Airflow UI more stable by running the http proxy component on fargate.
  • [General]: Send less information to cloudwatch to reduce costs.

bugfixes:

  • [CLI]: Ignore the possible __pycache__ folder in the dags folder when building a project
  • [UI]: Fixed issue where we failed in parsing the logs from cloudwatch to be shown in the UI

0.40.2 (19-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Fixed an issue where certain logs couldn't be shown in the UI

0.40.1 (16-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Fixed an issue where the newer and older logs buttons were not working anymore
  • [Airflow]: Fixed an issue with experimental dag sync

0.40.0 (12-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.40.0/conveyor_darwin_amd64.tar.gz
caution

If you are using resources with datafy, we now add a default provider "aws" when deploying your code. If you have a provider "aws" without an alias in your code this can break things; either rename your provider to use an alias or remove it.

features

  • [CLI]: Added build arg support to the CLI
  • [UI]: The application runs page now has a spark UI button, and the logs page is the default instead of the metrics page
  • [UI]: Application runs are now clickable on the environment page
  • [Airflow]: Added an experimental dag sync way of deploying dags to airflow. This should make deploys considerably faster (in the range of 20 - 40s for a deploy). You can test this out in your development environment by updating the environment with the flag --experimental-dag-sync=true. To update your environment, run: datafy environment update --name YOURENVIRONMENT --experimental-dag-sync=true. However, it is not recommended for production.
  • [Spark]: spark 3.1.1 images release: public.ecr.aws/dataminded/spark-k8s-glue:v3.1.1-hadoop-3.3.0, datamindedbe/spark-k8s-glue:v3.1.1-hadoop-3.3.0

bugfixes

  • [CLI]: When creation of a project fails we now clean up the project that was generated if you used a template
  • [UI]: When pressing refresh logs button on the main logs page the URL path would get into an undefined state, you couldn't share this URL. This is now fixed.

0.39.1 (01-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Airflow]: Fixed an issue rolling out or creating a new airflow environment

0.39.0 (01-3-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.39.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: Added a refresh button to the application run log view
  • [General]: The image field does not need to be specified to use Conveyor Airflow operators anymore, it uses the name of the project by default.
  • [General]: There is no restriction anymore on the IAM role names that the Conveyor operators can use. There used to be a constraint where the role needed to have the prefix datafy-dp-{env} but not anymore.
  • [General]: Conveyor instance types now support all spark memory options
  • [General]: We updated spark to version 3.0.2. New docker image available, see here.
  • [Templates]: Updated the templates to use the spark 3.0.2 image.
  • [Documentation]: Updated the CI/CD authentication documentation, see here
  • [Documentation]: Added a documentation page about setting up Conveyor using WSL2 on Windows, see here

0.38.0 (15-2-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.38.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.38.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Allow the terraform resources to attach existing AWS policies
  • [General]: Support datafy instance types - see here and here
  • [General]: Optimise IP usage by the Kubernetes cluster
  • [Templates]: Update templates to the latest version - See here for release notes

0.37.0 (29-1-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.37.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.37.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: Added logs to the application runs UI. From now on you can view the logs of your application directly in the Conveyor UI instead of being redirected to AWS cloudwatch.
  • [CLI]: Added flag --no-browser to the cli. This prevents the Conveyor cli from opening a browser automatically and instead prints the url for logging in.

bugfixes

  • [CLI]: Fixed wrong upload message when deploying a project.
  • [CLI]: Print an error when files needed can't be read during project build instead of silently crashing
  • [General]: Fixed one last small instance where we used a docker hub image instead of public ECR
  • [General]: If using project resources and if you still had a state.tf file things would break because of a new terraform kubernetes provider release. We now fixed this.

0.36.0 (15-1-2021)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.36.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.36.0/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: We are migrating certain pieces to be Airflow 2.0 compatible. This should have no impact on your environment:
    • Migrated our spark submit operator to use the backported spark submit operator from Airflow 2.0.
    • Migrated the kubernetes executor config to be Airflow 2.0 compatible
  • [CLI]: Upgraded to datafy-template 0.4.0 which uses the new import paths for operators used in Airflow 2.0
  • [General]: Migrated to use public ECR where possible

bugfixes

  • [UI]: In the metrics page certain charts were not properly unloaded, resulting in a memory leak and a slow page
  • [UI]: Fixed a bug where the metrics page could crash when no spark executors were started up.
  • [UI]: Fixed a small bug where we were showing a wrong message in the metrics page when your application was not running for long enough
  • [General]: removed m5zn type aws instances from autoscaling groups as they gave issues with kubernetes.
  • [CLI]: Skip airflow dag validation in datafy project build when the project does not contain dags
  • [CLI]: Fixed the wrong error message when trying to apply a non existing template in datafy template apply

0.35.1 (30-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.1/conveyor_darwin_amd64.tar.gz

features

  • [General]: Migrated to use the new GP3 volumes instead of GP2. These are 20% cheaper, provide more IOPS and the same throughput. This will make IO intensive jobs on the platform faster.
  • [General]: The kubernetes autoscaler will now downscale nodes after 5 minutes instead of 10m.
  • [General]: Made our spark images available on ECR public: https://gallery.ecr.aws/dataminded/spark-k8s-glue
  • [CLI]: Updated to the latest datafy templates release

bugfixes

  • [UI]: Changed the airflow logo to the new flat version
  • [UI]: Fixed the Conveyor logo to work on all platforms (unix, mac and windows)
  • [UI]: Spark jobs with a lot of executors (40+) could not show all of their executors in the legend of the metrics page. This has been fixed.
  • [UI]: Metrics page, failure reason did not show properly when hovering over the information icon.
  • [General]: Better cleanup of resources in the kubernetes cluster after building airflow images
  • [General]: Installation of the Conveyor cluster would fail in an existing VPC network. This has been fixed
  • [General]: Migrated the installation of the kube2iam Helm chart to the new version

0.35.0 (22-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.35.0/conveyor_darwin_amd64.tar.gz

features

  • [Airflow]: Upgraded airflow to 1.10.14. This should fix some scheduling issues users experienced with depends_on_past or task_concurrency. For more information see the airflow release notes.
  • [General]: Made environment rollout 10 to 20s faster by caching the terraform providers used when rolling out an environment.

bugfix

  • [CLI]: Dag validation was failing when airflow variables were used in the dags. This has been fixed in this release.

0.34.1 (09-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.1/conveyor_darwin_amd64.tar.gz

features

  • [General]: Added an experimental feature that allows you to restrict which aws roles a project can use. This feature is hidden behind an experimental flag and can be enabled in the CLI by setting the environment variable CONVEYOR_EXPERIMENTAL=1. More documentation on how to use this feature will follow later.
  • [General]: We added the new m5zn instance in our spot pools for kubernetes. That way we have even more instance types to choose from, which should result in fewer spot interrupts.
  • [Templates]: Update to templates version 0.3.2: https://github.com/datamindedbe/datafy-templates/blob/master/CHANGELOG.md#032---2020-12-09

bugfix

  • [General]: Because of the rewrite of our control plane we had a regression. When you delete a project we normally trigger it to be removed from all environments. This behaviour has been reinstated.
  • [CLI]: Airflow dag validation failed if you imported for example a utils file from your dags folder into your DAGs. We now do validation correctly so that this isn't mistakenly flagged as an issue.
  • [CLI]: Airflow dag validation failed when you were using it outside the aws eu-west-1 region. This has been fixed.
  • [UI]: When you opened an application run on a spark application without executors, the UI would still try to fetch the metrics for these executors, resulting in a crash.

0.34.0 (04-12-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.34.0/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Added dag validation to the CLI. When you run datafy project build, a dag validation phase is done first. This just checks if your DAG can run on airflow. You can skip this phase if you want to. You can also use datafy project validate-dags to validate the dags of your project without doing a build.
  • [Airflow]: Upgraded Airflow to 1.10.13
  • [Doc]: Added documentation on how to run a container as non-root with the ConveyorContainerOperator.
  • [General]: Upgraded the k8s environment to EKS 1.18
  • [General]: We now run the k8s autoscaler as a high priority pod, so that it always gets priority. This makes sure the cluster can always autoscale when needed.
  • [General]: We enabled the free autoscaling metrics for our autoscaling groups. That way you can see when the limit is reached and we can more easily recommend when to scale it up.

bugfix

  • [CLI]: The Conveyor CLI returned exit code 0 when a build failed. This has now been fixed to return an exit code 1. This makes it easier to chain multiple commands or to find out in CI/CD that a build has failed.
  • [Airflow]: When you manually trigger a job and then use the Conveyor application runs button you would find nothing. This has now been fixed.
  • [General]: Updating a project description failed when there were more than 256 characters used. We updated this field to take a bigger amount of text.
  • [General]: Fixed the Conveyor logo in the UI
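
The exit code fix in the [CLI] item above matters when later steps depend on the build. As a minimal sketch of that chaining in a CI/CD script: the helper names run_step and run_pipeline are hypothetical, and a small Python subprocess stands in for a failing datafy project build so the example runs without the CLI installed.

```python
import subprocess
import sys
from typing import List


def run_step(command: List[str]) -> bool:
    """Run one pipeline step and report success based on its exit code."""
    return subprocess.run(command).returncode == 0


def run_pipeline(steps: List[List[str]]) -> int:
    """Run steps in order and stop at the first failure, like `a && b` in a shell."""
    for step in steps:
        if not run_step(step):
            return 1
    return 0


# Stand-in for a failed build: a process that exits with code 1.
failing_build = [sys.executable, "-c", "import sys; sys.exit(1)"]
# Stand-in for a follow-up step such as a deploy.
next_step = [sys.executable, "-c", "print('deploying')"]

print(run_pipeline([failing_build, next_step]))  # prints 1; the second step never runs
```

With a CLI that returns 0 on failure, the follow-up step would have run against a broken build.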

0.33.1 (20-11-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.1/conveyor_darwin_amd64.tar.gz

bugfix

  • [UI]: Fixed a bug where we showed wrong finished date in application run detail page
  • [UI]: We showed a NaN duration when the application was not yet finished in the application run detail page. Now we show the current duration
  • [UI]: We showed a message that no metrics were available yet while they were visible. This is now fixed
  • [General]: By accident we used an image from docker hub that ran into rate limiting. We now also copied that one to ECR.

0.33.0 (20-11-2020)

CLI

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.33.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: We now offer CPU/Memory metrics of your jobs running on Conveyor. If you go to the application runs page, you can open the metrics for every run.
  • [CLI]: We now show a warning when you are trying to deploy new dags or resources for your project, but first forgot to do a build.
  • [CLI]: When an undeploy or promote fails we now show you the last event on that environment, similar to how deploys work.
  • [General]: We have updated our API to a new technology to be sure we can continue growing. This means that old versions of the CLI (< 0.30.0) might not fully work any longer, so please upgrade to the latest CLI version

0.32.0 (16-11-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.32.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.32.0/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Conveyor project build now uses the same credentials as the docker CLI, allowing you to pull images from private registries

  • [CLI]: The datafy project undeploy and datafy project promote command now also show the latest event when it fails like the datafy project deploy command.

  • [General]: Send CPU and Memory metrics of jobs to cloudwatch. Later we will make these metrics available in the UI.

  • [General]: Mirrored the spark images from docker hub on ECR and shared them with our customers. This is a temporary fix for docker hub rate limiting until aws releases their solution. For more info read the following article: https://aws.amazon.com/blogs/containers/advice-for-customers-dealing-with-docker-hub-rate-limits-and-a-coming-soon-announcement/ The repository is:

    776682305951.dkr.ecr.eu-west-1.amazonaws.com/datafy/data-plane/mirroring/datamindedbe/spark-k8s-glue

0.31.2 (30-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Application runs]: The application runs app was crashing because of a value that overflowed a 32bit integer

0.31.1 (30-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [Agent]: When manually deleting a role created with project resources, the agent would fail applying the terraform update because it lacked the required AWS IAM rights
  • [Airflow]: There was an error popping up on airflow that did not fail your job but was confusing. We have fixed it, so you should not see this error anymore.
  • [Templates]: Bumped templates to the next version, this fixed problems with the dbt template: https://github.com/datamindedbe/datafy-templates/releases/tag/0.3.1

0.31.0 (30-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.31.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Added description field to environment.
  • [UI]: Added a markdown editor for the description field for project and environment. This allows you to give users more context to your projects and environment.
  • [UI]: Added a git repository field to projects. By filling in this field we can provide links to the actual git hashes that deployments were built with.
  • [Doc]: Added more info on the available parameters for the Conveyor spark and container operators
  • [General]: We mirrored all images used in the kubernetes cluster on ECR, since docker hub has started rate limiting users.

bugfixes

  • [General]: When deleting a project something went wrong in our backend, causing the project to remain stuck in deleting mode
  • [CLI]: When something went wrong with generating projects we sometimes produced a very cryptic long error message. The unnecessary parts were cleaned up and the error message is smaller now.

0.30.0 (23-10-2020)

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.30.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.30.0/conveyor_darwin_amd64.tar.gz

features

  • [CLI]: Improved date printing to be less verbose
  • [General]: Added created at field to all objects. These can be seen in the UI and the CLI.
  • [General]: Added last activity field to projects. This field shows the last build/deployment done for a project and can be used to see how inactive a project is. This can help you decide if a project is actively developed or not.

bugs

  • [CLI]: Creating a new project with a template that is from a git repository was broken. It is now fixed

0.29.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.2/conveyor_darwin_amd64.tar.gz

bugs

  • [Internal]: This is a release with only internal changes in the way we capture metrics

0.29.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.1/conveyor_darwin_amd64.tar.gz

bugs

  • [CLI]: When you create a build and have untracked files you would get a non dirty git hash, this has been fixed
  • [General]: Fixed a bug where if an application got in a certain state the airflow ui would stop working and the application runs would stop updating.

0.29.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.29.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: We now add git information to a build. When you do datafy project build we store the git hash that has been used to create this build. If you have uncommitted changes we add the .dirty suffix. You can see this information when you use datafy project deployments or datafy environment deployments or in the UI. If the build is done outside a git repo no information is added. You need to upgrade to the CLI 0.29.0 to take advantage of this feature.
  • [General]: Better project and environment cleanup. When removing a project we now delete the associated files on s3. We also clean up the associated ssm parameters we are using to track deployed versions on environments.
  • [Documentation]: The spark 3 docker images don't run under root anymore. But this makes pip install fail. We documented the way to do this properly here.
  • [Documentation]: Added documentation on the changes in the spark 3.0.1 image here.
  • [Templates]: Upgraded to the latest version of the templates, see release note here.
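
The git information described in the first [General] item above can be approximated as follows. This is a sketch under assumptions, not the CLI's actual implementation: the function names are hypothetical, the label format (hash followed by .dirty) is inferred from the note, and change detection via git status --porcelain is our choice.

```python
import subprocess


def label_build(git_hash: str, porcelain_status: str) -> str:
    """Append ".dirty" when `git status --porcelain` reported any change."""
    dirty = bool(porcelain_status.strip())
    return f"{git_hash}.dirty" if dirty else git_hash


def current_build_label() -> str:
    """Read hash and status from the git repository in the working directory.

    Raises subprocess.CalledProcessError when run outside a git repository.
    """
    git_hash = subprocess.run(
        ["git", "rev-parse", "HEAD"], capture_output=True, text=True, check=True
    ).stdout.strip()
    status = subprocess.run(
        ["git", "status", "--porcelain"], capture_output=True, text=True, check=True
    ).stdout
    return label_build(git_hash, status)


# Hypothetical hash and status lines, used only to show both outcomes.
print(label_build("3f2a9c1", ""))                # clean working tree
print(label_build("3f2a9c1", " M dags/etl.py"))  # modified file
```

Keeping the labelling logic separate from the git calls makes it easy to test without a repository.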

bugs

  • [CLI]: Fixed a typo in delete environment description.

0.28.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.28.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.28.0/conveyor_darwin_amd64.tar.gz

features

  • [Documentation]: We got a bug report that in certain use cases logs for a ConveyorContainerOperator did not show up in both Airflow and Cloudwatch. The reason for this is that python print statements don't behave properly in a production environment. It's best to use the python logging framework; for more information see the FAQ.
  • [Airflow]: We only keep the last 14 days in airflow logs. This is the same retention we have in cloudwatch now. This should make the logs volume smaller and result in a cost reduction.
  • [Airflow]: We added a liveness probe to the airflow scheduler to check if it's still running and restart it if it isn't.
  • [UI]: When your ConveyorContainerOperator application dies with an out of memory error we show this in the UI. When you use the ConveyorSparkSubmitOperator we show you when your driver has died with an out of memory error.
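
The logging advice in the [Documentation] item above could look like this in a job's entry point. A minimal sketch using only the standard library: the logger name my_job and the message are hypothetical, and the format string is one reasonable choice rather than a platform requirement.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Send log records to stdout with a timestamp, level and logger name."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger("my_job")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers = [handler]  # replace handlers so repeated calls don't duplicate output
    logger.propagate = False
    return logger


logger = configure_logging()
logger.info("processed %d rows", 1200)  # instead of print("processed 1200 rows")
```

StreamHandler flushes after every record, so messages are written out as they are logged instead of waiting in an output buffer.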

bugs

  • [Airflow]: There was a bug in the airflow UI that could show you were logged in under another user. This is now fixed.
  • [Airflow]: When you have a DAG in airflow with many tasks, the frontend code in airflow makes it very slow to open the modal when you click a task. We set the default number of runs in the tree view to 5, down from 25, as this helps the javascript code to be faster. See this ticket in airflow Jira.
  • [CLI]: When building a docker image that takes longer than 15 minutes you got a timeout error. This limit has been raised to an hour.
  • [General]: When deleting an environment we now also cleanup the database associated with it.

0.27.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.27.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.27.0/conveyor_darwin_amd64.tar.gz

features

  • [Spark]: Release spark image with spark 3.0.1 and hadoop 3.0.0 support

bugs

  • [Airflow]: The official airflow image runs under user 50000, which is more secure than running under root. But all airflow logs were still owned by user root, so we added a chown to the scheduler to update those. However, this can take a very long time for production airflow instances. We now do this separately from the scheduler and only once, so the scheduler will start up fast again once we have upgraded.
  • [Airflow]: The list task instances page had log buttons that redirected to the wrong page. This is now fixed.
  • [Airflow]: We got a bug notification that editing a dag run showed a forbidden page. However editing dag runs has been deprecated in airflow. The button will be removed in a later release. For more info see here and here.
  • [Templates]: We upgraded to version 0.2.2 of the templates that contains 3 bugfixes. For more information look here.
    • The pyspark template spark 2.4 support was fixed.
    • When using the python image when you don't want role management we clean up the terraform files
    • We upgraded to spark 3.0.1 and hadoop 3.0.0
  • [General]: Creating projects with a name that starts with a dash - or underscore _ resulted in a failure. We no longer allow such names.
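
The naming rule in the last [General] item above can be checked before calling the CLI. A minimal sketch: the function name is hypothetical, and it only covers the rule stated in this note (no leading dash or underscore), not any other constraints the platform may apply.

```python
def is_valid_project_name(name: str) -> bool:
    """Reject empty names and names starting with a dash or underscore."""
    return bool(name) and not name.startswith(("-", "_"))


print(is_valid_project_name("sales-etl"))  # True
print(is_valid_project_name("_scratch"))   # False
print(is_valid_project_name("-scratch"))   # False
```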

0.26.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.26.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.26.0/conveyor_darwin_amd64.tar.gz

features

  • [Agent]: We reworked the agent in the background to be more future proof. Normally this should have no impact for you but it allows us to make more/faster progress in the future. We also created more end-to-end tests to have more quality checks before we release.
  • [Airflow]: We upgraded to the official airflow image here. Again this should have no impact for users but allows us to make fast progress in the future.
  • [General]: The bastion used for datafy forward has been removed since it is not needed anymore. This will result in a small cost saving.

bugs

  • [Airflow]: When going to view a rendered task instance you got an error. This has now been fixed and rendered task instance can be shown in the UI.
  • [CLI]: On linux, generating templates resulted in files being created with root ownership. This has been fixed.
  • [CLI]: The unlock, get and events commands did not work on environments in a failed state. This has been fixed.

0.25.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.1/conveyor_darwin_amd64.tar.gz

bugs

  • [CLI]: Fixed a bug where datafy template apply did not work with resource templates

0.25.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.25.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

features

  • [Airflow]: Upgraded airflow to 1.10.12
  • [Airflow]: Upgraded to use the RBAC UI of airflow. The RBAC UI of airflow is the new UI. Only this UI will receive updates and new features. In airflow 2.0 the previous UI will be deprecated.
  • [Airflow]: Added links to the datafy application runs dashboard from airflow when you select a task in airflow. See picture below. It automatically filters to show the runs of this dag and task.

bugs

  • [General]: Show better error message when deleting an environment that has deletion protection turned on

Link to Conveyor application runs from a task run

0.24.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.2/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugs

0.24.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.1/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

bugs

  • [CLI]: Small bugfix where deletion protection wasn't applied when updating an environment

0.24.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.24.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated but we will still provide support if a client needs it. But we recommend you to migrate to CLI2 as quickly as possible.

features

  • [Templates]: The templates used by Conveyor have been open sourced: https://github.com/datamindedbe/datafy-templates. You can use these as an example on how to develop your own templates or just have a look at the templates. Suggestions for improvements can be made through the issues on github, or you can even make a pull request! 🥳
  • [UI]: Filtering the application runs page on execution date and started at date of jobs is now supported
  • [Airflow]: Conveyor spark submit operator now passes the following configuration by default: "spark.hadoop.fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem" . That way users can use s3 instead of s3a without any problems.
  • [Airflow]: Airflow has been upgraded to 1.10.11
  • [CLI]: You can now unlock an environment when something went wrong while deploying your project resources or rolling out your environment, through datafy environment unlock --name ENV
  • [CLI]: We added deletion protection to environments. If deletion protection is enabled, users cannot delete that environment. This is useful to protect environments like production and development. You can enable it on a new environment with datafy environment new --name ENV --deletion-protection=true or you can edit an existing environment: datafy environment update --name ENV --deletion-protection=true
  • [General]: Kubernetes has been upgraded to 1.17
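
The s3 default in the Airflow item above can be pictured as an ordinary Spark configuration entry. The following is a minimal sketch and not the operator's actual code: the helper name to_spark_submit_args and the way user settings are merged over the default are assumptions made for illustration. Only the configuration key and value come from the release note.

```python
# Default taken from the release note: route the s3:// scheme to the S3A filesystem.
DEFAULT_CONF = {
    "spark.hadoop.fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
}


def to_spark_submit_args(user_conf=None):
    """Hypothetical helper: merge user settings over the default and render
    them as `--conf key=value` arguments, the form spark-submit accepts."""
    merged = {**DEFAULT_CONF, **(user_conf or {})}
    args = []
    for key in sorted(merged):
        args += ["--conf", f"{key}={merged[key]}"]
    return args


if __name__ == "__main__":
    # The default is present even when the user only sets unrelated options.
    print(to_spark_submit_args({"spark.executor.instances": "2"}))
```

With a mapping like this in place, a path such as s3://bucket/key is handled by the same filesystem implementation as s3a://bucket/key.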

0.23.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.2/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated, but we will still provide support if a client needs it. We recommend migrating to CLI2 as quickly as possible.

bugs

  • [UI]: Fixed a bug where in some cases we connected to the wrong region in the UI when opening CloudWatch logs

0.23.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.1/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated, but we will still provide support if a client needs it. We recommend migrating to CLI2 as quickly as possible.

bugs

  • [CLI]: datafy environment deployments returned the wrong output
  • [CLI]: datafy environments events returned the wrong output
  • [CLI]: By accident, datafy project get also had ls as an alias, resulting in a conflict with datafy project list

0.23.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.23.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated, but we will still provide support if a client needs it. We recommend migrating to CLI2 as quickly as possible.

features

  • [CLI]: It is now possible to use custom templates with the CLI. We support local templates in a directory and templates from a git repository. When using git you can provide a tag or branch and a directory in your git repo. To find out more, check datafy template apply --help or datafy project new --help. We hope this feature will allow our customers to make their own templates and bootstrap internal projects quicker. This is also done in preparation for open sourcing the Conveyor templates.
  • [CLI]: The dependency on cookiecutter from the CLI has been removed. This is one less thing people have to install to get started with Conveyor. Instead we use docker to run a container with cookiecutter installed. We use the following image by default: https://hub.docker.com/r/datamindedbe/cookiecutter. You can always specify your own image if your templates need more to be installed.
  • [CLI]: We went over all commands to provide more descriptive explanations and examples when doing --help
  • [DOC]: Documented a new pattern for using common airflow code.
  • [General]: We made the deploy to an environment roughly 15 to 20 seconds quicker by optimizing some steps.
  • [Airflow]: Set min_file_process_interval to 30s to lessen the load on the Airflow database. This results in jobs triggering a bit slower but makes usage of Variable.get("VARIABLES") less of a problem for the database.

bugs

  • [General]: Because of the way we handled events in our API, your project could remain stuck in the creating state while it was actually created. We have fixed this on our side.
  • [UI]: When switching tabs on an environment page, the application runs wouldn't reload with the latest changes. This has been fixed.

0.22.2

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.2/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated, but we will still provide support if a client needs it. We recommend migrating to CLI2 as quickly as possible.

bugfixes

  • [CLI]: We changed the authentication so you don't have to configure it anymore as a first time user. This broke the flow where you set a key and a secret for authentication (for example in CI/CD). This has been fixed in this release.

0.22.1

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.1/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated, but we will still provide support if a client needs it. We recommend migrating to CLI2 as quickly as possible.

bugfixes

  • [UI]: Fixed an issue where changing page resulted in filters being reset in the application runs page.
  • [UI]: Fixed a pagination issue on the application runs page where we started at the wrong page.

0.22.0

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.22.0/conveyor_darwin_amd64.tar.gz

CLI1 is deprecated, but we will still provide support if a client needs it. We recommend migrating to CLI2 as quickly as possible.

features

  • [General]: New application logs page where you can see runs and filter on them. Runs are stored up to 2 weeks and you can see the logs of a failed run for up to 2 weeks.
  • [General]: Made our Kubernetes cluster a bit less heavy by trimming off some excess deployments. This means we need fewer nodes to do the same work.
  • [CLI2]: Added a datafy completion command that can generate completions for the shell of your choice. Bash, Zsh and Fish are supported. To find out how to configure it, please use datafy completion --help
  • [CLI2]: Migrated the CLI to a new authentication configuration. This means that as a first-time user you don't need to configure authentication anymore. If federated login is set up for your company you can use that; otherwise you have to use the account created for you.
  • [CLI2]: The CLI no longer needs AWS rights to do a build. This means you can be logged in to another AWS account while still being able to do a datafy project build

bugfixes

  • [General]: Deleting environments failed. This has been fixed in this release.

0.21.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.21.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.21.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.21.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Upgrade k8s to 1.16
  • [General]: Switch log aggregation from fluentd to fluent-bit. Fluent-bit is more memory efficient than fluentd, so this means we have less overhead per node.
  • [CLI]: The old CLI 1, which is installed as a python package, is deprecated in favour of CLI2. For now we will keep it around and fix bugs if needed. It will be unsupported in the future.
  • [CLI2]: The homebrew install of CLI2 now installs it as the datafy command instead of the godatafy command.
  • [CLI2]: Removed Conveyor forward
  • [UI]: Removed the old UI in favour of https://app.conveyordata.com

0.20.2

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.20.2.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.2/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.2/conveyor_darwin_amd64.tar.gz

bugfixes

  • [General]: Small bugfix in a new application that gave trouble when it ran in a US region.

0.20.1

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.20.1.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [UI]: Fixed a bug where task executions did not show up

0.20.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.20.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.20.0/conveyor_darwin_amd64.tar.gz

features

  • [General]: Spark 3 support. We have updated our templates to both support spark 2.4 and 3.0. There are some small migration steps needed as described here. You can find out what is new in spark 3.0 here.
  • [Templates]: Added a description to the generated terraform variables.
  • [Docs]: Added documentation on max memory supported on Conveyor.

bugfixes

  • [CLI2]: Made create project call more robust when using a template. We first generate the template and then create the project.
  • [CLI2]: Fixed an issue where users were getting the wrong tokens.

0.19.1

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.19.1.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [CLI2]: Changed the way templates ask for input as the previous version proved to be unstable
  • [UI]: The Spark UI did not proxy the executors page correctly. This has been fixed and the page should now work correctly.
  • [Airflow]: The Conveyor environment link was not working anymore and is fixed in this release.

0.19.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.19.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.19.0/conveyor_darwin_amd64.tar.gz

features

  • [UI]: New UI hosted on https://app.conveyordata.com. If your authentication has been correctly configured this can be used once your data plane has been upgraded. After that you never have to use the forward command again! For now we keep it working as a fallback but it will be removed in a future upgrade. For the forwarding of airflow to work correctly you should also trigger an airflow deployment to upgrade airflow to the new configuration needed.
  • [Airflow]: Changed the configuration to be able to host the new UI

0.18.1

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.18.1.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.1/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.1/conveyor_darwin_amd64.tar.gz

bugfixes

  • [CLI2]: Bugfix for the forward command. AWS does not accept public keys with a newline at the end, so we now send the key without the trailing newline.
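
The newline fix above boils down to trimming the key text before it is sent. Below is a minimal sketch of that idea, assuming the public key is read from a file as text. The function name normalize_public_key and the sample key are hypothetical, and the actual CLI is not written in Python.

```python
def normalize_public_key(key_text: str) -> str:
    """Remove trailing newline characters (LF or CRLF) from a public key.

    Key files on disk usually end with a newline; the key material and the
    comment field are left untouched.
    """
    return key_text.rstrip("\r\n")


if __name__ == "__main__":
    # Placeholder value for illustration only, not a real key.
    raw = "ssh-ed25519 AAAAEXAMPLEKEY user@example\n"
    print(repr(normalize_public_key(raw)))
```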

0.18.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.18.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.18.0/conveyor_darwin_amd64.tar.gz

features

  • [infra]: Reduced costs by disabling certain unused private endpoints.
  • [Agent]: Changed the terraform output to be more readable.

bugfixes

  • [CLI2]: Found multiple changed files when applying a template. We changed the way we are applying templates, so this nasty bug should not happen again.
  • [CLI]: Deleting a project could delete the .datafy home folder. When deleting a project, the CLI tried to clean up the project folder, but by accident it could delete the datafy home folder instead.
  • [API]: We now properly validate project names to make sure you cannot create a project with an unsupported name

0.17.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.17.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.17.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.17.0/conveyor_darwin_amd64.tar.gz

features

  • [Documentation]: Document the list of spark images we have available
  • [CLI2]: When a deploy fails, automatically print the latest event on the environment
  • [Templates]: Added scala 2.12 option for the scala template
  • [UI]: show the airflow execution timestamp in the application logs overview
  • [CLI][CLI2]: Change default project workflow_start_date to yesterday

bugfixes

  • [CLI2]: Deploying a build ID to an environment failed. It now works again.
  • [Templates]: The pyspark template contained a bug. This has been fixed.

0.16.0

CLI

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.16.0.tar.gz

CLI2

brew install datamindedbe/datafy-formulas/datafy
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.16.0/conveyor_linux_amd64.tar.gz
wget https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/cli/0.16.0/conveyor_darwin_amd64.tar.gz

features

  • [Projects]: Support for Spark 3
  • [CLI]: Graduation of the experimental CLI to CLI2
  • [CLI2]: Support for JSON as output format
  • [API]: Improved event output
  • [Agent]: Configurable agent rights
  • [Documentation]: Documentation on the various authentication options
  • [Documentation]: Brand new tutorial

bugfixes

  • [CLI2]: create .datafy folder if it does not exist
  • [CLI2]: fix path bug when applying templates
  • [CLI2]: return meaningful error messages
  • [CLI2]: CI/CD support

0.15.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.15.0.tar.gz

features

  • [UI]: New read-only Conveyor UI. Get an overview of environments and projects in this new release of the UI
  • [UI]: Put a link to the documentation in the UI
  • [Experimental CLI]: Added the undeploy and promote commands to the experimental CLI
  • [Experimental CLI]: Added the new authentication flow to the experimental CLI
  • [API]: Always return projects and environments sorted, this way you get a consistent view every time you make a call on the CLI or open the UI

0.14.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.14.0.tar.gz

features

  • [Airflow]: Added a link to the Conveyor environment that the airflow instance belongs to.
  • [Templates]: Updated the python template to include resources.
  • [General]: We clean up old builds of airflow automatically.
  • [Documentation]: Documented how you can use service account with the Conveyor Container Operator

bugfixes

  • [Airflow]: Made the spark submit operator a bit more robust, we had reports of users seeing it try to create certain resources twice, this should be fixed now. If not please let us know.

0.13.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.13.0.tar.gz

features

  • [CLI]: added support to use federated authentication for the CLI. This means we can use the preferred login method of your company (e.g. OneLogin or Okta). This still needs manual configuration from our side so we will contact our customers one by one to set this up and test this out.
  • [CLI]: added support for promotion-based deployments using the datafy project promote command
  • [Experimental CLI]: our experimental cli did not support all flags for project deploy. It is now supported to deploy a build to an environment with the experimental cli.
  • [Experimental CLI]: Installation through homebrew is now possible for Mac users: brew install datamindedbe/datafy-formulas/datafy. This installs it as the executable godatafy so it can peacefully coexist with the old Conveyor CLI
  • [General]: We upgraded our runtime to kubernetes 1.15.
  • [Airflow]: Upgrade to 1.10.10

bugs

  • [Experimental CLI]: The experimental CLI worked differently with regards to the region configuration. It has been fixed to automatically use your aws region config now.
  • [Experimental CLI]: The experimental CLI did not contain all templates or the latest versions of them.

0.12.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.12.0.tar.gz

features

  • [CLI] undeploy a project from an environment, use datafy project undeploy
  • [CLI] A new experimental version of the CLI written in Go is available. We noticed that all the dependencies needed by our Conveyor CLI got people into trouble in their python environment. The Go version ships as a single binary. You can read the installation instructions here.
  • [UI] Show logs of container spawned by ConveyorContainerOperator in the Conveyor UI
  • [GENERAL] Added support for using kubernetes service account for ConveyorContainerOperator, you can use the new resource template for this: resource/aws/container-iam-role-s3
  • [AIRFLOW] Added a Conveyor menu with links to the Docs and Conveyor UI
  • [DOCS] Documented the airflow behaviour with pools and old tasks in the FAQ
  • [DOCS] Documented how to use resources to deploy something other than IAM roles. For security reasons we only give the agent rights to create IAM roles, but you can specify your own roles if you want more rights.

bugfixes

  • [GENERAL] Fixed a bug when deleting a project with non existing ECR repo
  • [GENERAL] Deploying a project without dags now proceeds as normal without crashing

0.11.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.11.0.tar.gz

features

  • [GENERAL] Parallel deployments: We now support deploying, creating, deleting multiple projects at the same time. At the moment we allow up to 4 in parallel.

bugfixes

  • [CLI] datafy project shell now passes the aws region correctly
  • [CLI] We use temp files to cache tokens, but since datafy forward must be run as root user this sometimes resulted in file permission errors. We now give broader permissions when creating the file.
  • [CLI] printing empty lists in the CLI resulted in errors. We now print a nothing found message.
  • [GENERAL] When deleting a project that was the only project in an environment, updating the environment failed. This is fixed now.

0.10.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.10.0.tar.gz

features

  • [UI] Let failed jobs live longer in the Conveyor UI, this way it's easier to follow them up
  • [UI] Added the Conveyor logo to the Conveyor UI
  • [AIRFLOW] Expose the Airflow operators as an installable package. This allows you to use the Conveyor operators locally, for example for unit testing your Airflow DAGs
  • [CLI] Open a local shell into a build project container. This allows you to debug this container locally. To try this use: datafy project shell
  • [CLI] You can now see the currently deployed builds on an environment with datafy environment deployment, or see on which environments your project is deployed with datafy project deployments
  • [CLI] We improved the error messages for datafy template apply. They should help guide you towards the correct syntax
  • [CLI] Added project events to the CLI

bugfixes

  • [UI] correct Spark ui to Spark UI
  • [UI] Use t3.nano as bastion host for Conveyor forward instead of t2.nano. This should increase network throughput
  • [CLI] Conveyor configure has been fixed to work again. It tried to check if your CLI was up to date but without a configuration this didn't work.
  • [CLI] We now update the status of a build to Created or Failed depending on the result of the building
  • [AIRFLOW] Conveyor Spark Submit Operator by accident set the option spark.executor.pyspark.memory to be equal to the requested memory. This resulted in a doubling of the memory requested. This flag has been removed in the operator. You can still set it yourself though.
  • [AIRFLOW] Airflow workers sometimes weren't able to get AWS credentials. This was made more robust and should not be a problem anymore
  • [AGENT] The project delete process has been made more robust. We won't try to delete ECR repositories anymore that do not exist.
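
The memory doubling described in the Spark Submit Operator item above can be illustrated with some arithmetic. This is a sketch under stated assumptions, not Conveyor's sizing logic: it assumes the executor container request is the sum of the executor memory, a memory overhead, and spark.executor.pyspark.memory, which is our understanding of how Spark on Kubernetes sizes executor pods. The 4096 MiB request, the 10% overhead factor and the 384 MiB minimum overhead are example values.

```python
def executor_pod_memory_mib(executor_memory_mib, pyspark_memory_mib=0,
                            overhead_factor=0.1, min_overhead_mib=384):
    """Approximate memory request of one executor pod, in MiB (assumed formula)."""
    overhead = max(int(executor_memory_mib * overhead_factor), min_overhead_mib)
    return executor_memory_mib + overhead + pyspark_memory_mib


requested = 4096  # example value: memory the user asked for, in MiB

# Old behaviour: spark.executor.pyspark.memory set equal to the requested memory.
before_fix = executor_pod_memory_mib(requested, pyspark_memory_mib=requested)
# New behaviour: the option is no longer set by the operator.
after_fix = executor_pod_memory_mib(requested)

if __name__ == "__main__":
    print(before_fix, after_fix, before_fix - after_fix)
```

Under these assumptions the extra request is exactly the requested memory again, which matches the doubling mentioned in the release note.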

0.9.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.9.0.tar.gz

features

  • CRUD support of project resources
  • Example project resource template
  • Expose project events in the cli
  • Upgrade to airflow 1.10.9
  • Warn CLI users when to upgrade

bugfixes

  • Support for large logs in the datafy UI

0.7.1

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.7.1.tar.gz

bugfixes

  • Could not find eks worker instances in all cases
  • Project delete by name now works correctly

0.7.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.7.0.tar.gz

features

  • Default arguments for the ConveyorSparkSubmitOperator
  • Updated documentation including templates
  • Added documentation for tenant hosted templates

0.6.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.6.0.tar.gz

features

  • Template for the ConveyorDockerOperator
  • Template for scala spark
  • Documentation for the ConveyorDockerOperator
  • Added support for tenant hosted templates

bugfixes

  • Allow users to access the UI without access to k8s

0.5.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.5.0.tar.gz

features

  • data-plane - validate the existence of the referenced docker image in the ConveyorSparkSubmitOperator and the ConveyorContainerOperator. This should make sure you do not have pending pods because of non-existent images
  • data-plane - cache fetched EKS tokens on the agent to avoid rate limiting by AWS
  • data-plane - Increase default wait time out from 2 minutes to 5 minutes in the ConveyorContainerOperator. 2 minutes could be too low when new nodes need to be scheduled, 5 should always be enough.
  • data-plane - shaved another 10 to 15 seconds on environment updates, releases should take 1 min 40ish seconds
  • cli - add possibility to delete project by name
  • cli - added aliases del for delete and ls for list

0.4.0

pip3 install https://datafy-cp-artifacts.s3-eu-west-1.amazonaws.com/pypi/release/datafy-cli-0.4.0.tar.gz

bugfixes

  • ui - crash when refreshing
  • ui - forward command times out frequently (beta)
  • ui - can not find bastion

features

  • cli - pyspark project template
  • cli - listing available project templates
  • cli - show version
  • cli - wait flag for deployments
  • cli - support both dtf and datafy
  • cli - consistent naming of delete
  • cli - improve table layout
  • data-plane - new operator ConveyorContainerOperator
  • data-plane - remove aws code build dependency
  • data-plane - faster environment updates
  • data-plane - cleanup old spark jobs