3. Deploy your project

The project and environment are now ready, so the project can be built and deployed to the environment.

3.1 Build the project

The first step is to generate an uberjar containing your code and all necessary dependencies. Since the sample project uses Gradle as its build tool, run the following command in the project folder:

./gradlew clean shadowJar

Info: You can achieve the same result with Maven or sbt. The only requirement is that the output is an uberjar containing your code and the necessary dependencies.

The second step is to validate your DAGs and package your project code in a Docker image so that it can be uploaded to a Conveyor environment. Run the following command in the project folder:

conveyor build

The project provides the Dockerfile that defines what the Docker image looks like. The image depends on the uberjar, as the following line shows:

COPY build/libs/*-all.jar /opt/spark/work-dir/app.jar

Either your build system (Maven, sbt, Gradle) must produce an uberjar at the specified location, or you must update the Dockerfile to point to the location where the uberjar is created.
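Because the COPY line relies on a wildcard, it can help to confirm that exactly one matching jar exists before running conveyor build: a missing jar leaves nothing to copy, and several matching jars would not map cleanly onto the single app.jar destination. The following is a minimal shell sketch and not part of Conveyor. check_uberjar is a hypothetical helper name, and the build/libs default simply mirrors the path in the COPY line.

```shell
# Hypothetical helper (not a Conveyor command): checks that exactly one
# file matching *-all.jar exists in the given directory and prints its path.
# The default directory mirrors the COPY line in the Dockerfile.
check_uberjar() {
  dir="${1:-build/libs}"
  count=0
  found=""
  for f in "$dir"/*-all.jar; do
    # When nothing matches, the pattern stays literal, so test for existence.
    [ -e "$f" ] || continue
    count=$((count + 1))
    found="$f"
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one *-all.jar in $dir, found $count" >&2
    return 1
  fi
  echo "$found"
}
```

A possible use is `check_uberjar && conveyor build`, so that the image build only starts when the jar is in place.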

Once done you can deploy this project to any environment.

Info: The first build of a project can take some time, depending on your internet connection. Subsequent builds should be significantly faster.

3.2 Deploy the project

Next, deploy the project to the environment:

conveyor deploy --env $ENVIRONMENT_NAME --wait

The --wait flag makes the command wait until the deploy is finished. This is particularly useful in a CI/CD context, but also whenever you want to know when a deploy has finished.
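In a CI/CD job, the build and deploy steps can be chained so that a failed build never triggers a deploy. The sketch below is one way to do that and rests on an assumption: that conveyor build and conveyor deploy return a non-zero exit code on failure, which this page does not state. deploy_to is a hypothetical wrapper name.

```shell
# Hypothetical wrapper for a CI/CD step: build the image, then deploy it
# and wait for the deploy to finish.
# Assumption: both conveyor commands exit non-zero when they fail.
deploy_to() {
  env_name="${1:-}"
  if [ -z "$env_name" ]; then
    echo "usage: deploy_to ENVIRONMENT_NAME" >&2
    return 2
  fi
  # && stops the chain at the first failing step, so a broken build
  # is never deployed.
  conveyor build && conveyor deploy --env "$env_name" --wait
}
```

Calling `deploy_to "$ENVIRONMENT_NAME"` then returns the exit status of the last step that ran, which a CI system can use to mark the job as passed or failed.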

3.3 Activate the workflow

In the Conveyor UI, navigate to your environment by clicking environment on the left and then selecting your environment in the list. This automatically opens Airflow.

By default, your DAG is disabled. Enable it with the toggle to the left of your DAG. Airflow will then start scheduling your project, with one run for each day since the start date you specified when creating the project. For more details on Airflow and on how Conveyor integrates with it, see the Conveyor documentation on Airflow.

3.4 Explore logs and metrics

Next, we can see our runs in the Task executions tab of our environment. Select one of the task executions to see all the settings of your task, with the logs of the task shown underneath.

For longer-running tasks, CPU and memory metrics are also available.

3.5 Run a task from the CLI

To shorten the feedback loop during development, you can test updates to a given task directly from the CLI with the conveyor run command. This command performs a build, runs the job on the cluster while bypassing Airflow, and shows the logs directly in your console.

conveyor run --env $ENVIRONMENT_NAME --dag $PROJECT_NAME --task sample
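The command above depends on the ENVIRONMENT_NAME and PROJECT_NAME shell variables being set. As a small convenience during development, a wrapper can check them before calling conveyor run. This is a sketch only: run_task is a hypothetical name, and defaulting to the sample task simply follows the command shown above.

```shell
# Hypothetical wrapper around conveyor run: refuses to start when the
# variables used by the command are missing, and defaults to the
# "sample" task used in this tutorial.
run_task() {
  task="${1:-sample}"
  if [ -z "${ENVIRONMENT_NAME:-}" ] || [ -z "${PROJECT_NAME:-}" ]; then
    echo "set ENVIRONMENT_NAME and PROJECT_NAME first" >&2
    return 2
  fi
  conveyor run --env "$ENVIRONMENT_NAME" --dag "$PROJECT_NAME" --task "$task"
}
```

With both variables exported, `run_task` runs the sample task, and `run_task other_task` runs a different task of the same DAG (other_task being a placeholder name).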

Great! We just built and deployed our project to an environment for the first time. This is the main way developers deploy their projects to development or production. And with conveyor run we can quickly test changes without having to deploy the project.