3. Deploy your project
The project and environment are now in a state where the project can be built and deployed to the environment.
3.1 Build the project
The first step is to generate an uberjar containing your code and all necessary dependencies. Since the sample project uses Gradle as its build tool, you can do this by running the following command in the project folder:
./gradlew clean shadowJar
The second step is to validate your DAGs and package your project code in a Docker image so that it can be uploaded to a Conveyor environment. Run this command in the project folder as well:
conveyor build
The project provides the Dockerfile that defines what the Docker image should look like. The image depends on the uberjar, as specified in the following line:
COPY build/libs/*-all.jar /opt/spark/work-dir/app.jar
Either your build system (Maven, sbt, Gradle) produces an uberjar at the specified location, or you update the Dockerfile to point to the location where the uberjar is created.
Once done, you can deploy this project to any environment.
The first build of a project can take some time depending on your internet connection. Subsequent builds should be significantly faster.
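Because the Dockerfile copies build/libs/*-all.jar to a single file name, it can help to confirm that exactly one matching uberjar exists before running conveyor build. The following is a minimal sketch under that assumption; find_uberjar is a hypothetical helper name, not part of Conveyor or Gradle.

```shell
#!/bin/sh
# Hypothetical helper (not part of Conveyor or Gradle).
# Prints the path of the single *-all.jar in the given directory
# (default: build/libs) and fails if there is none or more than one.
find_uberjar() {
  dir="${1:-build/libs}"
  count=0
  match=""
  for f in "$dir"/*-all.jar; do
    # If nothing matches, the pattern stays literal, so check for a real file.
    if [ -f "$f" ]; then
      count=$((count + 1))
      match="$f"
    fi
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one *-all.jar in $dir, found $count" >&2
    return 1
  fi
  printf '%s\n' "$match"
}
```

For example, running find_uberjar && conveyor build in the project folder would only start the image build when the uberjar is where the Dockerfile expects it.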
3.2 Deploy the project
Next, deploy the project to the environment:
conveyor deploy --env $ENVIRONMENT_NAME --wait
The --wait flag makes the command wait until the deploy is finished. This is useful in a CI/CD context, and also whenever you want an easy way to know that a deploy has finished.
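In a CI/CD context, the build and deploy commands from this section could be chained in one script. The sketch below uses only the commands shown above; run_step, build_and_deploy, and the DRY_RUN variable are hypothetical names introduced for illustration, and the script assumes each command returns a non-zero exit status on failure.

```shell
#!/bin/sh
# Sketch only: run_step, build_and_deploy and DRY_RUN are hypothetical names.

# Print the command instead of running it when DRY_RUN=1.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Build the uberjar, build the image, then deploy and wait for completion.
# The && chain stops at the first command that returns a non-zero status
# (assumption: the tools signal failure through their exit status).
build_and_deploy() {
  if [ -z "${1:-}" ]; then
    echo "usage: build_and_deploy ENVIRONMENT_NAME" >&2
    return 2
  fi
  run_step ./gradlew clean shadowJar &&
    run_step conveyor build &&
    run_step conveyor deploy --env "$1" --wait
}
```

Calling build_and_deploy "$ENVIRONMENT_NAME" from the project folder runs the three steps in order. Setting DRY_RUN=1 first only prints them, which is a cheap way to check the script before wiring it into a pipeline.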
3.3 Activate the workflow
In the Conveyor UI, navigate to your environment by clicking environment on the left and then selecting your environment in the list. This automatically opens Airflow.
By default, your DAG is disabled. Enable it with the toggle to the left of your DAG. Airflow will then start scheduling your project, with one run for each day since the start date you specified when creating the project. For more details on Airflow and on how Conveyor integrates with it, take a look here.
3.4 Explore logs and metrics
Next, we can see our runs in the Task executions tab of our environment.
Select one of the task executions to see all the settings of your task.
Underneath the settings, you can see the logs of your task.
For longer-running tasks, CPU and memory metrics are also available.
3.5 Run a task from the CLI
To shorten the feedback loop during development, you can test updates to a given task directly from the CLI with the conveyor run command. This command performs a build and runs the job on the cluster, bypassing Airflow, and shows the logs directly in your console.
conveyor run --env $ENVIRONMENT_NAME --dag $PROJECT_NAME --task sample
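The command above relies on ENVIRONMENT_NAME and PROJECT_NAME being set in your shell. As a small convenience, a wrapper along the following lines could check that first. This is only a sketch: run_task and the CONVEYOR_BIN override are hypothetical names, and the flags are exactly the ones used in the command above.

```shell
#!/bin/sh
# Sketch only: run_task and CONVEYOR_BIN are hypothetical names.
# Runs a single task through "conveyor run"; the task defaults to "sample".
run_task() {
  task="${1:-sample}"
  # Fail early with a clear message instead of passing empty flag values.
  if [ -z "${ENVIRONMENT_NAME:-}" ] || [ -z "${PROJECT_NAME:-}" ]; then
    echo "set ENVIRONMENT_NAME and PROJECT_NAME first" >&2
    return 1
  fi
  # CONVEYOR_BIN lets you substitute another executable, for example echo,
  # to inspect the arguments without touching the cluster.
  "${CONVEYOR_BIN:-conveyor}" run \
    --env "$ENVIRONMENT_NAME" \
    --dag "$PROJECT_NAME" \
    --task "$task"
}
```

With both variables set, run_task is equivalent to the command above, and run_task followed by another task name targets that task instead.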
Great! We just built and deployed our project to an environment for the first time.
This is the main way developers deploy their projects to development or production.
And with conveyor run, we can quickly test changes without having to deploy our project.