1. Create a new dbt project

info

This is a step-by-step introduction intended for beginners.

If you are already familiar with the basics of Conveyor, our how-to guides are probably more appropriate.

In this tutorial you will create and deploy a new batch processing project using dbt, a popular tool for describing data transformations in SQL. Transformations are the T in an ETL (Extract, Transform, Load) pipeline.

1.1 Set up your project

If you have not already done so, first set up Conveyor locally by following the guide on setting up the local development environment.

For convenience, let us define some environment variables for the rest of the tutorial. Use your first name; it should not contain any special characters such as underscores or dashes. For example, john would be a good name. Open your terminal and execute the following commands:

export NAME=INSERT_YOUR_NAME
export PROJECT_NAME=$NAME
export ENVIRONMENT_NAME=$NAME
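
Because the same value is reused as both the project name and the environment name, it can be worth checking it before continuing. The following is a minimal sketch, assuming that "no special characters" means ASCII letters and digits only; `is_valid_name` is a hypothetical helper and not part of the Conveyor CLI.

```shell
# Hypothetical helper (not part of Conveyor): succeeds only if the
# argument is non-empty and consists solely of ASCII letters and digits.
is_valid_name() {
  case "$1" in
    "" | *[!A-Za-z0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Check the NAME variable exported above.
if is_valid_name "${NAME:-}"; then
  echo "NAME looks fine: $NAME"
else
  echo "NAME should contain only letters and digits" >&2
fi
```

With this check, john is accepted, while john_doe and john-doe are rejected.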

1.1.1 Create the project

Any batch job in any language can run on Conveyor, as long as nothing prevents it from being dockerized.

For convenience we provide ready-to-go templates for batch jobs using vanilla dbt, Spark and other languages and frameworks. Use of these templates is optional.

For this tutorial, we will use the dbt template:

conveyor project create --name $PROJECT_NAME --template dbt

For this tutorial, select the following options for your project (other options should be left on their default settings):

  • database_type: duckdb
  • conveyor_managed_role: No
  • cloud: aws (If you are using Azure, you can update it accordingly)

It takes a few moments to create the project. The result is a local folder with the same name as the project. Let's have a look inside.

1.2 Explore the code

Have a look at the folder that was just created and list its subfolders.

cd $PROJECT_NAME
ls -al | grep '^d'

This should show you the top-level directories. The most important directories and files in the project are:

  • .conveyor contains Conveyor-specific configuration.
  • dags contains the Airflow DAGs that will be deployed as part of this project. Here you will define when and how your project will run.
  • dbt/profiles.yml contains the database information for the different environments.
  • dbt/{{project_name}} contains the actual project code.
  • dbt/{{project_name}}/dbt_project.yml defines the project settings such as the directory structure, default model materialization, ...
  • Dockerfile defines how to package your project code as well as the version of every dependency (dbt, Python, dbt integrations). We supply our own dbt images so that you can get started quickly; more details on these images can be found in the Conveyor documentation.
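
To confirm that the generated project matches the layout described above, a small check can be scripted. This is only a sketch: `check_layout` is a hypothetical helper, and it assumes that the `{{project_name}}` placeholder resolves to the name passed to `conveyor project create`.

```shell
# Hypothetical helper: report which of the entries listed above are
# missing from a generated project directory.
# $1 = project directory, $2 = project name (assumed to replace {{project_name}})
check_layout() {
  dir=$1
  name=$2
  missing=0
  for entry in .conveyor dags dbt/profiles.yml "dbt/$name" \
               "dbt/$name/dbt_project.yml" Dockerfile; do
    if [ ! -e "$dir/$entry" ]; then
      echo "missing: $entry"
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

From inside the project folder, this could be run as `check_layout . "$PROJECT_NAME"`; it prints nothing when all expected entries exist.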

1.3 Explore the UI

In the Conveyor UI, select the projects menu on the left and find your project in the list. Clicking on it will show you the details. Use this opportunity to update the description.