
Working with GPUs

Conveyor supports GPU instances, as documented on our Instances page.

We only support Nvidia GPUs.

We support the following methods of launching GPU instances:

Nvidia drivers

danger

Nvidia drivers should not be installed in the container, as doing so can make the GPUs unusable.

The Nvidia drivers are made available to your container via the Nvidia Container Toolkit, which passes the correct devices and GPU libraries from the host into your container. Installing a different Nvidia driver version in your container than the one used on the host will lead to issues.
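Because the driver comes from the host, a quick sanity check inside a running GPU container is to call nvidia-smi, which the toolkit typically mounts into the container alongside the driver. A minimal sketch in Python, assuming nvidia-smi is on the PATH in your container:

import subprocess

# nvidia-smi comes from the host driver and is typically mounted into the
# container by the Nvidia Container Toolkit; it should list the attached GPUs.
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)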

Using Nvidia GPUs with PyTorch

To use the Nvidia GPUs, you can use PyTorch. Installing PyTorch also installs the Nvidia CUDA libraries it needs, which allows PyTorch to use the GPUs.
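To confirm that PyTorch can actually reach a GPU at runtime, you can add a small check like the following to your own scripts; it only uses standard torch.cuda calls:

import torch

# True when the CUDA libraries installed alongside the torch wheel can talk to
# the driver that the host exposes via the Nvidia Container Toolkit.
if torch.cuda.is_available():
    print(f"Found {torch.cuda.device_count()} GPU(s): {torch.cuda.get_device_name(0)}")
else:
    print("No GPU visible; check the instance type and driver setup")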

Use the following Dockerfile to run the MNIST PyTorch example application on a GPU:

FROM python:3.12-slim

ENV PYTHONUNBUFFERED=1
WORKDIR /app

# The torch wheels bundle the CUDA libraries PyTorch needs to use the GPU
RUN pip install torch torchaudio torchvision
# Download the MNIST training script from the PyTorch examples repository
ADD https://raw.githubusercontent.com/pytorch/examples/refs/heads/main/mnist/main.py main.py

You can then use the ConveyorContainerOperatorV2 to run main.py:

from airflow import DAG
from conveyor.operators import ConveyorContainerOperatorV2
from datetime import datetime, timedelta

default_args = {
    "owner": "Conveyor",
    "depends_on_past": False,
    "start_date": datetime.now() - timedelta(days=2),
    "email": [],
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 0,
    "retry_delay": timedelta(minutes=5),
}

dag = DAG(
    "sample_pytorch", default_args=default_args, schedule="4 0 * * *", max_active_runs=1
)

# Train the MNIST example on a GPU instance; instance_type selects the GPU
# instance as documented on the Instances page.
ConveyorContainerOperatorV2(
    dag=dag,
    task_id="mnist_train_g4dn_xlarge",
    cmds=["python"],
    arguments=["/app/main.py"],
    instance_type="g4dn.xlarge",
)
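
If you want to compare several GPU instance types, you can create one task per type. A minimal sketch, assuming the listed instance types (for example g5.xlarge) are available in your environment; check the Instances page for the types you can use:

# One training task per GPU instance type; the types below are examples and
# must be supported in your environment.
for gpu_instance_type in ["g4dn.xlarge", "g5.xlarge"]:
    ConveyorContainerOperatorV2(
        dag=dag,
        task_id=f"mnist_train_{gpu_instance_type.replace('.', '_')}",
        cmds=["python"],
        arguments=["/app/main.py"],
        instance_type=gpu_instance_type,
    )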