Writing Dockerfiles

The Dockerfile in your project root describes how to build a container image from your local source code.

Note: Some of the explanation below applies only to projects whose source code has to be compiled, such as projects written in Scala.

Arguably, the cleanest way to create your image is to use a Docker builder image to build your application, and then copy the artefacts it generates into your target image. Doing so creates a multi-stage build in your Dockerfile. As an example, consider the following Dockerfile, which first builds a Spark application using Maven and then copies the resulting artefact into the final image:

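# Stage 1: compile and package the application with Maven
FROM maven:3.6.0-jdk-8 AS builder
WORKDIR /build
COPY . .
RUN mvn clean package -DskipTests

# Stage 2: copy only the packaged jar into the Spark runtime image
FROM public.ecr.aws/dataminded/spark-k8s-glue:2.4.3-2.11-hadoop-2.9.2-v3
COPY --from=builder /build/target/*-all.jar /app/app.jar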
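
Because the builder stage above copies the whole project before running Maven, any source change forces Maven to resolve all dependencies again. A common refinement is to resolve dependencies in a separate, cacheable layer. The following is a minimal sketch rather than a definitive setup: it assumes a single-module Maven project with `pom.xml` in the project root and sources under `src/`, and it relies on the `dependency:go-offline` goal of the Maven dependency plugin, which may not prefetch every plugin dependency.

```dockerfile
# Stage 1: compile and package the application with Maven
FROM maven:3.6.0-jdk-8 AS builder
WORKDIR /build

# Assumption: single-module project with pom.xml in the project root.
# Copying only the POM first lets Docker reuse this layer until pom.xml changes.
COPY pom.xml .
RUN mvn -B dependency:go-offline

# Assumption: all sources live under src/.
# Source edits invalidate only the layers from this point on.
COPY src ./src
RUN mvn -B clean package -DskipTests

# Stage 2: copy only the packaged jar into the Spark runtime image
FROM public.ecr.aws/dataminded/spark-k8s-glue:2.4.3-2.11-hadoop-2.9.2-v3
COPY --from=builder /build/target/*-all.jar /app/app.jar
```

With this layout, a rebuild after a source-only change can reuse the cached dependency layer and only repeats the packaging step. Multi-module projects would need every module's `pom.xml` copied before the `go-offline` step.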

Alternatively, you could just copy locally built artefacts into the image. This would result in a Dockerfile as follows:

FROM public.ecr.aws/dataminded/spark-k8s-glue:2.4.3-2.11-hadoop-2.9.2-v3
WORKDIR /app
# Requires a jar that was already packaged locally, e.g. with `mvn clean package`
COPY target/*-all.jar app.jar

Both approaches have their advantages and disadvantages as described in the following table.

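Approach	Advantages	Disadvantages
Multi-stage builds	`conveyor build` compiles your code inside the builder image, so the image always reflects your current source and does not depend on how your local machine is configured.	`conveyor build` compiles your source code on every build and therefore runs slower. More complicated build processes require extra configuration inside the image, e.g., to resolve dependencies from private repositories.
Copy artefacts from local	`conveyor build` does not have to compile your source code and therefore runs much faster. More complicated build processes do not require configuration beyond what you already have in place, e.g., dependencies from private repositories can be resolved without any extra configuration.	`conveyor build` does not actually compile your code, which is a problem when you forget to re-package your source code locally before building. The build process depends on how your local machine is configured, so images built on your machine might not be exactly the same as those built on another machine.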