Writing Dockerfiles
The Dockerfile in your project root describes how to build a container image from your local source code.
Some of the explanation below applies only to projects whose code has to be compiled before it can run, such as projects written in Scala.
Arguably, the cleanest way to create your image is to use a Docker builder image to build your application,
and then copy the artefacts it generates into your target image.
This is known as a multi-stage build.
As an example, consider the following Dockerfile, which first builds a Spark application using Maven and then copies the resulting artefacts into the target image:
FROM maven:3.6.0-jdk-8 AS builder
WORKDIR /build
COPY . .
RUN mvn clean package -DskipTests
FROM public.ecr.aws/dataminded/spark-k8s-glue:2.4.3-2.11-hadoop-2.9.2-v3
COPY --from=builder /build/target/*-all.jar /app/app.jar
Alternatively, you can copy locally built artefacts into the image.
This results in the following Dockerfile:
FROM public.ecr.aws/dataminded/spark-k8s-glue:2.4.3-2.11-hadoop-2.9.2-v3
WORKDIR /app
COPY target/*-all.jar app.jar
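
Because the destination app.jar is a file rather than a directory, this COPY only works when the wildcard matches exactly one file: Docker rejects a COPY with several sources unless the destination is a directory ending in a slash. A minimal sketch of a local check you could run before building, assuming a POSIX shell; the function name single_fat_jar is hypothetical and not part of Docker or Conveyor:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if exactly one file matches <dir>/*-all.jar.
# Usage: single_fat_jar [target-dir]   (prints the jar path on success)
single_fat_jar() {
    dir="${1:-target}"
    # Expand the wildcard into this function's positional parameters.
    set -- "$dir"/*-all.jar
    # An unmatched wildcard stays literal, so also check that the file exists.
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one *-all.jar in $dir" >&2
        return 1
    fi
    printf '%s\n' "$1"
}
```

Running single_fat_jar target after packaging prints the path of the jar that would be copied, or fails if stale jars from earlier versions are still lying around in the target directory.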
Both approaches have their advantages and disadvantages as described in the following table.
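| Approach | Advantages | Disadvantages |
|---|---|---|
| Multi-stage builds | conveyor build compiles your code inside the builder image, so the resulting image does not depend on how your local machine is configured and you cannot forget to re-package your source code before building. | conveyor build has to compile your source code and therefore runs slower. More complicated build processes may require extra configuration. E.g., resolving dependencies from private repositories has to be configured for the builder stage. |
| Copy artefacts from local | conveyor build does not require compiling your source code and therefore runs much faster. More complicated build processes do not require configuration beyond what you already have in place. E.g., dependencies on private repositories can be resolved without any extra configuration. | conveyor build does not actually compile your code. This is a problem when you forget to re-package your source code locally before building. The build process also depends on how your local machine is configured, so images built on your machine might not be exactly the same as those built on another machine. |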
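
When copying locally built artefacts, the image contains whatever was packaged last, even if the source code has changed since. One way to guard against a forgotten re-package is to compare modification times before building. The following is a sketch under assumptions: the function name jar_is_fresh and the default source directory src are hypothetical, and only files under that directory are checked (changes to pom.xml, for instance, go unnoticed).

```shell
#!/bin/sh
# Hypothetical helper: fail if any source file is newer than the packaged jar.
# Usage: jar_is_fresh <jar> [src-dir]
jar_is_fresh() {
    jar="$1"
    src="${2:-src}"
    if [ ! -f "$jar" ]; then
        echo "missing $jar, package your code first" >&2
        return 1
    fi
    # find -newer lists files modified more recently than the jar;
    # one hit is enough to know the jar is stale.
    stale=$(find "$src" -type f -newer "$jar" | head -n 1)
    if [ -n "$stale" ]; then
        echo "$stale is newer than $jar, re-package before building" >&2
        return 1
    fi
}
```

It could be chained in front of the build, for example jar_is_fresh target/my-app-all.jar && conveyor build, where my-app-all.jar is a placeholder for your own artefact. Always running mvn clean package -DskipTests before conveyor build achieves the same result without a check, at the cost of the faster build.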