
Instances

Conveyor supports the following instance types for all jobs:

| Instance type | CPU | Total Memory (AWS) | Total Memory (Azure) |
| ------------- | --- | ------------------ | -------------------- |
| mx.nano | 1* | 0.438 GB | 0.434 GB |
| mx.micro | 1* | 0.875 GB | 0.868 GB |
| mx.small | 1* | 1.75 GB | 1.736 GB |
| mx.medium | 1 | 3.5 GB | 3.47 GB |
| mx.large | 2 | 7 GB | 6.94 GB |
| mx.xlarge | 4 | 14 GB | 13.89 GB |
| mx.2xlarge | 8 | 29 GB | 30.65 GB |
| mx.4xlarge | 16 | 59 GB | 64.16 GB |
| cx.nano | 1* | 0.219 GB | Not supported |
| cx.micro | 1* | 0.438 GB | Not supported |
| cx.small | 1* | 0.875 GB | Not supported |
| cx.medium | 1 | 1.75 GB | Not supported |
| cx.large | 2 | 3.5 GB | Not supported |
| cx.xlarge | 4 | 7 GB | Not supported |
| cx.2xlarge | 8 | 14 GB | Not supported |
| cx.4xlarge | 16 | 29 GB | Not supported |
| rx.xlarge | 4 | 28 GB | Not supported |
| rx.2xlarge | 8 | 59 GB | Not supported |
| rx.4xlarge | 16 | 120 GB | Not supported |
info

(*) These instance types are not guaranteed a full CPU but only a slice of one; they are allowed to burst up to a full CPU when the cluster has spare capacity.

The numbers for AWS and Azure differ because nodes on the two clouds run different DaemonSets and have different reservation requirements set by the provider. We aim to keep the node overhead as small as possible while still meeting the minimum requirements of each cloud provider.
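For example, assuming your jobs are scheduled through the Conveyor Airflow operators, the instance type can be requested per task. The sketch below uses `ConveyorContainerOperatorV2` and an `instance_type` argument as assumed names; check the operator reference of your Conveyor installation for the exact spelling.

```python
# Minimal sketch: requesting a specific instance type for a containerized job.
# The class name and import path are assumptions about the Conveyor Airflow
# provider; the instance type string comes from the table above.
from conveyor.operators import ConveyorContainerOperatorV2

ingest = ConveyorContainerOperatorV2(
    task_id="ingest",
    instance_type="mx.medium",  # 1 CPU and 3.5 GB on AWS (see the table above)
)
```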

Spark resources

When running Spark/PySpark applications, only part of the total container memory is available to Spark itself. The details are described in the following table:

| Instance type | CPU | Total memory | Spark memory | PySpark memory |
| ------------- | --- | ------------ | ------------ | -------------- |
| mx.micro | 1* | 0.875 GB | 0.8 GB | 0.6 GB |
| mx.small | 1* | 1.75 GB | 1.6 GB | 1.25 GB |
| mx.medium | 1 | 3.5 GB | 3.2 GB | 2.5 GB |
| mx.large | 2 | 7 GB | 6.4 GB | 5 GB |
| mx.xlarge | 4 | 14 GB | 12.7 GB | 10 GB |
| mx.2xlarge | 8 | 29 GB | 26.7 GB | 21 GB |
| mx.4xlarge | 16 | 59 GB | 54 GB | 42.4 GB |
| cx.medium | 1 | 1.75 GB | 1.6 GB | 1.25 GB |
| cx.large | 2 | 3.5 GB | 3.2 GB | 2.5 GB |
| cx.xlarge | 4 | 7 GB | 6.4 GB | 5 GB |
| cx.2xlarge | 8 | 14 GB | 12.7 GB | 10 GB |
| cx.4xlarge | 16 | 29 GB | 26.7 GB | 21 GB |
| rx.xlarge | 4 | 28 GB | 26 GB | 21 GB |
| rx.2xlarge | 8 | 59 GB | 54 GB | 43 GB |
| rx.4xlarge | 16 | 120 GB | 112 GB | 88 GB |
info

(*) These instance types are not guaranteed a full CPU but only a slice of one; if the cluster has spare capacity, they can burst up to a full CPU.

As the table shows, the supported executor memory depends on whether you run regular (Scala) Spark or PySpark. The difference comes from the spark.kubernetes.memoryOverheadFactor setting in the Spark configuration, which is set to 0.1 for JVM jobs (Scala and Java Spark) and to 0.4 for non-JVM jobs (PySpark, SparkR). This portion of the memory is set aside for non-JVM needs such as off-heap memory allocations, system processes, Python, and R; without it, jobs would commonly fail with a "Memory Overhead Exceeded" error.
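As a rough sanity check of the numbers above, the usable Spark memory is approximately the container memory divided by 1 + memoryOverheadFactor. The snippet below illustrates that approximation; it is not the exact rounding Conveyor applies.

```python
# Back-of-the-envelope check of the table above: usable Spark memory is roughly
# the container memory divided by (1 + spark.kubernetes.memoryOverheadFactor),
# with the factor being 0.1 for JVM jobs and 0.4 for PySpark/SparkR jobs.
def usable_spark_memory(total_gb: float, overhead_factor: float) -> float:
    return total_gb / (1 + overhead_factor)

print(round(usable_spark_memory(7, 0.1), 1))   # mx.large, Scala Spark -> ~6.4 GB
print(round(usable_spark_memory(7, 0.4), 1))   # mx.large, PySpark     -> 5.0 GB
print(round(usable_spark_memory(14, 0.4), 1))  # mx.xlarge, PySpark    -> 10.0 GB
```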

GPU instances

When installed on AWS, Conveyor supports instance types that come with a GPU. Currently, the following instance types are available:

| Instance type | CPU | Total Memory (AWS) | GPU |
| ------------- | --- | ------------------ | --------- |
| g4dn.xlarge | 4 | 16 GB | NVIDIA T4 |

GPU instances are not currently supported on Azure. If you would like to see this happen, please get in touch!

Disk space allocation

When an application saves data to disk, it by default consumes disk space from the host it is running on. Note that this disk space is shared across all jobs running on the same physical machine. Applications cannot read each other's files, but a particularly storage-hungry application can consume all available disk space, potentially causing issues for other jobs on the same host.

Applications requesting a T-shirt size of mx.xlarge or greater get the "full" instance assigned: no other applications are deployed on that instance, so they do not suffer from the "noisy neighbor" problem. Applications running on smaller instance sizes receive a slice of a physical machine and share the available disk space (about 50 GB of allocatable space).

To avoid this issue, you can provision application-specific storage by specifying the disk_size (and optionally disk_mount_path) when using the ContainerOperator.
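A minimal sketch of what that could look like is shown below. disk_size and disk_mount_path are the settings referred to above, while the operator class name and import path are assumptions about the Conveyor Airflow provider.

```python
# Minimal sketch: provisioning dedicated storage for a container job.
# The class name and import path are assumptions; disk_size (assumed to be in GB)
# and disk_mount_path are the ContainerOperator settings mentioned above.
from conveyor.operators import ConveyorContainerOperatorV2

heavy_io_job = ConveyorContainerOperatorV2(
    task_id="heavy_io_job",
    instance_type="mx.large",
    disk_size=100,                # dedicated volume instead of the shared host disk
    disk_mount_path="/var/data",  # where the volume is mounted inside the container
)
```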

Spark applications can use the equivalent executor_disk_size setting on the SparkSubmitOperator. This provisions additional storage for each executor, which Spark then uses automatically.
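The Spark equivalent, again as a sketch: only executor_disk_size comes from the text above, while the class name, import path, and other arguments are illustrative assumptions.

```python
# Minimal sketch: extra disk space per Spark executor.
# executor_disk_size is the setting mentioned above (assumed to be in GB);
# the class name, import path, and remaining arguments are assumptions.
from conveyor.operators import ConveyorSparkSubmitOperatorV2

aggregate = ConveyorSparkSubmitOperatorV2(
    task_id="aggregate",
    application="/opt/spark/work-dir/aggregate.py",
    executor_disk_size=200,  # provisioned for every executor
)
```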