Instances

Conveyor supports the following instance types for all jobs:

| Instance type | CPU | Total Memory (AWS) | Total Memory (Azure) |
| --- | --- | --- | --- |
| mx.nano | 1* | 0.438 Gb | 0.375 Gb |
| mx.micro | 1* | 0.875 Gb | 0.75 Gb |
| mx.small | 1* | 1.75 Gb | 1.5 Gb |
| mx.medium | 1 | 3.5 Gb | 3 Gb |
| mx.large | 2 | 7 Gb | 6 Gb |
| mx.xlarge | 4 | 14 Gb | 12 Gb |
| mx.2xlarge | 8 | 29 Gb | 26 Gb |
| mx.4xlarge | 16 | 59 Gb | 55 Gb |
| rx.xlarge | 4 | 28 Gb | Not supported |
| rx.2xlarge | 8 | 59 Gb | Not supported |
| rx.4xlarge | 16 | 120 Gb | Not supported |
info

(*) These instance types don't get a guaranteed full CPU, only a slice of one, but they are allowed to burst up to a full CPU if the cluster has spare capacity.

The numbers for AWS and Azure differ because nodes on the two clouds run different daemonsets and have different reservation requirements set by the provider. We aim to minimize node overhead as much as possible while still meeting the minimum requirements of each cloud provider.
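To pin a job to one of these instance types, you typically set it on the job definition. The snippet below is a minimal, hypothetical sketch using Conveyor's Airflow integration; the operator name, import path, and the `instance_type` parameter are assumptions based on that integration, so check the conveyor package installed in your project for the exact API.

```python
# A minimal, hypothetical sketch: the operator name and the instance_type
# parameter are assumptions based on Conveyor's Airflow integration; check
# the conveyor package installed in your project for the exact API.
from datetime import datetime

from airflow import DAG
from conveyor.operators import ConveyorContainerOperatorV2  # assumed import path

with DAG(dag_id="instance_type_example", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    ConveyorContainerOperatorV2(
        task_id="my_job",
        instance_type="mx.medium",  # 1 CPU, 3.5 Gb on AWS / 3 Gb on Azure (see table above)
    )
```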

Spark resources

When running Spark/PySpark applications, only part of the container's total memory is available to Spark itself. The details are described in the following tables:

AWS

| Instance type | CPU | Total Memory (AWS) | Spark memory (AWS) | PySpark memory (AWS) |
| --- | --- | --- | --- | --- |
| mx.micro | 1* | 0.875 Gb | 0.8 Gb | 0.6 Gb |
| mx.small | 1* | 1.75 Gb | 1.6 Gb | 1.25 Gb |
| mx.medium | 1 | 3.5 Gb | 3.2 Gb | 2.5 Gb |
| mx.large | 2 | 7 Gb | 6.4 Gb | 5 Gb |
| mx.xlarge | 4 | 14 Gb | 12.7 Gb | 10 Gb |
| mx.2xlarge | 8 | 29 Gb | 26.7 Gb | 21 Gb |
| mx.4xlarge | 16 | 59 Gb | 54 Gb | 42.4 Gb |
| rx.xlarge | 4 | 28 Gb | 26 Gb | 21 Gb |
| rx.2xlarge | 8 | 59 Gb | 54 Gb | 43 Gb |
| rx.4xlarge | 16 | 120 Gb | 112 Gb | 88 Gb |
info

(*) These instance types don't get a guaranteed full CPU, only a slice of one, but they are allowed to burst up to a full CPU if the cluster has spare capacity.

Azure

| Instance type | CPU | Total Memory (Azure) | Spark memory (Azure) | PySpark memory (Azure) |
| --- | --- | --- | --- | --- |
| mx.micro | 1* | 0.75 Gb | 0.69 Gb | 0.55 Gb |
| mx.small | 1* | 1.5 Gb | 1.38 Gb | 1.1 Gb |
| mx.medium | 1 | 3 Gb | 2.75 Gb | 2.15 Gb |
| mx.large | 2 | 6 Gb | 5.5 Gb | 4.3 Gb |
| mx.xlarge | 4 | 12 Gb | 11 Gb | 8.6 Gb |
| mx.2xlarge | 8 | 26 Gb | 23.6 Gb | 18.6 Gb |
| mx.4xlarge | 16 | 55 Gb | 50 Gb | 35.7 Gb |
info

(*) These instance types don't get a guaranteed full CPU, only a slice of one, but they are allowed to burst up to a full CPU if the cluster has spare capacity.

As the tables show, the supported executor memory depends on whether you run regular (Scala) Spark or PySpark. The explanation lies in the spark.kubernetes.memoryOverheadFactor setting, which can be found in the settings here. It is set to 0.1 for JVM jobs (Scala and Java Spark) and to 0.4 for non-JVM jobs (PySpark, SparkR). A portion of the memory is set aside for non-JVM needs such as off-heap memory allocations, system processes, Python, and R. Without this overhead, your job would commonly fail with the error "Memory Overhead Exceeded".
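The overhead factor also lets you reproduce the numbers in the tables: the container's total memory is roughly the Spark memory plus the overhead factor times that amount, so the memory available to Spark is approximately total / (1 + factor). The snippet below is a small illustrative sketch of this arithmetic (an approximation, not Conveyor's exact allocation logic), using the mx.xlarge AWS figures from the table above.

```python
# A sketch of how the memory overhead factor explains the tables above:
# total memory is split into Spark executor memory plus a non-JVM overhead
# of `overhead_factor` times that executor memory, so
# spark_memory ~= total / (1 + overhead_factor). This approximates the
# published numbers; it is not Conveyor's exact allocation logic.

def spark_memory(total_gb: float, overhead_factor: float) -> float:
    """Approximate memory left for Spark after reserving the overhead."""
    return total_gb / (1 + overhead_factor)

# mx.xlarge on AWS: 14 Gb total memory
print(round(spark_memory(14, 0.1), 1))  # JVM (Scala/Java) job -> 12.7 Gb
print(round(spark_memory(14, 0.4), 1))  # non-JVM (PySpark) job -> 10.0 Gb
```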