Using AWS Roles

This document describes how to create an AWS role for use with Conveyor. You specify the role in the aws_role field of Airflow V2 operators, IDEs, streaming jobs, and so on. The aws_role field lets your job use an AWS role to communicate with services such as S3 and Glue.

The aws_role field accepts either the name of the AWS role you want to use or its full ARN. If you specify only the name, the role must be an IAM role in the account where the Conveyor cluster is running. For example:

from conveyor.operators import ConveyorContainerOperatorV2

ConveyorContainerOperatorV2(
    aws_role="role-name",
)

ConveyorContainerOperatorV2(
    aws_role="arn:aws:iam::1234567890:role/role-name",
)

Both examples result in the same role when your AWS account ID is 1234567890. You can also use the full ARN to run with a role in another AWS account. Find out more on how to configure this for a recent SDK in the following section.
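The equivalence between the two forms can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Conveyor's implementation; the function name resolve_role_arn and the cluster_account_id parameter are hypothetical, and the account ID is the placeholder used in the example.

```python
def resolve_role_arn(aws_role: str, cluster_account_id: str) -> str:
    """Return a full IAM role ARN for either a bare role name or an ARN.

    Hypothetical helper: a bare name is assumed to live in the account
    where the Conveyor cluster runs; a full ARN is returned unchanged.
    """
    if aws_role.startswith("arn:"):
        return aws_role
    return f"arn:aws:iam::{cluster_account_id}:role/{aws_role}"


by_name = resolve_role_arn("role-name", "1234567890")
by_arn = resolve_role_arn("arn:aws:iam::1234567890:role/role-name", "1234567890")
print(by_name == by_arn)
```

A full ARN that names a different account ID passes through untouched, which is what makes the cross-account case possible.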

Supported AWS SDKs

The following AWS SDKs and newer can be used with Conveyor. The container images provided by Conveyor already come with an appropriate SDK version. If you are using one of these images, you don't need to install this dependency yourself.

Configuration

For example, if you have a project in Conveyor called sample, you can create the AWS role with Terraform as follows:

locals {
  project_name = "sample"
  uuid_pattern = "????????-????-????-????-????????????"
}

resource "aws_iam_role" "default" {
  name               = "${local.project_name}-${var.env_name}"
  assume_role_policy = data.aws_iam_policy_document.default.json
}

data "aws_iam_policy_document" "default" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    condition {
      test     = "StringLike"
      variable = "${replace(var.aws_iam_openid_connect_provider_url, "https://", "")}:sub"
      values = [
        "system:serviceaccount:${var.env_name}:${replace(local.project_name, "_", ".")}-${local.uuid_pattern}"
      ]
    }

    principals {
      identifiers = [var.aws_iam_openid_connect_provider_arn]
      type        = "Federated"
    }
  }
}

If you are interested in the technical details of the authentication method, refer to Kubernetes IRSA (IAM Roles for Service Accounts). We automatically create a service account linked to the role specified by your application. The service account is generated with the following naming pattern: project-name-UUID, where UUID is a randomly generated UUID.

Notice the following pattern in the condition value:

"system:serviceaccount:ENVIRONMENT_NAME:PROJECT_NAME-UUID_Pattern"

If the project name contains underscores (_), they are replaced by dots (.) because Kubernetes does not allow underscores in service account names. With this pattern you control which environment can use the role. Alternatively, you can use a wildcard * instead of the environment name to allow all environments to assume the role. Including the project name ensures that no other project can use the role. In the example, we match the exact shape of the UUID with the following:

"????????-????-????-????-????????????"

This matches the exact character pattern (8-4-4-4-12) that is used for a UUID.

For more information on what you can do with the StringLike condition, you can consult the AWS IAM documentation.
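To see how the condition value behaves, the following Python sketch approximates StringLike matching, where * matches any run of characters and ? matches exactly one. It is an illustration only, not how AWS evaluates policies; the helper names string_like and subject_pattern, the environment name dev, and the project name my_project are assumptions made for the example.

```python
import re
import uuid

UUID_PATTERN = "????????-????-????-????-????????????"


def string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' is any run of characters, '?' exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def subject_pattern(environment: str, project: str) -> str:
    """Build the condition value; underscores in the project name become dots."""
    service_account = project.replace("_", ".")
    return f"system:serviceaccount:{environment}:{service_account}-{UUID_PATTERN}"


# Hypothetical environment and project names, with a random UUID suffix.
pattern = subject_pattern("dev", "my_project")
subject = f"system:serviceaccount:dev:my.project-{uuid.uuid4()}"

print(string_like(pattern, subject))
print(string_like(pattern, subject.replace(":dev:", ":prod:")))
print(string_like(subject_pattern("*", "my_project"), subject.replace(":dev:", ":prod:")))
```

The first check passes because the environment, project name, and UUID shape all line up. The second fails because the environment differs, and the third passes again because the wildcard accepts any environment.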

Using an IAM role in another account

If you want your roles to work cross-account, you need to create an IAM OIDC provider in the other account, that is, the account where the Conveyor cluster is not running.

This allows containers running in the Conveyor cluster to assume roles in another account. You do not have to set up the IAM OIDC provider in the AWS account where the Conveyor cluster is running, as we have already done that. For the full details on how to access resources cross-account, see the how-to article on the topic.

The AWS documentation also has more information on the topic.