conveyor project spark-history

Create a local Spark history server

Synopsis

Start a local Spark history server to analyze a Spark event log. You can copy the full command from the Conveyor UI by clicking the Spark icon of a completed Spark application.

conveyor project spark-history [flags]

Examples

To run the command, provide the IDs of the project, the environment, and the application:

$ conveyor project spark-history --id PROJECT_ID --env-id ENVIRONMENT_ID --application-id APPLICATION_ID

When analysing large Spark history files (> 800 MiB), you might need to raise the memory limit of the created Spark history server above its default of 1024 MiB.
Set the limit with the --memory flag, which takes a value in MiB:

$ conveyor project spark-history --id PROJECT_ID --env-id ENVIRONMENT_ID --application-id APPLICATION_ID --memory 2048
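
The rule of thumb above could be scripted roughly as follows. This is only a sketch: the helper names suggest_memory_mib and build_history_command are hypothetical, the 800 MiB threshold and 1024 MiB default are taken from the text, and 2048 is simply the value from the example rather than a documented recommendation. The script prints the command instead of running it, so it can be reviewed first.

```shell
# Hypothetical helper: suggest a --memory value (MiB) for an event log size (MiB).
# Threshold (800) and default (1024) come from the text; 2048 mirrors the example
# and is an assumption, not a documented sizing rule.
suggest_memory_mib() {
  if [ "$1" -gt 800 ]; then
    echo 2048
  else
    echo 1024
  fi
}

# Hypothetical helper: print (do not run) the spark-history command.
# Arguments: project ID, environment ID, application ID, event log size in MiB.
build_history_command() {
  echo "conveyor project spark-history --id $1 --env-id $2 --application-id $3 --memory $(suggest_memory_mib "$4")"
}

# Example with placeholder IDs and a 900 MiB event log.
build_history_command PROJECT_ID ENVIRONMENT_ID APPLICATION_ID 900
```

If the history server still runs out of memory at that value, the limit can be raised further by hand.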

If you mirror the provided Spark base images to your own repository and want to run the Spark history server from a mirrored image,
pass that image with the --spark-image flag:

$ conveyor project spark-history --id PROJECT_ID --env-id ENVIRONMENT_ID --application-id APPLICATION_ID --spark-image MIRRORED_SPARK_IMAGE

Options

      --application-id string   The ID of the Spark application
      --env-id string           The ID of the environment in which the Spark application ran
  -h, --help                    help for spark-history
      --id string               The ID of the project to which the Spark application belongs
      --memory int              The memory limit in MiB for the Spark History Server (default 1024)
      --spark-image string      The container image to use for running the Spark History Server (default "public.ecr.aws/dataminded/spark-k8s-glue:v3.5.1-hadoop-3.3.6-v2")

Options inherited from parent commands

      --debug                        Show debug output
      --no-browser NO_BROWSER=true   Do not automatically open a browser at login instead print the url to the command line. You can also use the environment variable NO_BROWSER=true.
  -o, --output string                Change the output. Valid options are table or json (default "table")
      --quiet QUIET=true             Quiet down the output. You can also use the environment variable QUIET=true.

See also