Step launchers allow users to control how specific steps within a run are launched. For example, the emr_pyspark_step_launcher runs the op inside an EMR cluster instead of in the process where the executor would normally execute it.
Writing a step launcher is not the only way to run code remotely. You can also have the body of an op invoke a remote execution; https://docs.dagster.io/_apidocs/libraries/dagster-databricks#dagster_databricks.create_databricks_job_op is an example of this approach. The advantage of using a step launcher is that it allows your op to express pure business logic, which makes it much easier to test.
Writing a step launcher isn't simple, because running code remotely isn't simple.
Writing a StepLauncher means implementing the Writing a step launcher involves:
The LocalExternalStepLauncher is a "simple" step launcher implementation that provides a starting point.
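To make the "pure business logic" point concrete, here is a rough, standard-library-only sketch of the idea. It is not Dagster's API: ToyStepRef, ToyStepLauncher, InProcessLauncher, SerializingLauncher, remote_entry_point, and the STEPS registry are all hypothetical names invented for illustration, and the "remote" side is only simulated by passing JSON strings inside the same process.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass
from typing import Any, Callable, Dict


def word_count(text: str) -> int:
    """Pure business logic: no knowledge of where it will run."""
    return len(text.split())


# Hypothetical registry so the simulated remote side can find a step by key.
STEPS: Dict[str, Callable[..., Any]] = {"word_count": word_count}


@dataclass
class ToyStepRef:
    """Hypothetical stand-in for what a launcher ships to the remote side."""
    step_key: str
    inputs: Dict[str, Any]


class ToyStepLauncher(ABC):
    """Hypothetical interface: decides where and how a step runs."""

    @abstractmethod
    def launch_step(self, ref: ToyStepRef) -> Any:
        ...


class InProcessLauncher(ToyStepLauncher):
    """Runs the step directly, which is convenient in unit tests."""

    def launch_step(self, ref: ToyStepRef) -> Any:
        return STEPS[ref.step_key](**ref.inputs)


def remote_entry_point(payload: str) -> str:
    """Simulated remote side: rebuild the reference, run the step, return JSON."""
    ref = ToyStepRef(**json.loads(payload))
    result = STEPS[ref.step_key](**ref.inputs)
    return json.dumps({"step_key": ref.step_key, "result": result})


class SerializingLauncher(ToyStepLauncher):
    """Crosses a simulated process boundary using JSON strings only."""

    def launch_step(self, ref: ToyStepRef) -> Any:
        payload = json.dumps(asdict(ref))        # serialize the step reference
        response = remote_entry_point(payload)   # stand-in for submitting a remote job
        return json.loads(response)["result"]    # read the result back


ref = ToyStepRef(step_key="word_count", inputs={"text": "run this step remotely"})
local_result = InProcessLauncher().launch_step(ref)
remote_result = SerializingLauncher().launch_step(ref)
print(local_result, remote_result)
```

The step function is identical in both cases; only the launcher changes. A real launcher would additionally have to ship code and dependencies to the remote environment, submit and monitor the job, and relay events and logs back, which is where most of the difficulty lies.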
Hello, I have started developing a StepLauncher for EMR Serverless based on your implementation for plain EMR. However, when I start a job on the EMR side, it fails while resolving the StepRunRef to a StepExecutionContext in dagster._core.execution.plan.external_step.step_run_ref_to_step_context (please see the attached stack trace). Further down the line, it tries to import my Definitions to look up the pipeline, assets, resources, etc., and a DagsterInvalidConfigError is thrown because some jobs lack their required run_config. These configs are normally read from a file, which is obviously missing from the EMR cluster, and the run_config provided to the Run is not propagated. When I manually supply the config file to the EMR job, it succeeds; however, this decouples it from the config I have during the job run, and it also does not allow me to change it from the UI. Is there a way to pass the run_config from the Run to the remote execution? Thank you very much.