Help needed to find details of asset job failure #17828
Unanswered
-
You should be able to see the same logs in the CLI as you would if you ran the asset from the UI. Some potential things you can do:
-
I added the following:

import ssl  # required by the SSL context override below

import pandas as pd
# import dask.dataframe as dd
from dagster import SourceAsset, asset, get_dagster_logger

# Disable HTTPS certificate verification for urllib-based downloads.
ssl._create_default_https_context = ssl._create_unverified_context

iris_harvest_data = SourceAsset(key="iris_harvest_data")


@asset()
def ext_iris_dataset(context) -> pd.DataFrame:
    """This is the iris dataset."""
    context.log.info("Reading iris dataset")
    my_logger = get_dagster_logger()
    try:
        return pd.read_csv(
            "https://docs.dagster.io/assets/iris.csv",
            names=[
                "sepal_length_cm",
                "sepal_width_cm",
                "petal_length_cm",
                "petal_width_cm",
                "species",
            ],
        )
    except Exception as e:
        # Log the error message, then re-raise so the step is marked as failed.
        my_logger.error(f"Failed to read iris dataset. Error: {e}")
        raise
    finally:
        # Runs on both success and failure.
        my_logger.info("Finished reading iris dataset")

Interestingly, the log in the CLI is as expected:

dagster asset materialize --select ext_iris_dataset --package-name spc_flow
2023-11-08 17:11:09 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10933 - RUN_START - Started execution of run for "__ASSET_JOB".
2023-11-08 17:11:09 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10933 - ENGINE_EVENT - Executing steps using multiprocess executor: parent process (pid: 10933)
2023-11-08 17:11:09 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10933 - ext_iris_dataset - STEP_WORKER_STARTING - Launching subprocess for "ext_iris_dataset".
2023-11-08 17:11:11 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - STEP_WORKER_STARTED - Executing step "ext_iris_dataset" in subprocess.
2023-11-08 17:11:11 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - ext_iris_dataset - RESOURCE_INIT_STARTED - Starting initialization of resources [io_manager].
2023-11-08 17:11:11 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - ext_iris_dataset - RESOURCE_INIT_SUCCESS - Finished initialization of resources [io_manager].
2023-11-08 17:11:11 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - LOGS_CAPTURED - Started capturing logs in process (pid: 10963).
2023-11-08 17:11:11 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - ext_iris_dataset - STEP_START - Started execution of step "ext_iris_dataset".
2023-11-08 17:11:11 -0500 - dagster - INFO - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - ext_iris_dataset - Reading iris dataset
2023-11-08 17:11:11 -0500 - dagster - INFO - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - ext_iris_dataset - Finished reading iris dataset
2023-11-08 17:11:11 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - ext_iris_dataset - STEP_OUTPUT - Yielded output "result" of type "DataFrame". (Type check passed).
2023-11-08 17:11:17 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - ext_iris_dataset - ASSET_MATERIALIZATION - Materialized value ext_iris_dataset.
2023-11-08 17:11:17 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - ext_iris_dataset - HANDLED_OUTPUT - Handled output "result" using IO manager "io_manager"
2023-11-08 17:11:17 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10963 - ext_iris_dataset - STEP_SUCCESS - Finished execution of step "ext_iris_dataset" in 6.0s.
2023-11-08 17:11:17 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10933 - ENGINE_EVENT - Multiprocess executor: parent process exiting after 8.01s (pid: 10933)
2023-11-08 17:11:17 -0500 - dagster - DEBUG - __ASSET_JOB - 2b413846-07b7-4d7f-b208-1c398f87b668 - 10933 - RUN_SUCCESS - Finished execution of run for "__ASSET_JOB".

But in production the log doesn't seem to get to the asset step:

time="2023-11-08T22:10:05.155Z" level=info msg="capturing logs" argo=true
2023-11-08 22:10:09 +0000 - dagster - DEBUG - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 27 - RUN_START - Started execution of run for "__ASSET_JOB".
2023-11-08 22:10:09 +0000 - dagster - DEBUG - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 27 - ENGINE_EVENT - Executing steps using multiprocess executor: parent process (pid: 27)
2023-11-08 22:10:09 +0000 - dagster - DEBUG - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 27 - ext_iris_dataset - STEP_WORKER_STARTING - Launching subprocess for "ext_iris_dataset".
2023-11-08 22:10:12 +0000 - dagster - DEBUG - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 61 - STEP_WORKER_STARTED - Executing step "ext_iris_dataset" in subprocess.
2023-11-08 22:10:12 +0000 - dagster - DEBUG - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 61 - ext_iris_dataset - RESOURCE_INIT_STARTED - Starting initialization of resources [io_manager].
2023-11-08 22:10:12 +0000 - dagster - DEBUG - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 61 - ext_iris_dataset - RESOURCE_INIT_SUCCESS - Finished initialization of resources [io_manager].
2023-11-08 22:10:13 +0000 - dagster - DEBUG - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 27 - ENGINE_EVENT - Multiprocess executor: parent process exiting after 3.91s (pid: 27)
2023-11-08 22:10:13 +0000 - dagster - ERROR - __ASSET_JOB - 2169b6f0-01d6-4a0b-b575-e776b9b5bff5 - 27 - RUN_FAILURE - Execution of run for "__ASSET_JOB" failed. Steps failed: ['ext_iris_dataset'].

It's executing in prod as a command line like so, basically a cron job:

apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: ext-run
spec:
  # run daily at 1 AM EST
  #schedule: "0 1 * * *"
  # run every 10 minutes (test)
  schedule: "*/10 * * * *"
  timezone: "America/New_York"
  concurrencyPolicy: "Replace"
  workflowSpec:
    entrypoint: high-sys-container
    templates:
      - name: high-sys-container
        steps:
          - - name: step1
              template: high-sys-containerca
      - name: high-sys-containerca
        metadata:
          labels:
            app: container
            name: HSC
        retryStrategy:
          limit: 3
        container: &container
          resources:
            requests:
              memory: 4Gi
              cpu: 2
          imagePullPolicy: Always
          image: secret.amazonaws.com/image
          command:
            [
              "dagster",
              "asset",
              "materialize",
              "--select",
              "ext_iris_dataset",
              "--package-name",
              "spc_flow",
            ]
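
The production log above stops right after resource initialization and never reaches STEP_START. One possible reading (a guess, the log does not confirm it) is that the step subprocess died before Python could report an exception. If so, the standard-library `faulthandler` module may help: it dumps Python tracebacks to stderr on fatal signals such as SIGSEGV. It cannot report a SIGKILL, which is what an out-of-memory kill usually sends. Below is a minimal sketch, assuming it sits at the top of the module that defines the asset. The helper name `enable_crash_dumps` is made up for illustration and is not part of Dagster.

```python
import faulthandler
import sys


def enable_crash_dumps(stream=None) -> bool:
    """Try to enable faulthandler on `stream` (default: sys.stderr).

    Hypothetical helper, not a Dagster API. Returns True if this call
    enabled faulthandler, False if the stream has no usable file descriptor.
    """
    target = stream if stream is not None else sys.stderr
    try:
        # all_threads=True dumps the traceback of every thread on a fatal signal.
        faulthandler.enable(file=target, all_threads=True)
    except (AttributeError, ValueError, OSError, RuntimeError):
        # The stream is not backed by a real file descriptor
        # (for example, captured or redirected output).
        return False
    return faulthandler.is_enabled()


# Call once at import time so every process that imports the module gets it.
crash_dumps_enabled = enable_crash_dumps()
```

Setting the environment variable `PYTHONFAULTHANDLER=1` on the container should have a similar effect without a code change, since CPython enables `faulthandler` at startup when it is set.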
-
Hello again. I'm trying to debug an error in production, and all I see is
RUN_FAILURE - Execution of run for "__ASSET_JOB" failed. Steps failed: ['ext_iris_dataset'].
How can I log more info in prod to see more details on the nature of the error? I'm running via the command line, so no UI access.
The question was originally asked in Dagster Slack.
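
One way to get more detail out of a failing step when only the command-line output is available is to log the full traceback in the `except` block, rather than just the exception message. Below is a minimal sketch using only the standard library. The helper `log_exception_details` and the stand-in `failing_read` are hypothetical names for illustration. In an asset you would presumably pass the logger returned by `get_dagster_logger()` instead of the demo logger used here.

```python
import io
import logging
import traceback


def log_exception_details(logger: logging.Logger, message: str) -> None:
    """Log `message` plus the full traceback of the exception being handled.

    Hypothetical helper; must be called from inside an `except` block.
    """
    logger.error("%s\n%s", message, traceback.format_exc())


def failing_read() -> None:
    """Stand-in for a step body that fails (hypothetical)."""
    raise ConnectionError("simulated network failure")


# Demo logger writing to an in-memory buffer so the output can be inspected.
buffer = io.StringIO()
demo_logger = logging.getLogger("traceback_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(logging.StreamHandler(buffer))

try:
    failing_read()
except Exception:
    log_exception_details(demo_logger, "Failed to read iris dataset.")

captured = buffer.getvalue()
```

The logged text then carries the exception type and the call stack, not only the message. This only helps if the failure is a Python exception raised inside the step. It will not show anything if the process is killed from outside.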