How to use dataframes from asset to asset? #23006
Answered by garethbrickman · JamarWhitfield asked this question in Q&A
If I create an asset that creates a data frame and then performs some processing on it, how would I go about using the processed data frame in another asset?
Answered by garethbrickman on Jul 16, 2024
You can return the dataframe from the upstream asset and use it as an input to a dependent downstream asset via the `ins` argument of `@asset` with `AssetIn`:

```python
import pandas as pd
from dagster import asset, Definitions, AssetIn, AssetKey


@asset
def create_dataframe(context):
    # Upstream asset: build the raw DataFrame and return it
    data = {
        'A': [1, 2, 3, 4],
        'B': [5, 6, 7, 8]
    }
    df = pd.DataFrame(data)
    context.log.info(f"Raw dataframe: {df}")
    return df


# Map the `df` parameter to the output of the `create_dataframe` asset
@asset(ins={"df": AssetIn(key=AssetKey("create_dataframe"))})
def process_dataframe(context, df: pd.DataFrame):
    # Perform some operations on the DataFrame
    df['C'] = df['A'] + df['B']
    context.log.info(f"Processed dataframe: {df}")
    return df


defs = Definitions(
    assets=[create_dataframe, process_dataframe]
)
```
In the answer's example, the `create_dataframe` asset returns the DataFrame, and the `process_dataframe` asset receives it as its `df` input.