Guide for connecting to APIs #23920
Changes from 10 commits
9fe460f
f547c65
f2f7388
ff8d5e3
715f17e
e6ffb1e
bddb4eb
1cd4ebd
091805f
9593f31
1e94b9d
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,78 @@ | ||
--- | ||
title: Connecting to APIs | ||
sidebar_position: 20 | ||
--- | ||
--- | ||
|
||
When building a data pipeline, you'll likely need to connect to several external APIs, each with its own specific configuration and behavior. This guide demonstrates how to standardize your API connections and customize their configuration using Dagster resources. | ||
|
||
|
||
## What you'll learn | ||
|
||
- How to connect to an API using a Dagster resource | ||
- How to use that resource in an asset | ||
- How to configure a resource | ||
- How to source configuration values from environment variables | ||
|
||
<details> | ||
<summary>Prerequisites</summary> | ||
|
||
To follow the steps in this guide, you'll need: | ||
|
||
- Familiarity with [asset definitions](/concepts/assets) | ||
- Familiarity with [resources](/concepts/resources) | ||
- The `requests` library, which you can install with pip: | ||
```bash | ||
pip install requests | ||
``` | ||
|
||
</details> | ||
|
||
## Step 1: Write a resource to connect to an API | ||
|
||
This example fetches the sunrise time for a given location from a REST API. Begin by defining a Dagster resource with a method to return the sunrise time for a location. In the first version of this resource, the location will be hardcoded to San Francisco International Airport. | ||
|
||
|
||
|
||
<CodeExample filePath="guides/external-systems/apis/minimal_resource.py" language="python" title="Resource to connect to the Sunrise API" /> | ||
|
||
|
||
## Step 2: Use the resource in an asset | ||
|
||
To use the resource from Step 1, add it as a parameter to an asset and register it under the same name in `Definitions`: | ||
|
||
|
||
<CodeExample filePath="guides/external-systems/apis/use_minimal_resource_in_asset.py" language="python" title="Use the SunResource in an asset" /> | ||
|
||
When you materialize `sfo_sunrise`, Dagster will provide an initialized `SunResource` to the `sun_resource` parameter. | ||
|
||
|
||
## Step 3: Configure your resource | ||
Many APIs expose settings that let you customize how you use them. The following version of the resource from Step 1 adds configuration fields so that the query location can be set when the resource is initialized: | ||
|
||
<CodeExample filePath="guides/external-systems/apis/use_configurable_resource_in_asset.py" language="python" title="Use the configurable SunResource in an asset" /> | ||
|
||
The configurable resource can be provided to an asset exactly as before. When the resource is initialized, you can pass values for each of the configuration options. | ||
|
||
When you materialize `sfo_sunrise`, Dagster will provide a `SunResource` initialized with the configuration values to the `sun_resource` parameter. | ||
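Because `latitude`, `longitude`, and `time_zone` now come from configuration rather than hardcoded literals, it may be worth percent-encoding them before placing them in the URL. The following is a minimal sketch using only the Python standard library. The helper name `build_sunrise_url` is hypothetical and not part of this guide's example files; the endpoint and parameter names are copied from the example code.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.sunrise-sunset.org/json"


def build_sunrise_url(latitude: str, longitude: str, time_zone: str) -> str:
    """Build the request URL, percent-encoding each configured value."""
    params = {
        "lat": latitude,
        "lng": longitude,
        "date": "today",
        "tzid": time_zone,
    }
    # urlencode escapes characters such as "/" and spaces in the values.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_sunrise_url("37.615223", "-122.389977", "America/Los_Angeles")
print(url)
```

A resource's `query_string` property could return the result of a helper like this instead of an f-string, assuming the API accepts standard percent-encoded query values.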
|
||
|
||
## Step 4: Source configuration values from environment variables | ||
Resources can also be configured with environment variables. You can use Dagster's built-in `EnvVar` class to source configuration values from environment variables at materialization time. | ||
|
||
|
||
In this example, there is a new `home_sunrise` asset. Rather than hardcoding the location of your home, you can set it in environment variables, and configure the `SunResource` by reading those values: | ||
|
||
|
||
<CodeExample filePath="guides/external-systems/apis/env_var_configuration.py" language="python" title="Configure the resource with values from environment variables" /> | ||
|
||
When you materialize `home_sunrise`, Dagster will read the values set for the `HOME_LATITUDE`, `HOME_LONGITUDE`, and `HOME_TIMEZONE` environment variables and initialize a `SunResource` with those values. | ||
|
||
The initialized `SunResource` will be provided to the `sun_resource` parameter. | ||
|
||
:::note | ||
You can also fetch environment variables using the `os` library. Dagster treats each approach to fetching environment variables differently, such as when they are fetched or how they display in the UI. Refer to the [Environment variables guide](/todo) for more information. | ||
|
||
::: | ||
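To illustrate why the timing of an environment variable read matters, here is a plain-Python sketch that doesn't use Dagster. It contrasts reading a value once, when the code is loaded, with deferring the read until the value is needed. The `LazyEnv` class is a hypothetical stand-in and is not how `EnvVar` is implemented; it only demonstrates the general distinction.

```python
import os

os.environ["HOME_TIMEZONE"] = "America/Los_Angeles"

# Eager: the value is captured once, at the moment this line runs.
eager_time_zone = os.getenv("HOME_TIMEZONE")


class LazyEnv:
    """Hypothetical helper: stores the variable name and reads it on demand."""

    def __init__(self, name: str) -> None:
        self.name = name

    def get_value(self) -> str:
        value = os.getenv(self.name)
        if value is None:
            raise KeyError(f"Environment variable {self.name} is not set")
        return value


lazy_time_zone = LazyEnv("HOME_TIMEZONE")

# Simulate the environment changing after the code was loaded.
os.environ["HOME_TIMEZONE"] = "Europe/Berlin"

print(eager_time_zone)             # the value captured at load time
print(lazy_time_zone.get_value())  # the value in the current environment
```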
|
||
|
||
## Next steps | ||
|
||
|
||
- [Authenticate to a resource](/guides/external-systems/authentication.md) | ||
- [Use different resources in different execution environments](/todo) | ||
- [Set environment variables in Dagster+](/todo) | ||
- Learn what [Dagster-provided resources](/todo) are available to use |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
import requests | ||
|
||
import dagster as dg | ||
|
||
|
||
class SunResource(dg.ConfigurableResource): | ||
    latitude: str | ||
    longitude: str | ||
    time_zone: str | ||
|
||
    @property | ||
    def query_string(self) -> str: | ||
        return f"https://api.sunrise-sunset.org/json?lat={self.latitude}&lng={self.longitude}&date=today&tzid={self.time_zone}" | ||
|
||
    def sunrise(self) -> str: | ||
        data = requests.get(self.query_string, timeout=5).json() | ||
        return data["results"]["sunrise"] | ||
|
||
|
||
# highlight-start | ||
@dg.asset | ||
def home_sunrise(context: dg.AssetExecutionContext, sun_resource: SunResource) -> None: | ||
    sunrise = sun_resource.sunrise() | ||
    context.log.info(f"Sunrise at home is at {sunrise}.") | ||
|
||
|
||
defs = dg.Definitions( | ||
    assets=[home_sunrise], | ||
    resources={ | ||
        "sun_resource": SunResource( | ||
            latitude=dg.EnvVar("HOME_LATITUDE"), | ||
            longitude=dg.EnvVar("HOME_LONGITUDE"), | ||
            time_zone=dg.EnvVar("HOME_TIMEZONE"), | ||
        ) | ||
    }, | ||
) | ||
|
||
# highlight-end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
import requests | ||
|
||
import dagster as dg | ||
|
||
|
||
class SunResource(dg.ConfigurableResource): | ||
    @property | ||
    def query_string(self) -> str: | ||
        latitude = "37.615223" | ||
        longitude = "-122.389977" | ||
        time_zone = "America/Los_Angeles" | ||
        return f"https://api.sunrise-sunset.org/json?lat={latitude}&lng={longitude}&date=today&tzid={time_zone}" | ||
|
||
    def sunrise(self) -> str: | ||
        data = requests.get(self.query_string, timeout=5).json() | ||
        return data["results"]["sunrise"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
import requests | ||
|
||
import dagster as dg | ||
|
||
|
||
class SunResource(dg.ConfigurableResource): | ||
    # highlight-start | ||
    latitude: str | ||
    longitude: str | ||
    time_zone: str | ||
|
||
    @property | ||
    def query_string(self) -> str: | ||
        return f"https://api.sunrise-sunset.org/json?lat={self.latitude}&lng={self.longitude}&date=today&tzid={self.time_zone}" | ||
|
||
    # highlight-end | ||
|
||
    def sunrise(self) -> str: | ||
        data = requests.get(self.query_string, timeout=5).json() | ||
        return data["results"]["sunrise"] | ||
|
||
|
||
@dg.asset | ||
def sfo_sunrise(context: dg.AssetExecutionContext, sun_resource: SunResource) -> None: | ||
    sunrise = sun_resource.sunrise() | ||
    context.log.info(f"Sunrise in San Francisco is at {sunrise}.") | ||
|
||
|
||
# highlight-start | ||
defs = dg.Definitions( | ||
    assets=[sfo_sunrise], | ||
    resources={ | ||
        "sun_resource": SunResource( | ||
            latitude="37.615223", | ||
            longitude="-122.389977", | ||
            time_zone="America/Los_Angeles", | ||
        ) | ||
    }, | ||
) | ||
|
||
# highlight-end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
import requests | ||
|
||
import dagster as dg | ||
|
||
|
||
class SunResource(dg.ConfigurableResource): | ||
    @property | ||
    def query_string(self) -> str: | ||
        latitude = "37.615223" | ||
        longitude = "-122.389977" | ||
        time_zone = "America/Los_Angeles" | ||
        return f"https://api.sunrise-sunset.org/json?lat={latitude}&lng={longitude}&date=today&tzid={time_zone}" | ||
|
||
    def sunrise(self) -> str: | ||
        data = requests.get(self.query_string, timeout=5).json() | ||
        return data["results"]["sunrise"] | ||
|
||
|
||
# highlight-start | ||
@dg.asset | ||
def sfo_sunrise(context: dg.AssetExecutionContext, sun_resource: SunResource) -> None: | ||
    sunrise = sun_resource.sunrise() | ||
    context.log.info(f"Sunrise in San Francisco is at {sunrise}.") | ||
|
||
|
||
defs = dg.Definitions(assets=[sfo_sunrise], resources={"sun_resource": SunResource()}) | ||
|
||
# highlight-end |
@jamiedemaria This is great.
@erinkcochran87 @PedramNavid curious what you think in terms of in scope for a "how-to" guide. I just want it to be clear the resources are opt-in for this use case.
I think we should include (abridged) language about when it is good to use resources. For example:
Accessing an API through a Dagster resource is useful if you want to:
If you don't want any of these features, you should just invoke the external service directly.
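For comparison, invoking the service directly might look like the sketch below. It is illustrative only: the function names are hypothetical, it uses the standard library's `urllib` rather than `requests` so that it has no third-party dependencies, and the `fetch` parameter exists only so the function can be exercised without network access.

```python
import json
from typing import Callable
from urllib.request import urlopen

# Same endpoint and hardcoded SFO location as the minimal resource in the PR.
SFO_URL = (
    "https://api.sunrise-sunset.org/json"
    "?lat=37.615223&lng=-122.389977&date=today&tzid=America/Los_Angeles"
)


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (performs a real network call)."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def get_sfo_sunrise(fetch: Callable[[str], dict] = http_get_json) -> str:
    """Return the sunrise time by calling the API without a Dagster resource."""
    data = fetch(SFO_URL)
    return data["results"]["sunrise"]


# Offline demonstration with a canned payload containing only the keys read above.
fake_response = {"results": {"sunrise": "6:45:12 AM"}}
print(get_sfo_sunrise(fetch=lambda url: fake_response))
```

Inside an asset, the body would call `get_sfo_sunrise()` directly, and there would be no `resources` entry to register.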