-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[docs] Hello, Dagster becomes Quickstart (#19981)
## Summary & Motivation We would like to reduce the barrier to entry by replacing Hello, Dagster with a Quickstart project. This project leverages GitHub Codespaces, so that users can try Dagster without installing any dependencies on their local machine. This is a part of a larger initiative of simplifying the entrypoints into Dagster and our documentation. ### Checklist - [ ] Make the https://github.com/dagster-io/dagster-quickstart repo public ### Outstanding Questions - We have a few _quickstart_ templates that can already be scaffolded with the `dagster` command. Where do we want to draw the line on this quickstart vs those? ## How I Tested These Changes - Ran Next.js project locally --------- Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com> Co-authored-by: Erin Cochran <[email protected]>
- Loading branch information
1 parent
5f7a542
commit 25e8e82
Showing
18 changed files
with
263 additions
and
201 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,196 @@ | ||
--- | ||
title: Quickstart | Dagster Docs | ||
description: Run dagster for the first time | ||
--- | ||
|
||
# Quickstart | ||
|
||
<Note> | ||
Looking to scaffold a new project? Check out the{" "} | ||
<Link href="/getting-started/create-new-project">Creating a new project</Link>{" "} | ||
guide! | ||
</Note> | ||
|
||
Welcome to Dagster! This guide will help you quickly run the [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) project, showcasing Dagster's capabilities and serving as a foundation for exploring its features. | ||
|
||
The [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) project can be used without installing anything on your machine by using the pre-configured [GitHub Codespace](https://github.com/features/codespaces). If you prefer to run things on your own machine, however, we've got you covered. | ||
|
||
<TabGroup> | ||
<TabItem name="Option 1: Running locally"> | ||
|
||
### Option 1: Running Locally | ||
|
||
<DagsterVersion /> | ||
|
||
Ensure you have one of the supported Python versions installed before proceeding. | ||
|
||
Refer to Python's official <a href="https://www.python.org/about/gettingstarted/">getting started guide</a>, or our recommendation of using <a href="https://github.com/pyenv/pyenv?tab=readme-ov-file#installation">pyenv</a> for installing Python. | ||
|
||
1. Clone the Dagster Quickstart repository by executing: | ||
|
||
```bash | ||
git clone https://github.com/dagster-io/dagster-quickstart && cd dagster-quickstart | ||
``` | ||
|
||
2. Install the necessary dependencies using the following command: | ||
|
||
We use `-e` to install dependencies in ["editable mode"](https://pip.pypa.io/en/latest/topics/local-project-installs/#editable-installs). This allows changes to be automatically applied when we modify code. | ||
|
||
```bash | ||
pip install -e ".[dev]" | ||
``` | ||
|
||
3. Run the project! | ||
|
||
```bash | ||
dagster dev | ||
``` | ||
|
||
4. Navigate to <a href="localhost:3000">localhost:3000</a> in your web browser. | ||
|
||
5. **Success!** | ||
|
||
</TabItem> | ||
<TabItem name="Option 2: Using GitHub Codespaces"> | ||
|
||
### Option 2: Using GitHub Codespaces | ||
|
||
1. Fork the [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) repository | ||
|
||
2. Select **Create codespace on main** from the **Code** dropdown menu. | ||
|
||
<Image | ||
width={400} | ||
height={400} | ||
alt="Create codespace" | ||
src="/images/getting-started/quickstart/github-codespace-create.png" | ||
/> | ||
|
||
3. After the codespace loads, start Dagster by running `dagster dev` in the terminal: | ||
|
||
```bash | ||
dagster dev | ||
``` | ||
|
||
4. Click **Open in Browser** when prompted. | ||
|
||
<Image | ||
width={400} | ||
height={300} | ||
alt="Codespace Open In Browser" | ||
src="/images/getting-started/quickstart/github-codespace-open-in-browser.png" | ||
/> | ||
|
||
5. **Success!** | ||
|
||
</TabItem> | ||
</TabGroup> | ||
|
||
## Navigating the User Interface | ||
|
||
You should now have a running instance of Dagster! From here, we can run our data pipeline. | ||
|
||
To run the pipeline, click the **Materialize All** button in the top right. In Dagster, _materialization_ refers to executing the code associated with an asset to produce an output. | ||
|
||
<Image | ||
alt="HackerNews assets in Dagster's Asset Graph, unmaterialized" | ||
src="/images/getting-started/quickstart/quickstart-unmaterialized.png" | ||
width={2000} | ||
height={816} | ||
/> | ||
|
||
Congratulations! You have successfully materialized two Dagster assets: | ||
|
||
<Image | ||
alt="HackerNews asset graph" | ||
src="/images/getting-started/quickstart/quickstart.png" | ||
width={2000} | ||
height={1956} | ||
/> | ||
|
||
But wait - there's more. Because the `hackernews_top_stories` asset returned some `metadata`, you can view the metadata right in the UI: | ||
|
||
1. Click the asset | ||
2. In the sidebar, click the **Show Markdown** link in the **Materialization in Last Run** section. This opens a preview of the pipeline result, allowing you to view the top 10 HackerNews stories: | ||
|
||
<Image | ||
alt="Markdown preview of HackerNews top 10 stories" | ||
src="/images/getting-started/quickstart/hn-preview.png" | ||
width={2000} | ||
height={1754} | ||
/> | ||
|
||
## Understanding the Code | ||
|
||
The Quickstart project defines two **Assets** using the <PyObject object="asset" decorator /> decorator: | ||
|
||
- `hackernews_top_story_ids` retrieves the top stories from the Hacker News API and saves them as a JSON file. | ||
- `hackernews_top_stories` asset builds upon the first asset, retrieving data for each story as a CSV file, and returns a `MaterializeResult` with a markdown preview of the top stories. | ||
|
||
```python file=/getting-started/quickstart/assets.py | ||
import json | ||
|
||
import pandas as pd | ||
import requests | ||
|
||
from dagster import ( | ||
MaterializeResult, | ||
MetadataValue, | ||
asset, | ||
) | ||
|
||
from .configurations import HNStoriesConfig | ||
|
||
|
||
@asset | ||
def hackernews_top_story_ids(config: HNStoriesConfig): | ||
"""Get top stories from the HackerNews top stories endpoint.""" | ||
top_story_ids = requests.get( | ||
"https://hacker-news.firebaseio.com/v0/topstories.json" | ||
).json() | ||
|
||
with open(config.hn_top_story_ids_path, "w") as f: | ||
json.dump(top_story_ids[: config.top_stories_limit], f) | ||
|
||
|
||
@asset(deps=[hackernews_top_story_ids]) | ||
def hackernews_top_stories(config: HNStoriesConfig) -> MaterializeResult: | ||
"""Get items based on story ids from the HackerNews items endpoint.""" | ||
with open(config.hn_top_story_ids_path, "r") as f: | ||
hackernews_top_story_ids = json.load(f) | ||
|
||
results = [] | ||
for item_id in hackernews_top_story_ids: | ||
item = requests.get( | ||
f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json" | ||
).json() | ||
results.append(item) | ||
|
||
df = pd.DataFrame(results) | ||
df.to_csv(config.hn_top_stories_path) | ||
|
||
return MaterializeResult( | ||
metadata={ | ||
"num_records": len(df), | ||
"preview": MetadataValue.md(str(df[["title", "by", "url"]].to_markdown())), | ||
} | ||
) | ||
``` | ||
|
||
--- | ||
|
||
## Next steps | ||
|
||
Congratulations on successfully running your first Dagster pipeline! In this example, we used [assets](/tutorial), which are a cornerstone of Dagster projects. They empower data engineers to: | ||
|
||
- Think in the same terms as stakeholders | ||
- Answer questions about data quality and lineage | ||
- Work with the modern data stack (dbt, Airbyte/Fivetran, Spark) | ||
- Create declarative freshness policies instead of task-driven cron schedules | ||
|
||
Dagster also offers [ops and jobs](/guides/dagster/intro-to-ops-jobs), but we recommend starting with assets. | ||
|
||
To create your own project, consider the following options: | ||
|
||
- Scaffold a new project using our [new project guide](/getting-started/create-new-project). | ||
- Begin with an official example, like the [dbt + Dagster project](/integrations/dbt/using-dbt-with-dagster), and explore [all examples on GitHub](https://github.com/dagster-io/dagster/tree/master/examples). |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file removed
BIN
-106 KB
...xt/public/images/getting-started/hello-dagster/hello-dagster-unmaterialized.png
Binary file not shown.
Binary file removed
BIN
-401 KB
docs/next/public/images/getting-started/hello-dagster/hello-dagster.png
Binary file not shown.
Binary file removed
BIN
-520 KB
docs/next/public/images/getting-started/hello-dagster/hn-preview.png
Binary file not shown.
Binary file added
BIN
+98.7 KB
docs/next/public/images/getting-started/quickstart/github-codespace-create.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+36.1 KB
...t/public/images/getting-started/quickstart/github-codespace-open-in-browser.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+323 KB
docs/next/public/images/getting-started/quickstart/quickstart-unmaterialized.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Oops, something went wrong.