[docs] Hello, Dagster becomes Quickstart #19981

Merged
merged 47 commits into from
Feb 28, 2024
47 commits
ce5ace6
[docs] Hello, Dagster becomes Quickstart
cmpadden Feb 22, 2024
1e386c6
update references to /quickstart/hello-dagster
cmpadden Feb 22, 2024
0381dad
images hello-dagster -> quickstart
cmpadden Feb 22, 2024
e4596aa
make mdx-format
cmpadden Feb 22, 2024
2a1a6aa
add note about scaffolding projects
cmpadden Feb 22, 2024
cde01c4
add explanations to code snippet
cmpadden Feb 22, 2024
b4cc6b6
make mdx-format
cmpadden Feb 22, 2024
fe8ab01
accidentally a word
cmpadden Feb 22, 2024
a68cf4f
include both options for installing the project
cmpadden Feb 22, 2024
02e5b74
use tabgroup / tabitem
cmpadden Feb 22, 2024
5b22aae
tweak phrasing
cmpadden Feb 22, 2024
02a9860
include prerequisites for local install
cmpadden Feb 22, 2024
61cc767
include configurations
cmpadden Feb 22, 2024
27bddb1
Update docs/content/getting-started.mdx
cmpadden Feb 23, 2024
e8a7b47
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
ef00023
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
633caf1
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
21a33f9
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
e336b52
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
5f95467
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
26b9d28
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
2b0dcd8
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
3edcd0f
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
60c7392
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
89cd315
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
15b35c3
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
2a93499
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
12e6be4
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
c5d7b2f
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
34029cf
update screenshots to dagster v1.6
cmpadden Feb 23, 2024
ae3487d
restructure introduction
cmpadden Feb 23, 2024
58dafaa
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
fd63267
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
2e0c05f
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
1582cdc
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 23, 2024
79a731f
add python installation recommendation
cmpadden Feb 26, 2024
44b961c
default to running locally
cmpadden Feb 26, 2024
5051186
Update docs/content/getting-started/what-why-dagster.mdx
cmpadden Feb 28, 2024
3539913
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 28, 2024
2f060c4
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 28, 2024
1c9759c
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 28, 2024
f42149b
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 28, 2024
5f13b66
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 28, 2024
1211b05
Update docs/content/getting-started/quickstart.mdx
cmpadden Feb 28, 2024
f7aa3fe
add redirect for hello-dagster -> quickstart
cmpadden Feb 28, 2024
a916847
remove introduction heading; standardize on 'Quickstart project'
cmpadden Feb 28, 2024
596a97d
consolidate comment on materializeresult
cmpadden Feb 28, 2024
4 changes: 2 additions & 2 deletions docs/content/_navigation.json
@@ -10,8 +10,8 @@
"path": "/getting-started/what-why-dagster"
},
{
- "title": "Hello, Dagster!",
- "path": "/getting-started/hello-dagster"
+ "title": "Quickstart",
+ "path": "/getting-started/quickstart"
},
{
"title": "Installation",
6 changes: 3 additions & 3 deletions docs/content/getting-started.mdx
@@ -8,12 +8,12 @@ Dagster is an orchestrator that's designed for developing and maintaining data a

You declare functions that you want to run and the data assets that those functions produce or update. Dagster then helps you run your functions at the right time and keep your assets up-to-date.

- Dagster is built to be used at every stage of the data development lifecycle - local development, unit tests, integration tests, staging environments, all the way up to production.
+ Dagster is designed to be used at every stage of the data development lifecycle, including local development, unit tests, integration tests, staging environments, and production.

- **New to Dagster**? Check out the **Hello Dagster example**, learn with some hands-on **Tutorials**, or dive into **Concepts**. For an in-depth learning experience, enroll in **Dagster University**.
+ **New to Dagster**? Check out the **Quickstart**, learn with some hands-on **Tutorials**, or dive into **Concepts**. For an in-depth learning experience, enroll in **Dagster University**.

<div className="inline-flex flex-row space-x-4">
- <Button link="/getting-started/hello-dagster">Run Hello, Dagster!</Button>
+ <Button link="/getting-started/quickstart">Quickstart</Button>
<Button link="/tutorial" style="secondary">
View Tutorials
</Button>
147 changes: 0 additions & 147 deletions docs/content/getting-started/hello-dagster.mdx

This file was deleted.

196 changes: 196 additions & 0 deletions docs/content/getting-started/quickstart.mdx
@@ -0,0 +1,196 @@
---
title: Quickstart | Dagster Docs
description: Run Dagster for the first time
---

# Quickstart

<Note>
Looking to scaffold a new project? Check out the{" "}
<Link href="/getting-started/create-new-project">Creating a new project</Link>{" "}
guide!
</Note>

Welcome to Dagster! This guide will help you quickly run the [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) project, showcasing Dagster's capabilities and serving as a foundation for exploring its features.

The [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) project can be used without installing anything on your machine by using the pre-configured [GitHub Codespace](https://github.com/features/codespaces). If you prefer to run things on your own machine, however, we've got you covered.

<TabGroup>
<TabItem name="Option 1: Running locally">

### Option 1: Running Locally
Suggested change
- ### Option 1: Running Locally
+ ## Option 1: Running locally

[1], and this becomes an H2 with the removal of the # Introduction heading.


<DagsterVersion />

Ensure you have one of the supported Python versions installed before proceeding.

To install Python, refer to Python's official <a href="https://www.python.org/about/gettingstarted/">getting started guide</a> or, as we recommend, use <a href="https://github.com/pyenv/pyenv?tab=readme-ov-file#installation">pyenv</a>.

1. Clone the Dagster Quickstart repository by executing:

```bash
git clone https://github.com/dagster-io/dagster-quickstart && cd dagster-quickstart
```

2. Install the necessary dependencies using the following command:

We use `-e` to install dependencies in ["editable mode"](https://pip.pypa.io/en/latest/topics/local-project-installs/#editable-installs). This allows changes to be automatically applied when we modify code.

```bash
pip install -e ".[dev]"
```
Comment on lines +35 to +41
Nit - this sounds like the next paragraph will be the command, but it's more explanation. I suggest moving things around a bit:

Suggested change
- 2. Install the necessary dependencies using the following command:
- We use `-e` to install dependencies in ["editable mode"](https://pip.pypa.io/en/latest/topics/local-project-installs/#editable-installs). This allows changes to be automatically applied when we modify code.
- ```bash
- pip install -e ".[dev]"
- ```
+ 2. Install the necessary dependencies using the following command:
+ ```bash
+ pip install -e ".[dev]"
+ ```
+ We use `-e` to install dependencies in ["editable mode"](https://pip.pypa.io/en/latest/topics/local-project-installs/#editable-installs). This allows changes to be automatically applied when we modify code.

Double-check the formatting if you accept this suggestion - GH can be cranky about code blocks within code blocks


3. Run the project!

```bash
dagster dev
```

4. Navigate to <a href="http://localhost:3000">localhost:3000</a> in your web browser.

5. **Success!**
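
If the page does not load, it can help to confirm that something is actually answering on the port before digging further. The following is a minimal sketch using only the Python standard library; the `ui_is_reachable` helper, its default timeout, and the success criterion are assumptions for illustration, and only the `localhost:3000` address comes from the steps above.

```python
import urllib.error
import urllib.request


def ui_is_reachable(url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url` with a non-error status.

    Hypothetical helper for illustration; it is not part of Dagster.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a malformed URL.
        return False


if __name__ == "__main__":
    if ui_is_reachable():
        print("Something is serving HTTP on localhost:3000")
    else:
        print("Nothing answered on localhost:3000; is `dagster dev` still running?")
```

A `False` result only says that no HTTP server responded successfully at that address; it does not say why.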

</TabItem>
<TabItem name="Option 2: Using GitHub Codespaces">

### Option 2: Using GitHub Codespaces
Suggested change
- ### Option 2: Using GitHub Codespaces
+ ## Option 2: Using GitHub Codespaces

Becomes an H2 with the removal of # Introduction


1. Fork the [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) repository
Suggested change
- 1. Fork the [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) repository
+ 1. Fork the [Dagster Quickstart](https://github.com/dagster-io/dagster-quickstart) repository.

Adding a period to make this consistent with other list items.


2. Select **Create codespace on main** from the **Code** dropdown menu.

<Image
width={400}
height={400}
alt="Create codespace"
src="/images/getting-started/quickstart/github-codespace-create.png"
/>
Comment on lines +62 to +67
[2] This image isn't nested under the list item about it (2). Indent everything (I think it's 3 spaces; if not that, then 4) and everything should be in line


3. After the codespace loads, start Dagster by running `dagster dev` in the terminal:

```bash
dagster dev
```

4. Click **Open in Browser** when prompted.

<Image
width={400}
height={300}
alt="Codespace Open In Browser"
src="/images/getting-started/quickstart/github-codespace-open-in-browser.png"
/>
Comment on lines +77 to +82
[2]


5. **Success!**

</TabItem>
</TabGroup>

## Navigating the User Interface
Suggested change
- ## Navigating the User Interface
+ ---
+ ## Navigating the Dagster UI

[3] We use horizontal rules before every H2.

Also making this heading more specific by including the proper name of the UI


You should now have a running instance of Dagster! From here, we can run our data pipeline.

To run the pipeline, click the **Materialize All** button in the top right. In Dagster, _materialization_ refers to executing the code associated with an asset to produce an output.

<Image
alt="HackerNews assets in Dagster's Asset Graph, unmaterialized"
src="/images/getting-started/quickstart/quickstart-unmaterialized.png"
width={2000}
height={816}
/>

Congratulations! You have successfully materialized two Dagster assets:

<Image
alt="HackerNews asset graph"
src="/images/getting-started/quickstart/quickstart.png"
width={2000}
height={1956}
/>

But wait - there's more. Because the `hackernews_top_stories` asset returned some `metadata`, you can view the metadata right in the UI:

1. Click the asset
Suggested change
- 1. Click the asset
+ 1. Click the `hackernews_top_stories` asset.

Adding a period and including the name of the asset.

2. In the sidebar, click the **Show Markdown** link in the **Materialization in Last Run** section. This opens a preview of the pipeline result, allowing you to view the top 10 HackerNews stories:

<Image
alt="Markdown preview of HackerNews top 10 stories"
src="/images/getting-started/quickstart/hn-preview.png"
width={2000}
height={1754}
/>
Comment on lines +116 to +121
[2]


## Understanding the Code
Suggested change
- ## Understanding the Code
+ ---
+ ## Understanding the code

[3] and using sentence casing


The Quickstart project defines two **Assets** using the <PyObject object="asset" decorator /> decorator:

- `hackernews_top_story_ids` retrieves the IDs of the top stories from the Hacker News API and saves them as a JSON file.
- `hackernews_top_stories` builds upon the first asset: it retrieves the data for each story, saves the results as a CSV file, and returns a `MaterializeResult` with a markdown preview of the top stories.

```python file=/getting-started/quickstart/assets.py
import json

import pandas as pd
import requests

from dagster import (
    MaterializeResult,
    MetadataValue,
    asset,
)

from .configurations import HNStoriesConfig


@asset
def hackernews_top_story_ids(config: HNStoriesConfig):
    """Get top stories from the HackerNews top stories endpoint."""
    top_story_ids = requests.get(
        "https://hacker-news.firebaseio.com/v0/topstories.json"
    ).json()

    with open(config.hn_top_story_ids_path, "w") as f:
        json.dump(top_story_ids[: config.top_stories_limit], f)


@asset(deps=[hackernews_top_story_ids])
def hackernews_top_stories(config: HNStoriesConfig) -> MaterializeResult:
    """Get items based on story ids from the HackerNews items endpoint."""
    with open(config.hn_top_story_ids_path, "r") as f:
        hackernews_top_story_ids = json.load(f)

    results = []
    for item_id in hackernews_top_story_ids:
        item = requests.get(
            f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
        ).json()
        results.append(item)

    df = pd.DataFrame(results)
    df.to_csv(config.hn_top_stories_path)

    return MaterializeResult(
        metadata={
            "num_records": len(df),
            "preview": MetadataValue.md(str(df[["title", "by", "url"]].to_markdown())),
        }
    )
```
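
One detail in the snippet above is easy to miss: the second asset does not receive the first asset's return value. It reads the JSON file that the first asset wrote, so the data moves between the two steps through a file on disk. The sketch below isolates that handoff using only the standard library. `StoriesConfig`, `write_top_story_ids`, `read_top_story_ids`, and `run_handoff` are hypothetical stand-ins (a plain dataclass and plain functions, not Dagster objects); only the two field names are taken from the snippet.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class StoriesConfig:
    """Hypothetical stand-in for HNStoriesConfig; field names mirror the snippet."""

    top_stories_limit: int
    hn_top_story_ids_path: str


def write_top_story_ids(all_ids: list[int], config: StoriesConfig) -> None:
    """Upstream step: keep the first N IDs and persist them as JSON."""
    with open(config.hn_top_story_ids_path, "w") as f:
        json.dump(all_ids[: config.top_stories_limit], f)


def read_top_story_ids(config: StoriesConfig) -> list[int]:
    """Downstream step: load whatever the upstream step wrote."""
    with open(config.hn_top_story_ids_path, "r") as f:
        return json.load(f)


def run_handoff(all_ids: list[int], limit: int) -> list[int]:
    """Run both steps in dependency order against a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        config = StoriesConfig(
            top_stories_limit=limit,
            hn_top_story_ids_path=str(Path(tmp) / "ids.json"),
        )
        write_top_story_ids(all_ids, config)
        return read_top_story_ids(config)


if __name__ == "__main__":
    print(run_handoff([101, 102, 103, 104], limit=2))  # prints [101, 102]
```

Because both steps take the path from the same config, the downstream step finds the file the upstream step wrote, as long as the upstream step has already run.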

---

## Next steps

Congratulations on successfully running your first Dagster pipeline! In this example, we used [assets](/tutorial), which are a cornerstone of Dagster projects. They empower data engineers to:

- Think in the same terms as stakeholders
- Answer questions about data quality and lineage
- Work with the modern data stack (dbt, Airbyte/Fivetran, Spark)
- Create declarative freshness policies instead of task-driven cron schedules

Dagster also offers [ops and jobs](/guides/dagster/intro-to-ops-jobs), but we recommend starting with assets.

To create your own project, consider the following options:

- Scaffold a new project using our [new project guide](/getting-started/create-new-project).
- Begin with an official example, like the [dbt + Dagster project](/integrations/dbt/using-dbt-with-dagster), and explore [all examples on GitHub](https://github.com/dagster-io/dagster/tree/master/examples).
2 changes: 1 addition & 1 deletion docs/content/getting-started/what-why-dagster.mdx
@@ -53,7 +53,7 @@ Additionally, Dagster is accompanied by a sleek, modern, [web-based UI](/concept

## How does it work?

- If you want to try running Dagster yourself, check out the [Hello, Dagster!](/getting-started/hello-dagster) quickstart.
+ If you want to try running Dagster yourself, check out the Dagster [Quickstart](/getting-started/quickstart).

---

4 changes: 2 additions & 2 deletions docs/content/integrations/pandas.mdx
@@ -8,9 +8,9 @@ description: The dagster-pandas library provides the ability to perform data val
<Note>
This page describes the <code>dagster-pandas</code> library, which is used for
performing data validation. To simply use pandas with Dagster, start with the{" "}
- <a href="/getting-started/hello-dagster" target="new">
+ <a href="/getting-started/quickstart" target="new">
{" "}
- Hello Dagster example.
+ Dagster Quickstart example.
</a>{" "}
Dagster makes it easy to use pandas code to manipulate data and then store
that data in other systems such as{" "}
Binary file not shown.
Binary file not shown.
Binary file not shown.