Commit

1.3.11 changelog (#14907)
Co-authored-by: Ben Pankow <[email protected]>
alangenfeld and benpankow committed Jun 22, 2023
1 parent 55681c5 commit e1d6278
Showing 1 changed file with 94 additions and 44 deletions.
138 changes: 94 additions & 44 deletions CHANGES.md
@@ -1,5 +1,57 @@
# Changelog

# 1.3.11 (core) / 0.19.11 (libraries)

### New

- Assets with lazy auto-materialize policies are no longer auto-materialized if they are missing but don’t need to be materialized in order to help downstream assets meet their freshness policies.
- [ui] The descriptions of auto-materialize policies in the UI now include their skip conditions along with their materialization conditions.
- [dagster-dbt] Customized asset keys can now be specified for nodes in the dbt project, using `meta.dagster.asset_key`. This field takes in a list of strings that are used as the components of the generated `AssetKey`.

```yaml
version: 2

models:
  - name: users
    config:
      meta:
        dagster:
          asset_key: ["my", "custom", "asset_key"]
```

- [dagster-dbt] Customized groups can now be specified for models in the dbt project, using `meta.dagster.group`. This field takes in a string that is used as the Dagster group for the generated software-defined asset corresponding to the dbt model.

```yaml
version: 2

models:
  - name: users
    config:
      meta:
        dagster:
          group: "my_group"
```
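As a rough illustration of the two `meta.dagster` fields above, the sketch below reads them from a plain dictionary shaped like the YAML examples. The helper names `asset_key_components` and `group_name` are hypothetical, and the fallback to the model name is an assumption made for this sketch only; this is not the dagster-dbt implementation.

```python
from typing import Any, Dict, List, Optional


def _dagster_meta(node: Dict[str, Any]) -> Dict[str, Any]:
    """Return the `config.meta.dagster` mapping of a node, or an empty dict."""
    return node.get("config", {}).get("meta", {}).get("dagster", {})


def asset_key_components(node: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: components used to build the asset key."""
    components = _dagster_meta(node).get("asset_key")
    if components is None:
        # Assumption for this sketch: fall back to the model name.
        return [node["name"]]
    if not isinstance(components, list) or not all(
        isinstance(part, str) for part in components
    ):
        raise ValueError("meta.dagster.asset_key must be a list of strings")
    return components


def group_name(node: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper: the custom group name, if one is set."""
    return _dagster_meta(node).get("group")


# Dictionary mirroring the two YAML examples combined.
users = {
    "name": "users",
    "config": {
        "meta": {
            "dagster": {
                "asset_key": ["my", "custom", "asset_key"],
                "group": "my_group",
            }
        }
    },
}

print(asset_key_components(users))
print(group_name(users))
```

A model without a `meta.dagster` entry simply yields its own name as the single key component and no group in this sketch.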

### Bugfixes

- Fixed an issue where the `dagster-msteams` and `dagster-mlflow` packages could be installed with incompatible versions of the `dagster` package due to a missing pin.
- Fixed an issue where the `dagster-daemon run` command sometimes kept code server subprocesses open longer than it needed to, making the process use more memory.
- Previously, when using `@observable_source_asset`s with AutoMaterializePolicies, downstream assets could get “stuck” and not be materialized when other upstream assets changed, or multiple downstream materializations could be kicked off in response to the same version being observed multiple times. This has been fixed.
- Fixed a case where the materialization count for partitioned assets could be wrong.
- Fixed an error which arose when trying to request resources within run failure sensors.
- [dagster-wandb] Fixed handling for multi-dimensional partitions. Thanks @chrishiste

### Experimental

- [dagster-dbt] improvements to `@dbt_assets`
  - `project_dir` and `target_path` in `DbtCliTask` are converted from type `str` to type `pathlib.Path`.
  - In the case that dbt logs are not emitted as json, the log will still be redirected to be printed in the Dagster compute logs, under `stdout`.

### Documentation

- Fixed a typo in dagster_aws S3 resources. Thanks @akan72
- Fixed a typo in link on the Dagster Instance page. Thanks @PeterJCLaw

# 1.3.10 (core) / 0.19.10 (libraries)

### New
@@ -16,9 +68,9 @@ models:
    config:
      dagster_freshness_policy:
        maximum_lag_minutes: 60
        cron_schedule: "0 9 * * *"
      dagster_auto_materialize_policy:
        type: "lazy"
```

After:
@@ -33,38 +85,38 @@ models:
        dagster:
          freshness_policy:
            maximum_lag_minutes: 60
            cron_schedule: "0 9 * * *"
          auto_materialize_policy:
            type: "lazy"
```

- Added support for Pythonic Config classes to the `@configured` API, which makes reusing op and asset definitions easier:

```python
class GreetingConfig(Config):
    message: str

@op
def greeting_op(config: GreetingConfig):
    print(config.message)

class HelloConfig(Config):
    name: str

@configured(greeting_op)
def hello_op(config: HelloConfig):
    return GreetingConfig(message=f"Hello, {config.name}!")
```

- Added `AssetExecutionContext` to replace `OpExecutionContext` as the context object passed in to `@asset` functions.
- `TimeWindowPartitionMapping` now contains an `allow_nonexistent_upstream_partitions` argument that, when set to `True`, allows a downstream partition subset to have nonexistent upstream parents.
- Unpinned the `alembic` dependency in the `dagster` package.
- [ui] A new “Assets” tab is available from the Overview page.
- [ui] The Backfills table now includes links to the assets that were targeted by the backfill.
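
The effect of `allow_nonexistent_upstream_partitions` can be pictured with a toy model in plain Python. This is only a sketch of the idea under stated assumptions: partitions are daily date strings, the mapping is one-to-one by date, and the function `map_to_upstream` is hypothetical. It is not Dagster's `TimeWindowPartitionMapping` implementation.

```python
from datetime import date, timedelta
from typing import List


def daily_keys(start: date, end: date) -> List[str]:
    """Daily partition keys from start (inclusive) to end (exclusive)."""
    return [
        (start + timedelta(days=offset)).isoformat()
        for offset in range((end - start).days)
    ]


def map_to_upstream(
    downstream_keys: List[str],
    upstream_keys: List[str],
    allow_nonexistent_upstream_partitions: bool = False,
) -> List[str]:
    """Toy one-to-one mapping from downstream to upstream partitions by date."""
    existing = set(upstream_keys)
    missing = [key for key in downstream_keys if key not in existing]
    if missing and not allow_nonexistent_upstream_partitions:
        raise ValueError(f"no upstream partition for: {missing}")
    # With the flag set, parents that do not exist are left out of the result.
    return [key for key in downstream_keys if key in existing]


# The upstream asset starts two days later than the downstream asset.
upstream = daily_keys(date(2023, 6, 3), date(2023, 6, 6))
downstream = daily_keys(date(2023, 6, 1), date(2023, 6, 6))

print(map_to_upstream(downstream, upstream, allow_nonexistent_upstream_partitions=True))
```

Without the flag, the same call raises an error in this sketch because the first two downstream partitions have no upstream parent.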

### Bugfixes

- Dagster is now compatible with a breaking change introduced in `croniter==1.4.0`. Users of earlier versions of Dagster can pin `croniter<1.4`.
- Fixed an issue introduced in 1.3.8 which prevented resources from being bound to sensors when the specified job required late-bound resources.
- Fixed an issue which prevented specifying resource requirements on a `@run_failure_sensor`.
- Fixed an issue where the asset reconciliation sensor failed with an “invalid upstream partitions” error when evaluating time partitions definitions with different start times.
@@ -82,10 +134,10 @@ models:

- Evaluation history for `AutoMaterializePolicy`s will now be cleared after 1 week.
- [dagster-dbt] Several improvements to `@dbt_assets`:
  - `profile` and `target` can now be customized on the `DbtCli` resource.
  - If a `partial_parse.msgpack` is detected in the target directory of your dbt project, it is now copied into the target directories created by `DbtCli` to take advantage of [partial parsing](https://docs.getdbt.com/reference/parsing).
  - The metadata of assets generated by `@dbt_assets` can now be customized by overriding `DbtManifest.node_info_to_metadata`.
  - Execution duration of dbt models is now added as default metadata to `AssetMaterialization`s.

### Documentation

@@ -101,7 +153,6 @@ models:

- Fixed an issue in the `1.3.8` release where the Dagster Cloud agent would sometimes fail to start up with an import error.


# 1.3.8 (core) / 0.19.8 (libraries)

### New
@@ -134,9 +185,9 @@ models:

- `@observable_source_asset`-decorated functions can now return a `DataVersionsByPartition` to record versions for partitions.
- `@dbt_assets`
  - `DbtCliTask`s created by invoking `DbtCli.cli(...)` now have a method `.is_successful()`, which returns a boolean representing whether the underlying CLI process executed the dbt command successfully.
  - Descriptions of assets generated by `@dbt_assets` can now be customized by overriding `DbtManifest.node_info_to_description`.
  - IO Managers can now be configured on `@dbt_assets`.
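
The idea behind a method like `.is_successful()` can be sketched with the standard library alone. The class `CliTask` below is a hypothetical stand-in, and it assumes that "successful" means the process exited with code 0; the actual `DbtCliTask` may determine success differently.

```python
import subprocess
import sys
from typing import List


class CliTask:
    """Hypothetical wrapper around a single CLI invocation."""

    def __init__(self, args: List[str]) -> None:
        # Start the process and capture its combined output.
        self._process = subprocess.Popen(
            args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )

    def wait(self) -> "CliTask":
        """Drain the output and block until the process exits."""
        self._process.communicate()
        return self

    def is_successful(self) -> bool:
        """Assumption for this sketch: success means exit code 0."""
        return self._process.wait() == 0


# Two throwaway Python subprocesses stand in for a real CLI command.
ok = CliTask([sys.executable, "-c", "raise SystemExit(0)"]).wait()
failed = CliTask([sys.executable, "-c", "raise SystemExit(3)"]).wait()

print(ok.is_successful(), failed.is_successful())
```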

### Documentation

@@ -160,7 +211,7 @@ models:

### Bugfixes

- Fixed an issue where setting a resource in an op didn’t work if the Dagster job was only referenced within a schedule or sensor and wasn’t included in the `jobs` argument to `Definitions`.
- [dagster-slack][dagster-pagerduty][dagster-msteams][dagster-airflow] Fixed issue where pre-built sensors and hooks which created urls to the runs page in the UI would use the old `/instance/runs` path instead of the new `/runs`.

### Community Contributions
@@ -249,7 +300,7 @@ models:
- [dagster-airflow] persistent database URI can now be passed via environment variable
- [dagster-azure] New `ConfigurablePickledObjectADLS2IOManager` that uses pythonic config
- [dagster-fivetran] Fivetran connectors that are broken or incomplete are now ignored
- [dagster-gcp] New `DataProcResource` follows the Pythonic resource system. The existing `dataproc_resource` remains supported.
- [dagster-k8s] The K8sRunLauncher and k8s_job_executor will now retry the api call to create a Kubernetes Job when it gets a transient error code (500, 503, 504, or 401).
- [dagster-snowflake] The `SnowflakeIOManager` now supports `private_key`s that have been `base64` encoded to avoid issues with newlines in the private key. Non-base64 encoded keys are still supported. See the `SnowflakeIOManager` documentation for more information on `base64` encoded private keys.
- [ui] Unpartitioned assets show up on the backfill page
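
For the `base64`-encoded `private_key` option mentioned above, a minimal standard-library sketch of producing and checking such a value follows. The key text is a placeholder rather than a real key, and the helper names are hypothetical; how the encoded string is then passed to `SnowflakeIOManager` is described in its documentation and is not shown here.

```python
import base64


def encode_private_key(pem_text: str) -> str:
    """Encode a PEM string as single-line base64 text."""
    return base64.b64encode(pem_text.encode("utf-8")).decode("ascii")


def decode_private_key(encoded: str) -> str:
    """Reverse of encode_private_key, used here only to check the round trip."""
    return base64.b64decode(encoded).decode("utf-8")


# Placeholder key material for illustration only.
pem = (
    "-----BEGIN PRIVATE KEY-----\n"
    "placeholder-not-a-real-key\n"
    "-----END PRIVATE KEY-----\n"
)

encoded = encode_private_key(pem)
print("\n" in encoded)
print(decode_private_key(encoded) == pem)
```

Because `base64.b64encode` emits no line breaks, the encoded value can be stored in a single-line environment variable or config field.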
@@ -259,7 +310,7 @@ models:
### Bugfixes

- The server side polling for events during a live run has had its rate adjusted and no longer uses a fixed interval.
- [dagster-postgres] Fixed an issue where primary key constraints were not being created for the `kvs`, `instance_info`, and `daemon_heartbeats` tables for existing Postgres storage instances that were migrating from before `1.2.2`. This should unblock users relying on the existence of a primary key constraint for replication.
- Fixed a bug that could cause incorrect counts to be shown for missing asset partitions when partitions are in progress
- Fixed an issue within `SensorResult` evaluation where multipartitioned run requests containing a dynamic partition added in a dynamic partitions request object would raise an invalid partition key error.
- [ui] When trying to terminate a queued or in-progress run from a Run page, forcing termination was incorrectly given as the only option. This has been fixed, and these runs can now be terminated normally.
@@ -272,7 +323,7 @@ models:

### Community Contributions

- [dagster-airbyte] When supplying an `airbyte_resource` to `load_assets_from_connections`, you may now provide an instance of the `AirbyteResource` class, rather than just `airbyte_resource.configured(...)` (thanks [@joel-olazagasti](https://github.com/joel-olazagasti)!)
- [dagster-airbyte] Fixed an issue connecting to destinations that support normalization (thanks [@nina-j](https://github.com/nina-j)!)
- Fix an error in the docs code snippets for IO managers (thanks [out-running-27](https://github.com/out-running-27)!)
- Added [an example](https://github.com/dagster-io/dagster/tree/master/examples/project_analytics) showing how to build Dagster Software-Defined Assets for an analytics workflow with different deployments for local and prod environments. (thanks [@PedramNavid](https://github.com/PedramNavid)!)
@@ -297,17 +348,17 @@ models:
- `RunFailureSensorContext` now has a `get_step_failure_events` method.
- The Pythonic resource system now supports a set of lifecycle hooks which can be used to manage setup and teardown:

```python
class MyAPIClientResource(ConfigurableResource):
    api_key: str
    _internal_client: MyAPIClient = PrivateAttr()

    def setup_for_execution(self, context):
        self._internal_client = MyAPIClient(self.api_key)

    def get_all_items(self):
        return self._internal_client.items.get()
```

- Added support for specifying input and output config on `ConfigurableIOManager`.
- `QueuedRunCoordinator` and `SubmitRunContext` are now exposed as public dagster exports.
@@ -316,8 +367,8 @@ models:
- [dagster-aws] Allow the S3 compute log manager to specify a `show_url_only: true` config option, which will display a URL to the S3 file in dagit, instead of the contents of the log file.
- [dagster-aws] `PickledObjectS3IOManager` now fully supports loading partitioned inputs.
- [dagster-azure] `PickledObjectADLS2IOManager` now fully supports loading partitioned inputs.
- [dagster-gcp] New `GCSResource` and `ConfigurablePickledObjectGCSIOManager` follow the Pythonic resource system. The existing `gcs_resource` and `gcs_pickle_io_manager` remain supported.
- [dagster-gcp] New `BigQueryResource` follows the Pythonic resource system. The existing `bigquery_resource` remains supported.
- [dagster-gcp] `PickledObjectGCSIOManager` now fully supports loading partitioned inputs.
- [dagster-postgres] The event watching implementation has been moved from listen/notify based to the polling watcher used by MySQL and SQLite.
- [dagster-slack] Add `monitor_all_repositories` to `make_slack_on_run_failure_sensor`, thanks @danielgafni!
@@ -346,8 +397,8 @@ models:

- Ever wanted to know more about the files in Dagster projects, including where to put them in your project? Check out the new [Dagster project files reference](https://docs.dagster.io/getting-started/project-file-reference) for more info!
- We’ve made some improvements to the sidenav / information architecture of our docs!
  - The **Guides** section now contains several new categories, including **Working with data assets** and **Working with tasks**
  - The **Community** section is now under **About**
- The Backfills concepts page now includes instructions on how to launch backfills that target ranges of partitions in a single run.

# 1.3.2 (core) / 0.19.2 (libraries)
@@ -371,7 +422,6 @@ models:

### Breaking Changes


- Yielding run requests for experimental dynamic partitions via `run_request_for_partition` now throws an error. Instead, users should yield directly instantiated run requests via `RunRequest(partition_key=...)`.
- `graph_asset` and `graph_multi_asset` now support specifying `resource_defs` directly (thanks [@kmontag42](https://github.com/KMontag42))!
