Releases: dagster-io/dagster
1.4.15 / 0.20.15 (libraries)
New
- The deps parameter for @asset and @multi_asset now supports directly passing @multi_asset definitions. If an @multi_asset is passed to deps, dependencies will be created on every asset produced by the @multi_asset.
- Added an optional data migration to convert storage ids to use 64-bit integers instead of 32-bit integers. This will incur some downtime, but may be required for instances that are handling a large number of events. This migration can be invoked using dagster instance migrate --bigint-migration.
- [ui] Dagster now allows you to run asset checks individually.
- [ui] The run list and run details page now show the asset checks targeted by each run.
- [ui] In the runs list, runs launched by schedules or sensors will now have tags that link directly to those schedules or sensors.
- [ui] Clicking the "N assets" tag on a run allows you to navigate to the filtered asset graph as well as view the full list of asset keys.
- [ui] Schedules, sensors, and observable source assets now appear on the resource “Uses” page.
- [dagster-dbt] The DbtCliResource now validates at definition time that its project_dir and profiles_dir arguments are directories that respectively contain a dbt_project.yml and profiles.yml.
- [dagster-databricks] You can now configure a policy_id for new clusters when using the databricks_pyspark_step_launcher (thanks @zyd14!)
- [ui] Added an experimental sidebar to the Asset lineage graph to aid in navigating large graphs. You can enable this feature under user settings.
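The 64-bit storage id migration in this release matters mostly for instances with very large event counts. As a rough illustration, the Python sketch below compares the two ranges. It assumes the ids are stored as signed integers, which the release note does not state, and fits_in_signed is a hypothetical helper, not part of Dagster.

```python
def max_signed(bits: int) -> int:
    """Largest value representable by a signed two's-complement integer."""
    return 2 ** (bits - 1) - 1


def fits_in_signed(value: int, bits: int) -> bool:
    """Return True if `value` lies within the signed range for `bits` bits."""
    return -(2 ** (bits - 1)) <= value <= max_signed(bits)


# Assumption: ids are signed. Unsigned columns would double these limits.
INT32_MAX = max_signed(32)  # 2_147_483_647
INT64_MAX = max_signed(64)  # 9_223_372_036_854_775_807

if __name__ == "__main__":
    next_id = INT32_MAX + 1
    print(fits_in_signed(next_id, 32))  # False
    print(fits_in_signed(next_id, 64))  # True
```

Under that assumption, a 32-bit id column runs out after roughly 2.1 billion rows, while a 64-bit column does not run out in any realistic deployment.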
Bugfixes
- Fixed an issue where the dagster-webserver command was not indicating which port it was using in the command-line output.
- Fixed an issue where the quickstart_gcp example wasn’t setting GCP credentials properly when setting up its IOManager.
- Fixed an issue where the process output for Dagster run and step containers would repeat each log message twice in JSON format when the process finished.
- [ui] Fixed an issue where the config editor failed to load when materializing certain assets.
- [auto-materialize] Previously, rematerializing an old partition of an asset which depended on a prior partition of itself would result in a chain of materializations to propagate that change all the way through to the most recent partition of this asset. To prevent these “slow-motion backfills”, this behavior has been updated such that these updates are no longer propagated.
Experimental
- MaterializeResult has been added as a new return type to be used in @asset / @multi_asset materialization functions.
- [ui] The auto-materialize page now properly indicates that the feature is experimental and links to our documentation.
Documentation
- The Concepts category page got a small facelift, to bring it in line with how the side navigation is organized.
Dagster Cloud
- Previously, when importing a dbt project in Cloud, naming the code location “dagster” would cause build failures. This is now disabled and an error is now surfaced.
1.4.14 / 0.20.14 (libraries)
New
- Added a new tooltip to asset runs to either view the asset list or lineage
Bugfixes
- [ui] Fixed an issue where re-executing a run from a particular run's page wouldn’t navigate to the newly created run
Experimental
- [dagster-ext] An initial version of the dagster-ext module, along with subprocess, docker, databricks, and k8s pod integrations, is now available. Read more at #16319. Note that the module is temporarily being published to PyPI under dagster-ext-process, but is available in Python as import dagster_ext.
- [asset checks] Added an ‘execute’ button to run checks without materializing the asset. Currently this is only supported for checks defined with @asset_check or AssetChecksDefinition.
- [asset checks] Added check_specs argument to @graph_multi_asset
- [asset checks] Fixed a bug with checks on @graph_asset that would raise an error about nonexistent checks
1.4.13 / 0.20.13 (libraries)
New
- OpExecutionContext.add_output_metadata can now be called multiple times per output.
Bugfixes
- The double evaluation of log messages in sensor logging has been fixed (thanks @janosroden!)
- Cron schedules targeting leap day (ending with 29 2 *) no longer cause exceptions in the UI or daemon.
- Previously, if multiple partitioned observable_source_assets with different partition definitions existed in the same code location, runs targeting those assets could fail to launch. This has been fixed.
- When using AutoMaterializePolicies with assets that depended on prior partitions of themselves, updating the start_date of their underlying PartitionsDefinition could result in runs being launched for partitions that no longer existed. This has been fixed.
- Fixed an issue where auto-materialization could sometimes produce duplicate runs if there was an error in the middle of an auto-materialization tick.
- [dagster-census] A recent change to the Census API broke compatibility with this integration. This has been fixed (thanks @ldnicolasmay!)
- [dagster-dbt] Fixed an issue where DagsterDbtTranslator did not properly invoke get_auto_materialize_policy and get_freshness_policy for load_assets_from_dbt_project.
- [ui] Fixed a number of interaction bugs with the Launchpad config editor, including issues with newlines and multiple cursors.
- [ui] Asset keys and partitions presented in the asset checks UI are sorted to avoid flickering.
- [ui] Backfill actions (terminate backfill runs, cancel backfill submission) are now available from an actions menu on the asset backfill details page.
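The leap-day fix in this list concerns cron strings whose day-of-month and month fields are 29 and 2. Such a schedule has no tick at all in most years, which makes it an unusual case for any code that computes upcoming ticks. Below is a minimal standard-library sketch of finding the next tick for a midnight schedule such as 0 0 29 2 *. The minute and hour fields are an assumption, and next_leap_day is a hypothetical helper rather than Dagster code.

```python
import calendar
from datetime import date


def next_leap_day(after: date) -> date:
    """Return the first Feb 29 that falls strictly after `after`."""
    year = after.year
    while True:
        # Only leap years have a Feb 29, so other years are skipped entirely.
        if calendar.isleap(year):
            candidate = date(year, 2, 29)
            if candidate > after:
                return candidate
        year += 1


if __name__ == "__main__":
    print(next_leap_day(date(2023, 9, 1)))   # 2024-02-29
    print(next_leap_day(date(2024, 2, 29)))  # 2028-02-29
    print(next_leap_day(date(2096, 3, 1)))   # 2104-02-29 (2100 is not a leap year)
```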
Community Contributions
- Typo fix in run monitoring docs (thanks c0dk)!
- Grammar fixes in testing docs (thanks sonnyarora)!
- Typo fix in contribution docs (thanks tab1tha)!
Experimental
- [dagster-dbt][asset checks] Added support to model dbt tests as Dagster asset checks.
- [asset checks] Added @graph_asset support. This can be used to implement blocking checks, by raising an exception if the check fails.
- [asset checks] Fixed @multi_asset subsetting, so only checks which target assets in the subset will execute.
- [asset checks] AssetCheckSpecs will now cause an error at definition time if they target an asset other than the one they’re defined on.
- [asset checks] The status of asset checks now appears in the asset graph and asset graph sidebar.
Dagster Cloud
- [Experimental] Added support for freeing global op concurrency slots after runs have finished, using the deployment setting: run_monitoring > free_slots_after_run_end_seconds
1.4.12 / 0.20.12 (libraries)
New
- The context object now has an asset_key property to get the AssetKey of the current asset.
- Performance improvements to the auto-materialize daemon when running on large asset graphs.
- The dagster dev and dagster-daemon run commands now include a --log-level argument that allows you to customize the logger level threshold.
- [dagster-airbyte] AirbyteResource now includes a poll_interval key that allows you to configure how often it checks an Airbyte sync’s status.
Bugfixes
- Fixed an issue where the dagster scheduler would sometimes raise an error if a schedule set its cron_schedule to a list of strings and also had its default status set to AUTOMATICALLY_RUNNING.
- Fixed an issue where the auto-materialize daemon would sometimes raise a RecursionError when processing asset graphs with long upstream dependency chains.
- [ui] Fixed an issue where the Raw Compute Logs dropdown on the Run page sometimes didn’t show the current step name or properly account for retried steps.
Community Contributions
- [dagster-databricks] Fixed a regression causing DatabricksStepLauncher to fail. Thanks @zyd14!
- Fixed an issue where Dagster raised an exception when combining observable source assets with multiple partitions definitions. Thanks @aroig!
- [dagster-databricks] Added support for client authentication with OAuth. Thanks @zyd14!
- [dagster-databricks] Added support for workspace and volumes init scripts in the databricks client. Thanks @zyd14!
- Fixed a missing import in our docs. Thanks @C0DK!
Experimental
- Asset checks are now displayed in the asset graph and sidebar.
- [Breaking] Asset check severity is now set at runtime on AssetCheckResult instead of in the @asset_check definition. Now you can define one check that either errors or warns depending on your check logic. ERROR severity no longer causes the run to fail. We plan to reintroduce this functionality with a different API.
- [Breaking] @asset_check now requires the asset= argument, even if the asset is passed as an input to the decorated function. Example:

      @asset_check(asset=my_asset)
      def my_check(my_asset) -> AssetCheckResult: ...

- [Breaking] AssetCheckSpec now takes asset= instead of asset_key=, and can accept either a key or an asset definition.
- [Bugfix] Asset checks now work on assets with key_prefix set.
- [Bugfix] Execution failure asset checks are now displayed correctly on the checks tab.
Documentation
- [dagster-dbt] Added example of invoking DbtCliResource in custom asset/op to API docs.
- [dagster-dbt] Added reference to explain how a dbt manifest can be created at run time or build time.
- [dagster-dbt] Added reference to outline the steps required to deploy a Dagster and dbt project in CI/CD.
- Miscellaneous fixes to broken links and typos.
1.4.11 / 0.20.11 (libraries)
New
- Dagster code servers now wait to shut down until any calls that they are running have finished, preventing them from stopping while in the middle of executing sensor ticks or other long-running operations.
- The dagster job execute CLI now accepts --op-selection (thanks @silent-lad!)
- [ui] Option (Alt) + R now reloads all code locations (OSS only)
Bugfixes
- Adds a check to validate partition mappings when directly constructing AssetsDefinition instances.
- Assets invoked in composition functions like @graph and @job now work again, fixing a regression introduced in 1.4.5.
- Fixed an issue where a race condition with parallel runs materializing the same asset could cause a run to hang.
- Fixed an issue where including a resource in both a schedule and a job raised a “Cannot specify resource requirements” exception when the definitions were loaded.
- The ins argument to graph_asset is now respected correctly.
- Fixed an issue where the daemon process could sometimes stop with a heartbeat failure when the first sensor it ran took a long time to execute.
- Fixed an issue where dagster dev failed on startup when the DAGSTER_GRPC_PORT environment variable was set.
- deps arguments for an asset can now be specified as an iterable instead of a sequence, allowing for sets to be passed.
- [dagster-aws] Fixed a bug where the S3PickleIOManager didn’t correctly handle missing partitions when allow_missing_partitions was set. Thanks @o-sirawat!
- [dagster-k8s] In the Helm chart, the daemon securityContext setting now applies correctly to all init containers (thanks @maowerner!)
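The difference between an iterable and a sequence in the deps fix can be seen with the standard library's abstract base classes. The sketch below only illustrates why a set does not satisfy a sequence-typed parameter. It is not Dagster's validation code, and normalize_deps is a hypothetical helper.

```python
from __future__ import annotations

from collections.abc import Iterable, Sequence


def normalize_deps(deps: Iterable[str]) -> list[str]:
    """Accept any iterable of names and return them as a sorted list.

    Sorting gives a stable order even when the input is an unordered set.
    """
    return sorted(deps)


deps_as_set = {"orders", "customers"}

if __name__ == "__main__":
    # A set can be looped over, but it has no indexing, so it is not a Sequence.
    print(isinstance(deps_as_set, Sequence))  # False
    print(isinstance(deps_as_set, Iterable))  # True
    print(normalize_deps(deps_as_set))        # ['customers', 'orders']
```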
Community Contributions
- [dagster-databricks] Migrated to use new official databricks Python SDK. Thanks @judahrand!
Experimental
- New APIs for defining and executing checks on software-defined assets. These APIs are very early and subject to change. The corresponding UI has limited functionality. Docs
- Adds a new auto-materialize skip rule AutoMaterializeRule.skip_on_not_all_parents_updated that enforces that an asset can only be materialized if all parents have been materialized since the asset's last materialization.
- Exposed an auto-materialize skip rule, AutoMaterializeRule.skip_on_parent_missing, which is already part of the behavior of the default auto-materialize policy.
- Auto-materialize evaluation history will now be stored for 1 month, instead of 1 week.
- The auto-materialize asset daemon now includes more logs about what it’s doing for each asset in each tick in the Dagster Daemon process output.
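The condition described for skip_on_not_all_parents_updated can be pictured with plain timestamps. The sketch below is a simplified model, not Dagster's implementation. It assumes one timestamp per asset and no partitions, and it treats a never-materialized asset as eligible once all of its parents exist. should_skip and its parameters are hypothetical names.

```python
from datetime import datetime
from typing import Mapping, Optional


def should_skip(
    asset_last_materialized: Optional[datetime],
    parent_last_materialized: Mapping[str, Optional[datetime]],
) -> bool:
    """Return True unless every parent was materialized after the asset was.

    Assumption: a parent that was never materialized (None) counts as
    "not updated", so the asset is skipped.
    """
    for parent_time in parent_last_materialized.values():
        if parent_time is None:
            return True
        if (
            asset_last_materialized is not None
            and parent_time <= asset_last_materialized
        ):
            return True
    return False


if __name__ == "__main__":
    day1, day2, day3 = (datetime(2023, 9, d) for d in (1, 2, 3))
    # One parent is older than the asset's last materialization: skip.
    print(should_skip(day2, {"a": day3, "b": day1}))  # True
    # Both parents are newer: do not skip.
    print(should_skip(day2, {"a": day3, "b": day3}))  # False
```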
Documentation
- [dagster-dbt] Added reference docs for dagster-dbt project scaffold.
Dagster Cloud
- Fixed an issue where the Docker agent would sometimes fail to load code locations with long names with a hostname connection error.
1.4.10 / 0.20.10 (libraries)
Bugfixes
- [dagster-webserver] Fixed an issue that broke loading static files on Windows.
1.4.9 / 0.20.9 (libraries)
Bugfixes
- Fixed an issue that caused some missing icons in the UI.
1.4.8 / 0.20.8 (libraries)
New
- A new @partitioned_config decorator has been added for defining configuration for partitioned jobs. Thanks @danielgafni!
- [dagster-aws] The ConfigurablePickledObjectS3IOManager has been renamed S3PickleIOManager for simplicity. The ConfigurablePickledObjectS3IOManager will continue to be available but is considered deprecated in favor of S3PickleIOManager. There is no change in the functionality of the I/O manager.
- [dagster-azure] The ConfigurablePickledObjectADLS2IOManager has been renamed ADLS2PickleIOManager for simplicity. The ConfigurablePickledObjectADLS2IOManager will continue to be available but is considered deprecated in favor of ADLS2PickleIOManager. There is no change in the functionality of the I/O manager.
- [dagster-dbt] When an exception is raised when invoking a dbt command using DbtCliResource, the exception message now includes a link to the dbt.log produced. This log file can be inspected for debugging.
- [dagster-gcp] The ConfigurablePickledObjectGCSIOManager has been renamed GCSPickleIOManager for simplicity. The ConfigurablePickledObjectGCSIOManager will continue to be available but is considered deprecated in favor of GCSPickleIOManager. There is no change in the functionality of the I/O manager.
Bugfixes
- Fixed a bug that caused a DagsterInvariantViolationError when executing a multi-asset where both assets have self-dependencies on earlier partitions.
- Fixed an asset backfill issue where some runs continue to be submitted after a backfill is requested for cancellation.
- [dagster-dbt] Fixed an issue where using the --debug flag raised an exception in the Dagster framework.
- [ui] “Launched run” and “Launched backfill” toasts in the Dagster UI behave the same way. To open in a new tab, hold the cmd/ctrl key when clicking “View”
- [ui] When opening step compute logs, the view defaults to stderr which aligns with Python’s logging defaults.
- [ui] When viewing a global asset graph with more than 100 assets, the “choose a subset to display” prompt is correctly aligned to the query input.
Community Contributions
- Fix for loading assets with a BackfillPolicy, thanks @ruizh22!
Experimental
- [dagster-graphql] The Dagster GraphQL Python client now includes a default timeout of 300 seconds for each query, to ensure that GraphQL requests don’t hang and never return a response. If you are running a query that is expected to take longer than 300 seconds, you can set the timeout argument when constructing a DagsterGraphQLClient.
- [ui] We are continuing to improve the new horizontal rendering of the asset graph, which you can enable in Settings. This release increases spacing between nodes and improves the traceability of arrows on the graph.
Documentation
- Several Pythonic resources and I/O managers now have API docs entries.
- Updated the tutorial’s example project and content to be more explicit about resources.
- [dagster-dbt] Added API docs examples for DbtCliResource and DbtCliResource.cli(...).
- Some code samples in API docs for InputContext and OutputContext have been fixed. Thanks @Sergey Mezentsev!
Dagster Cloud
- When setting up a new organization by importing a dbt project, using GitLab is now supported.
1.4.7 / 0.20.7 (libraries)
Experimental
- Added a respect_materialization_data_versions option to auto materialization. It can be enabled in dagster.yaml with:

      auto_materialize:
        respect_materialization_data_versions: True

  This flag may be changed or removed in the near future.
1.4.6 / 0.20.6 (libraries)
New
- Ops or assets with multiple outputs that are all required and return type None / Nothing will interpret an explicitly or implicitly returned value None to indicate that all outputs were successful.
- The skip_reason argument to the constructor of SensorResult now accepts a string in addition to a SkipReason.
- [dagster-k8s] Added a step_k8s_config field to k8s_job_executor that allows you to customize the raw Kubernetes config for each step in a job. See the docs for more information.
- [dagster-k8s] Launched run pods now have an additional code location label.
- [dagster-ui] The runs table now lets you toggle which tags are always visible.
- [dagster-dbt] dagster-dbt project scaffold now creates the scaffold in multiple files:
  - constants.py contains a reference to your manifest and dbt project directory
  - assets.py contains your initial dbt assets definitions
  - definitions.py contains the code to load your asset definitions into the Dagster UI
  - schedules.py contains an optional schedule to add for your dbt assets
- [dagster-dbt] Added new methods get_auto_materialize_policy and get_freshness_policy to DagsterDbtTranslator.
- [dagster-fivetran] Sync options can now be passed to load_assets_from_fivetran_instance.
. - [dagster-wandb] W&B IO Manager now handles partitions natively. (Thanks @chrishiste!)
Bugfixes
- Previously, canceling large asset backfills would cause the daemon to time out and display a “not running” error. This has been fixed.
- [dagster-ssh] Previously the SSHResource would warn when allow_host_key_change was set. Now known hosts are always loaded from the system hosts file, and the allow_host_key_change parameter is ignored.
- Previously, when using AutoMaterializePolicies, partitioned assets downstream of partitioned observable source assets could be materialized before their parent partitions were observed. This has been fixed.
Documentation
- @graph_multi_asset now has an API docs entry.
- The GCSComputeLogManager example in the Dagster Instance reference is now correct.
- Several outdated K8s documentation links have been removed from the Customizing your Kubernetes deployment guide.
- Added callouts to the GitHub and GitLab Branch Deployment guides specifying that some steps are optional for Serverless users.
- The “Graphs” page under the “Concepts” section has been renamed to “Op Graphs” and moved under the “Ops” heading.
- [dagster-dbt] Added API examples for @dbt_assets for the following use-cases:
  - Running dbt commands with flags
  - Running dbt commands with --vars
  - Running multiple dbt commands
  - Retrieving dbt artifacts after running a dbt command
  - Invoking other Dagster resources alongside dbt
  - Defining and accessing Dagster config alongside dbt
Dagster Cloud
- The viewer role now has permission to edit their own user tokens.