Releases: dagster-io/dagster

0.9.18

05 Nov 22:32

Breaking Changes

  • CliApiRunLauncher and GrpcRunLauncher have been combined into DefaultRunLauncher.
    If you had one of these run launchers in your dagster.yaml, replace it with DefaultRunLauncher
    or remove the run_launcher: section entirely.
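The migration described above can be sketched as a small helper that operates on an already-parsed dagster.yaml (a plain dict). This is an illustrative sketch only: the function name migrate_run_launcher is hypothetical, and the module path "dagster.core.launcher" used for DefaultRunLauncher is an assumption that should be checked against your installed version.

```python
# Legacy launcher class names named in the release note above.
LEGACY_LAUNCHERS = {"CliApiRunLauncher", "GrpcRunLauncher"}


def migrate_run_launcher(instance_config, drop_section=False):
    """Return a copy of a parsed dagster.yaml dict with a legacy launcher migrated.

    Hypothetical helper: either removes the run_launcher section entirely or
    replaces it with an entry for DefaultRunLauncher.
    """
    migrated = dict(instance_config)
    launcher = migrated.get("run_launcher")
    if not launcher or launcher.get("class") not in LEGACY_LAUNCHERS:
        return migrated  # nothing to migrate
    if drop_section:
        del migrated["run_launcher"]
    else:
        migrated["run_launcher"] = {
            # Assumed module path; verify before relying on it.
            "module": "dagster.core.launcher",
            "class": "DefaultRunLauncher",
        }
    return migrated


old_config = {"run_launcher": {"module": "dagster.core.launcher", "class": "CliApiRunLauncher"}}
new_config = migrate_run_launcher(old_config)
trimmed_config = migrate_run_launcher(old_config, drop_section=True)
```

The input dict is left untouched, so the old and new configurations can be compared before writing anything back to disk.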

New

  • Added a type loader for typed dictionaries: can now load typed dictionaries from config.

Bugfixes

  • Dagit bugfixes and improvements
    • Added error handling for repository errors on startup and reload
    • Repaired timezone offsets
    • Fixed pipeline explorer state for empty pipelines
    • Fixed Scheduler table
  • User-defined k8s config in the pipeline run tags (with key dagster-k8s/config) will now be
    passed to the k8s jobs when using the dagster-k8s and dagster-celery-k8s run launchers.
    Previously, only user-defined k8s config in the pipeline definition’s tag was passed down.

Experimental

  • Run queuing: the new QueuedRunCoordinator enables limiting the number of concurrent runs.
    The DefaultRunCoordinator launches jobs directly from Dagit, preserving existing behavior.
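To illustrate the idea of limiting concurrent runs, here is a toy in-memory queue. It is not the QueuedRunCoordinator implementation and uses none of its API; the class name ToyRunQueue and the max_concurrent_runs parameter are hypothetical names chosen for this sketch.

```python
from collections import deque


class ToyRunQueue:
    """Toy model of run queuing: at most `max_concurrent_runs` runs are in progress."""

    def __init__(self, max_concurrent_runs):
        self.max_concurrent_runs = max_concurrent_runs
        self.queued = deque()    # run ids waiting for a free slot, oldest first
        self.in_progress = set() # run ids currently "launched"

    def submit(self, run_id):
        """Enqueue a run and launch it immediately if a slot is free."""
        self.queued.append(run_id)
        self._launch_ready()

    def finish(self, run_id):
        """Mark a run as finished and hand its slot to the next queued run."""
        self.in_progress.discard(run_id)
        self._launch_ready()

    def _launch_ready(self):
        # Move queued runs into progress until the limit is reached.
        while self.queued and len(self.in_progress) < self.max_concurrent_runs:
            self.in_progress.add(self.queued.popleft())


queue = ToyRunQueue(max_concurrent_runs=2)
for run_id in ("run_a", "run_b", "run_c"):
    queue.submit(run_id)
```

After the three submissions, two runs are in progress and the third waits until finish is called for one of them.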

0.9.18.pre0

05 Nov 22:33
Pre-release

0.9.17

29 Oct 22:20

New

  • [dagster-dask] Allow connecting to an existing scheduler via its address
  • [dagster-aws] Importing dagster_aws.emr no longer transitively imports dagster_spark
  • [dagster-dbt] CLI solids now emit materializations

Community contributions

Bug fixes

  • PipelineDefinitions that do not meet the resource requirements of their types now fail at definition time
  • Dagit bugfixes and improvements
  • Fixed an issue where a run could be left hanging if there was a failure during launch

Deprecated

  • We now warn if you return anything from a function decorated with @pipeline. The return value previously had no effect and was ignored, but upcoming changes will make use of it. Removing the return statement now has no user-visible impact and avoids a breaking change later.
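The kind of check described above can be sketched with a plain decorator. This is a standalone illustration, not the @pipeline implementation: the decorator name warn_on_return and the warning text are hypothetical.

```python
import functools
import warnings


def warn_on_return(fn):
    """Hypothetical stand-in for the check above: warn if `fn` returns a value."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        if result is not None:
            warnings.warn(
                "%s returned a value; it is currently ignored" % fn.__name__,
                UserWarning,
            )
        return result

    return wrapper


@warn_on_return
def returns_value():
    return "ignored"  # triggers the warning


@warn_on_return
def returns_nothing():
    pass  # the recommended form: no return value, no warning
```

Only the first function triggers the warning when called; dropping the return statement silences it without changing behavior.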

0.9.16

22 Oct 23:58

Breaking Changes

  • Removed DagsterKubernetesPodOperator in dagster-airflow.
  • Removed the execute_plan mutation from dagster-graphql.
  • ModeDefinition, PartitionSetDefinition, PresetDefinition, @repository, @pipeline, and ScheduleDefinition names must match the regular expression r"^[A-Za-z0-9_]+$" and must not be Python keywords or disallowed names. See DISALLOWED_NAMES in dagster.core.definitions.utils for an exhaustive list of illegal names.
  • dagster-slack is now upgraded to use slackclient 2.x - this means that this resource will only support Python 3.6 and above.
  • [K8s] Added a health check to the helm chart for user deployments, which relies on a new dagster api grpc-health-check cli command present in Dagster 0.9.16 and later.
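The naming rule in the list above can be approximated with a short check. This is a minimal sketch, not Dagster's own validation code: the function name is_valid_definition_name is hypothetical, and EXAMPLE_DISALLOWED is a placeholder standing in for DISALLOWED_NAMES, whose actual contents live in dagster.core.definitions.utils.

```python
import keyword
import re

# Pattern quoted in the release note above.
VALID_NAME_REGEX = re.compile(r"^[A-Za-z0-9_]+$")

# Placeholder only: the real list is DISALLOWED_NAMES in
# dagster.core.definitions.utils and may differ from this.
EXAMPLE_DISALLOWED = {"context"}


def is_valid_definition_name(name):
    """Return True if `name` matches the pattern and is not a reserved word."""
    if not VALID_NAME_REGEX.match(name):
        return False
    if keyword.iskeyword(name):
        return False
    return name not in EXAMPLE_DISALLOWED
```

Under these assumptions, a name such as daily_ingest_2020 passes, while daily-ingest (contains a hyphen) and class (a Python keyword) are rejected.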

New

  • Add helm chart configurations to allow users to configure a K8sRunLauncher, in place of the CeleryK8sRunLauncher.
  • Added a “Copy URL” button that preserves filter state on the Run page in dagit

Community Contributions

  • Dagster CLI options can now be passed in via environment variables (Thanks @xinbinhuang!)
  • New --limit flag on the dagster run list command (Thanks @haydarai!)

Bugfixes

  • Addressed performance issues loading the /assets table in dagit. Requires a data migration to create a secondary index by running dagster instance reindex.
  • Dagit bugfixes and improvements

0.9.15

15 Oct 22:33

Breaking Changes

  • CeleryDockerExecutor no longer requires a repo_location_name config field.
  • executeRunInProcess was removed from dagster-graphql

New

  • Dagit: Warn on tab removal in playground
  • Display versions CLI: Added a new CLI that displays version information for a memoized run. Called via dagster pipeline list_versions.
  • CeleryDockerExecutor accepts a network field to configure the network settings for the Docker container it connects to for execution.
  • Dagit will now set a statement timeout on supported instance DBs. Defaults to 5s and can be controlled with the --db-statement-timeout flag

Community Contributions

  • dagster grpc requirements are now more friendly for users (thanks @jmo-qap!)
  • dagster.utils now has is_str (thanks @monicayao!)
  • dagster-pandas can now load dataframes from pickle (thanks @mrdrprofuroboros!)
  • dagster-ge validation solid factory now accepts name (thanks @haydarai!)

Bugfixes

  • Dagit bugfixes and improvements
  • Fixed an issue where dagster could fail to load large pipelines.
  • Fixed a bug where experimental arg warning would be thrown even when not using versioned dagster type loaders.
  • Fixed a bug where CeleryDockerExecutor was failing to execute pipelines unless they used a legacy workspace config.
  • Fixed a bug where pipeline runs using IntMetadataEntryData could not be visualized in dagit.

Experimental

  • Improve the output structure of dagster-dbt solids.
  • Version-based memoization over outputs stored in the intermediate store now works

Documentation

  • Fix a code snippet rendering issue in Overview: Assets & Materializations
  • Fixed all python code snippets alignment across docs examples

0.9.14

08 Oct 20:59

New

  • Steps downstream of a failed step no longer report skip events and instead simply do not execute.
  • dagit-debug can load multiple debug files.
  • dagit now has a Debug Console Logging feature flag accessible at /flags.
  • Telemetry metrics are now taken when scheduled jobs are executed.
  • With memoized reexecution, we now only copy outputs that the current plan won't generate
  • Document titles throughout dagit

Community Contributions

  • [dagster-ge] solid factory can now handle arbitrary types (thanks @sd2k!)
  • [dagster-dask] utility options are now available in loader/materializer for Dask DataFrame (thanks @kinghuang!)

Bugfixes

  • Fixed an issue where run termination would sometimes be ignored or leave the execution process hanging
  • [dagster-k8s] fixed issue that would cause timeouts on clusters with many jobs
  • Fixed an issue where reconstructable was unusable in an interactive environment, even when the pipeline is defined in a different module.
  • Bugfixes and UX improvements in dagit

Experimental

  • AssetMaterializations now have an optional “partition” attribute

0.9.12

01 Oct 20:20

Breaking Changes

  • Dagster now warns when a solid, pipeline, or other definition is created with an invalid name (for example, a Python keyword). This warning will become an error in the 0.9.13 release.

Community Contributions

  • Added an int type to EventMetadataEntry (Thanks @ChocoletMousse!)
  • Added a build_composite_solid_definition method to Lakehouse (Thanks @sd2k!)
  • Improved broken link detection in Dagster docs (Thanks @keyz!)

New

  • Improvements to log filtering on Run view in Dagit
  • Improvements to instance level scheduler page
  • Emit engine events when pipeline termination is initiated
  • Published the Lakehouse module to PyPI

Bugfixes

  • Syntax errors in user code now display the file and line number with the error in Dagit.
  • Dask executor no longer fails when using intermediate_storage
  • Fixed an issue using build_reconstructable_pipeline
  • In the Celery K8s executor, we now mark the step as failed when the step job fails
  • Changed DagsterInvalidAssetKey error so that it no longer fails upon being thrown.

Documentation

  • Added API docs for dagster-dbt experimental library.
  • Fixed some cosmetic issues with docs.dagster.io.
  • Added code snippets from Solids examples to test path, and fixed some inconsistencies regarding parameter ordering.
  • Changed to using markers instead of exact line numbers to mark out code snippets

0.9.11

25 Sep 00:06

Breaking Changes

  • [dagster-dask] Removed the compute option from Dask DataFrame materialization configs for all output types. Setting this option to False (default True) would result in a future that is never computed, leading to missing materializations

Community Contributions

New

  • Console log messages are now streamlined to live on a single line per message
  • Added better messaging around $DAGSTER_HOME when it is not set or is improperly set up at Dagster instance startup
  • Tools for exporting a file for debugging a run have been added:
    • dagster debug export - a new CLI entry added for exporting a run by id to a file
    • dagit-debug - a new CLI added for loading dagit with a run to debug
    • dagit now has a button to download the debug file for a run via the action menu on the runs page
  • The dagster api grpc command now defaults to the current working directory if none is specified
  • Added retries to dagster-postgres connections
  • Fixed faulty warning message when invoking the same solid multiple times in the same context
  • Added ability to specify custom liveness probe for celery workers in kubernetes deployment

Bugfixes

  • Fixed a bug where Dagster types like List/Set/Tuple/Dict/Optional were not displaying properly on dagit logs
  • Fixed endless spinners on dagit --empty-workspace
  • Fixed incorrect snapshot banner on pipeline view
  • Fixed visual overlapping of overflowing dagit logs
  • Fixed a bug where hanging runs when executing against a gRPC server could cause the Runs page to be unable to load
  • Fixed a bug in celery integration where celery tasks could return None when an iterable is expected, causing errors in the celery execution loop.

Experimental

  • [lakehouse] Each time a Lakehouse solid updates an asset, it automatically generates an AssetMaterialization event
  • [lakehouse] Lakehouse computed_assets now accept a version argument that describes the version of the computation
  • Setting the “dagster/is_memoized_run” tag to true will cause the run to skip any steps whose versions match the versions of outputs produced in prior runs.
  • [dagster-dbt] Solids for running dbt CLI commands
  • Added extensive documentation to illuminate how versions are computed
  • Added versions for step inputs from config, default values, and from other step outputs
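The skip behavior of a memoized run described above can be pictured with a toy comparison of versions. This sketch says nothing about how Dagster actually computes or stores versions; the function name steps_needing_execution and both dictionaries are hypothetical.

```python
def steps_needing_execution(current_versions, stored_output_versions):
    """Return the step keys whose current version has no matching stored output.

    Toy model: a step is skipped only when a prior run stored an output
    with exactly the same version string.
    """
    return [
        step
        for step, version in current_versions.items()
        if stored_output_versions.get(step) != version
    ]


current = {"load": "v1", "clean": "v2", "report": "v1"}
stored = {"load": "v1", "clean": "v1"}  # "report" has never been stored
to_run = steps_needing_execution(current, stored)
```

Here load would be skipped because its version matches, while clean (changed version) and report (no stored output) would execute.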

0.9.9

17 Sep 21:41

New

  • [Databricks] solids created with create_databricks_job_solid now log a URL for accessing the job in the Databricks UI.
  • The pipeline execute command now defaults to using your current directory if you don’t specify a working directory.

Bugfixes

  • [Celery-K8s] Surface errors to Dagit that previously were not caught in the Celery workers.
  • Fix issues with calling add_run_tags on tags that already exist.
  • Add “Unknown” step state in Dagit’s pipeline run logs view for when pipeline has completed but step has not emitted a completion event

Experimental

  • Version tags for resources and external inputs.

Documentation

  • Fix rendering of example solid config in “Basics of Solids” tutorial.

0.9.8

16 Sep 00:31

New

  • Support for the Dagster step selection DSL: reexecute_pipeline now takes step_selection, which accepts queries like *solid_a.compute++ (i.e., solid_a.compute, all of its ancestors, its immediate descendants, and their immediate descendants). steps_to_execute is deprecated and will be removed in 0.10.0.
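To make the query semantics above concrete, here is a small resolver for a single selection clause over a toy dependency graph. It is a sketch of the behavior described in the note, not Dagster's parser: the function select_steps, the graph representation, and the step names other than solid_a.compute are hypothetical. It assumes a leading * means all ancestors, a trailing * means all descendants, and each + adds one level in the corresponding direction.

```python
import re


def _traverse(start, edges, max_depth):
    """Collect nodes reachable from `start` within `max_depth` hops (None = unlimited)."""
    seen = set()
    frontier = {start}
    depth = 0
    while frontier and (max_depth is None or depth < max_depth):
        next_frontier = set()
        for node in frontier:
            for neighbor in edges.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.add(neighbor)
        frontier = next_frontier
        depth += 1
    return seen


def select_steps(query, upstream):
    """Resolve one clause against a {step: [upstream steps]} mapping."""
    match = re.match(r"^(\*|\+*)([A-Za-z0-9_.]+)(\*|\+*)$", query)
    if match is None:
        raise ValueError("unsupported selection: %r" % query)
    up_marker, step, down_marker = match.groups()
    if step not in upstream:
        raise ValueError("unknown step: %r" % step)

    # Invert the upstream mapping to walk toward descendants.
    downstream = {}
    for node, parents in upstream.items():
        for parent in parents:
            downstream.setdefault(parent, []).append(node)

    up_depth = None if up_marker == "*" else len(up_marker)
    down_depth = None if down_marker == "*" else len(down_marker)
    selected = {step}
    selected |= _traverse(step, upstream, up_depth)
    selected |= _traverse(step, downstream, down_depth)
    return selected


EXAMPLE_UPSTREAM = {
    "load.compute": [],
    "clean.compute": ["load.compute"],
    "solid_a.compute": ["clean.compute"],
    "b.compute": ["solid_a.compute"],
    "c.compute": ["b.compute"],
    "d.compute": ["c.compute"],
    "unrelated.compute": [],
}
selected = select_steps("*solid_a.compute++", EXAMPLE_UPSTREAM)
```

With this graph, the query picks up both ancestors of solid_a.compute and two levels of descendants, leaving out d.compute (three levels down) and the unrelated step.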

Community contributions

  • [dagster-databricks] Improved setup of Databricks environment (Thanks @sd2k!)
  • Enabled frozenlist pickling (Thanks @kinghuang!)

Bugfixes

  • Fixed a bug where pipeline-level hooks were not correctly applied on a pipeline subset.
  • Improved error messages when execute command can't load a code pointer.
  • Fixed a bug that prevented serializing Spark intermediates with configured intermediate storages.

Dagit

  • Enabled subset reexecution via Dagit when part of the pipeline is still running.
  • Made Schedules clickable and link to View All page in the schedule section.
  • Various Dagit UI improvements.

Experimental

  • [lakehouse] Added CLI command for building and executing a pipeline that updates a given set of assets: house update --module package.module --assets my_asset*

Documentation

  • Fixes and improvements.