Releases: dagster-io/dagster
0.9.18
Breaking Changes
- CliApiRunLauncher and GrpcRunLauncher have been combined into DefaultRunLauncher. If you had one of these run launchers in your dagster.yaml, replace it with DefaultRunLauncher or remove the run_launcher: section entirely.
New
- Added a type loader for typed dictionaries: can now load typed dictionaries from config.
Bugfixes
- Dagit bugfixes and improvements
  - Added error handling for repository errors on startup and reload
  - Repaired timezone offsets
  - Fixed pipeline explorer state for empty pipelines
  - Fixed Scheduler table
- User-defined k8s config in the pipeline run tags (with key dagster-k8s/config) will now be passed to the k8s jobs when using the dagster-k8s and dagster-celery-k8s run launchers. Previously, only user-defined k8s config in the pipeline definition’s tag was passed down.
Experimental
- Run queuing: the new QueuedRunCoordinator enables limiting the number of concurrent runs. The DefaultRunCoordinator launches jobs directly from Dagit, preserving existing behavior.
0.9.18.pre0
0.9.17
New
- [dagster-dask] Allow connecting to an existing scheduler via its address
- [dagster-aws] Importing dagster_aws.emr no longer transitively imports dagster_spark
- [dagster-dbt] CLI solids now emit materializations
Community contributions
- Docs fix (Thanks @kaplanbora!)
Bug fixes
- PipelineDefinitions that do not meet resource requirements for their types will now fail at definition time
- Dagit bugfixes and improvements
- Fixed an issue where a run could be left hanging if there was a failure during launch
Deprecated
- We now warn if you return anything from a function decorated with @pipeline. This return value actually had no impact at all and was ignored, but we are making changes that will use that value in the future. By changing your code to not return anything now you will avoid any breaking changes with zero user-visible impact.
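As a rough illustration of the deprecation above, the sketch below uses a toy stand-in decorator (also named pipeline here, but it is not Dagster's implementation) that warns whenever the decorated function returns a value. The function names returns_value and returns_nothing are hypothetical.

```python
import functools
import warnings


def pipeline(fn):
    """Toy stand-in for a @pipeline decorator; NOT Dagster's real implementation."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        if result is not None:
            # Mirrors the note above: the returned value is ignored, so warn.
            warnings.warn(
                f"{fn.__name__} returned a value; return values from "
                "@pipeline functions are ignored and should be removed.",
                DeprecationWarning,
                stacklevel=2,
            )
        return None

    return wrapper


@pipeline
def returns_value():
    return "ignored"  # this form would trigger the warning


@pipeline
def returns_nothing():
    pass  # preferred form: no return statement
```

Dropping the return statement, as in returns_nothing, is the change the note recommends.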
0.9.16
Breaking Changes
- Removed DagsterKubernetesPodOperator in dagster-airflow.
- Removed the execute_plan mutation from dagster-graphql.
- ModeDefinition, PartitionSetDefinition, PresetDefinition, @repository, @pipeline, and ScheduleDefinition names must pass the regular expression r"^[A-Za-z0-9_]+$" and not be python keywords or disallowed names. See DISALLOWED_NAMES in dagster.core.definitions.utils for an exhaustive list of illegal names.
- dagster-slack is now upgraded to use slackclient 2.x - this means that this resource will only support Python 3.6 and above.
- [K8s] Added a health check to the helm chart for user deployments, which relies on a new dagster api grpc-health-check cli command present in Dagster 0.9.16 and later.
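The naming rule above can be checked ahead of an upgrade with a small standalone helper. This is a minimal sketch, not Dagster's own validation code: the function name is_valid_definition_name and the EXAMPLE_DISALLOWED_NAMES set are hypothetical placeholders, and the real list lives in DISALLOWED_NAMES in dagster.core.definitions.utils.

```python
import keyword
import re

# Regular expression quoted in the release note above.
VALID_NAME_REGEX = re.compile(r"^[A-Za-z0-9_]+$")

# Hypothetical placeholder entries; the authoritative list is DISALLOWED_NAMES
# in dagster.core.definitions.utils and is not reproduced here.
EXAMPLE_DISALLOWED_NAMES = {"context", "config"}


def is_valid_definition_name(name: str) -> bool:
    """Return True if name passes the regex and is neither a keyword nor disallowed."""
    if not VALID_NAME_REGEX.match(name):
        return False
    if keyword.iskeyword(name):
        return False
    return name not in EXAMPLE_DISALLOWED_NAMES


if __name__ == "__main__":
    for candidate in ("my_pipeline_2", "my-pipeline", "class"):
        print(candidate, is_valid_definition_name(candidate))
```

Names containing hyphens or spaces fail the regex, and names such as class fail the keyword check.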
New
- Add helm chart configurations to allow users to configure a K8sRunLauncher, in place of the CeleryK8sRunLauncher.
- “Copy URL” button to preserve filter state on Run page in dagit
Community Contributions
- Dagster CLI options can now be passed in via environment variables (Thanks @xinbinhuang!)
- New --limit flag on the dagster run list command (Thanks @haydarai!)
Bugfixes
- Addressed performance issues loading the /assets table in dagit. Requires a data migration to create a secondary index by running dagster instance reindex.
- Dagit bugfixes and improvements
0.9.15
Breaking Changes
- CeleryDockerExecutor no longer requires a repo_location_name config field.
- executeRunInProcess was removed from dagster-graphql
New
- Dagit: Warn on tab removal in playground
- Display versions CLI: Added a new CLI that displays version information for a memoized run. Called via dagster pipeline list_versions.
- CeleryDockerExecutor accepts a network field to configure the network settings for the Docker container it connects to for execution.
- Dagit will now set a statement timeout on supported instance DBs. Defaults to 5s and can be controlled with the --db-statement-timeout flag
Community Contributions
- dagster grpc requirements are now more friendly for users (thanks @jmo-qap!)
- dagster.utils now has is_str (thanks @monicayao!)
- dagster-pandas can now load dataframes from pickle (thanks @mrdrprofuroboros!)
- dagster-ge validation solid factory now accepts name (thanks @haydarai!)
Bugfixes
- Dagit bugfixes and improvements
- Fixed an issue where dagster could fail to load large pipelines.
- Fixed a bug where experimental arg warning would be thrown even when not using versioned dagster type loaders.
- Fixed a bug where CeleryDockerExecutor was failing to execute pipelines unless they used a legacy workspace config.
- Fixed a bug where pipeline runs using IntMetadataEntryData could not be visualized in dagit.
Experimental
- Improve the output structure of dagster-dbt solids.
- Version-based memoization over outputs stored in the intermediate store now works
Documentation
- Fix a code snippet rendering issue in Overview: Assets & Materializations
- Fixed all python code snippets alignment across docs examples
0.9.14
New
- Steps downstream of a failed step no longer report skip events and instead simply do not execute.
- dagit-debug can load multiple debug files.
- dagit now has a Debug Console Logging feature flag accessible at /flags.
- Telemetry metrics are now taken when scheduled jobs are executed.
- With memoized reexecution, we now only copy outputs that the current plan won't generate
- Document titles throughout dagit
Community Contributions
- [dagster-ge] solid factory can now handle arbitrary types (thanks @sd2k!)
- [dagster-dask] utility options are now available in loader/materializer for Dask DataFrame (thanks @kinghuang!)
Bugfixes
- Fixed an issue where run termination would sometimes be ignored or leave the execution process hanging
- [dagster-k8s] fixed issue that would cause timeouts on clusters with many jobs
- Fixed an issue where reconstructable was unusable in an interactive environment, even when the pipeline is defined in a different module.
- Bugfixes and UX improvements in dagit
Experimental
- AssetMaterializations now have an optional “partition” attribute
0.9.12
Breaking Changes
- Dagster now warns when a solid, pipeline, or other definition is created with an invalid name (for example, a Python keyword). This warning will become an error in the 0.9.13 release.
Community Contributions
- Added an int type to EventMetadataEntry (Thanks @ChocoletMousse!)
- Added a build_composite_solid_definition method to Lakehouse (Thanks @sd2k!)
- Improved broken link detection in Dagster docs (Thanks @keyz!)
New
- Improvements to log filtering on Run view in Dagit
- Improvements to instance level scheduler page
- Emit engine events when pipeline termination is initiated
- Published the Lakehouse module to PyPI
Bugfixes
- Syntax errors in user code now display the file and line number with the error in Dagit.
- Dask executor no longer fails when using intermediate_storage
- Fixes an issue using build_reconstructable_pipeline
- In the Celery K8s executor, we now mark the step as failed when the step job fails
- Changed DagsterInvalidAssetKey error so that it no longer fails upon being thrown.
Documentation
- Added API docs for dagster-dbt experimental library.
- Fixed some cosmetic issues with docs.dagster.io.
- Added code snippets from Solids examples to test path, and fixed some inconsistencies regarding parameter ordering.
- Changed to using markers instead of exact line numbers to mark out code snippets
0.9.11
Breaking Changes
- [dagster-dask] Removed the compute option from Dask DataFrame materialization configs for all output types. Setting this option to False (default True) would result in a future that is never computed, leading to missing materializations
Community Contributions
- Added a Dask resource (Thanks @kinghuang!)
New
- Console log messages are now streamlined to live on a single line per message
- Added better messaging around $DAGSTER_HOME if it is not set or improperly set up when starting up a Dagster instance
- Tools for exporting a file for debugging a run have been added:
  - dagster debug export - a new CLI entry added for exporting a run by id to a file
  - dagit-debug - a new CLI added for loading dagit with a run to debug
  - dagit now has a button to download the debug file for a run via the action menu on the runs page
- The dagster api grpc command now defaults to the current working directory if none is specified
- Added retries to dagster-postgres connections
- Fixed faulty warning message when invoking the same solid multiple times in the same context
- Added ability to specify custom liveness probe for celery workers in kubernetes deployment
Bugfixes
- Fixed a bug where Dagster types like List/Set/Tuple/Dict/Optional were not displaying properly on dagit logs
- Fixed endless spinners on dagit --empty-workspace
- Fixed incorrect snapshot banner on pipeline view
- Fixed visual overlapping of overflowing dagit logs
- Fixed a bug where hanging runs when executing against a gRPC server could cause the Runs page to be unable to load
- Fixed a bug in celery integration where celery tasks could return None when an iterable is expected, causing errors in the celery execution loop.
Experimental
- [lakehouse] Each time a Lakehouse solid updates an asset, it automatically generates an AssetMaterialization event
- [lakehouse] Lakehouse computed_assets now accept a version argument that describes the version of the computation
- Setting the “dagster/is_memoized_run” tag to true will cause the run to skip any steps whose versions match the versions of outputs produced in prior runs.
- [dagster-dbt] Solids for running dbt CLI commands
- Added extensive documentation to illuminate how versions are computed
- Added versions for step inputs from config, default values, and from other step outputs
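The memoized-run behavior described above (skip any step whose version matches the version of an output from a prior run) can be sketched in a few lines. This is a toy model, not Dagster's implementation: the function plan_steps_to_run, the step names, and the dictionary shapes are all hypothetical, and only the tag key comes from the note above.

```python
from typing import Dict, List

# Tag key quoted in the release note above.
MEMOIZED_RUN_TAG = "dagster/is_memoized_run"


def plan_steps_to_run(
    step_versions: Dict[str, str],
    stored_output_versions: Dict[str, str],
    tags: Dict[str, str],
) -> List[str]:
    """Return the step keys that still need to execute, in their given order.

    Hypothetical model: a step is skipped only when the memoization tag is
    "true" and a prior run stored an output with the same version.
    """
    if tags.get(MEMOIZED_RUN_TAG) != "true":
        return list(step_versions)
    return [
        step
        for step, version in step_versions.items()
        if stored_output_versions.get(step) != version
    ]


if __name__ == "__main__":
    versions = {"extract": "v1", "transform": "v2", "load": "v1"}
    stored = {"extract": "v1", "transform": "v1"}
    print(plan_steps_to_run(versions, stored, {MEMOIZED_RUN_TAG: "true"}))
```

In this example, extract is skipped because its stored version matches, while transform (version changed) and load (no stored output) still run.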
0.9.9
New
- [Databricks] solids created with create_databricks_job_solid now log a URL for accessing the job in the Databricks UI.
- The pipeline execute command now defaults to using your current directory if you don’t specify a working directory.
Bugfixes
- [Celery-K8s] Surface errors to Dagit that previously were not caught in the Celery workers.
- Fix issues with calling add_run_tags on tags that already exist.
- Add “Unknown” step state in Dagit’s pipeline run logs view for when pipeline has completed but step has not emitted a completion event
Experimental
- Version tags for resources and external inputs.
Documentation
- Fix rendering of example solid config in “Basics of Solids” tutorial.
0.9.8
New
- Support for the Dagster step selection DSL: reexecute_pipeline now takes step_selection, which accepts queries like *solid_a.compute++ (i.e., solid_a.compute, all of its ancestors, its immediate descendants, and their immediate descendants). steps_to_execute is deprecated and will be removed in 0.10.0.
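To make the query syntax above concrete, here is a minimal sketch of how a selection such as *solid_a.compute++ could resolve against a small graph. It is a toy resolver, not Dagster's parser: the DOWNSTREAM graph and the select_steps function are hypothetical, and support for a leading + (one layer of ancestors per +) and a trailing * (all descendants) is assumed by analogy with the example in the note.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical toy DAG: each step key maps to its direct downstream steps.
DOWNSTREAM: Dict[str, List[str]] = {
    "load.compute": ["solid_a.compute"],
    "solid_a.compute": ["solid_b.compute"],
    "solid_b.compute": ["solid_c.compute"],
    "solid_c.compute": ["solid_d.compute"],
    "solid_d.compute": [],
}


def _invert(graph: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build the upstream adjacency map from the downstream one."""
    upstream: Dict[str, List[str]] = {key: [] for key in graph}
    for parent, children in graph.items():
        for child in children:
            upstream[child].append(parent)
    return upstream


def _walk(start: str, edges: Dict[str, List[str]], max_depth: Optional[int]) -> Set[str]:
    """Collect nodes reachable from start within max_depth hops (None = unlimited)."""
    seen: Set[str] = set()
    frontier = {start}
    depth = 0
    while frontier and (max_depth is None or depth < max_depth):
        frontier = {nxt for cur in frontier for nxt in edges[cur]} - seen
        seen |= frontier
        depth += 1
    return seen


def select_steps(query: str, graph: Dict[str, List[str]] = DOWNSTREAM) -> Set[str]:
    """Resolve a query of the form [*|+...]step_key[*|+...] against the toy graph."""
    match = re.fullmatch(r"(\*|\+*)([A-Za-z0-9_.]+)(\*|\+*)", query)
    if match is None or match.group(2) not in graph:
        raise ValueError(f"cannot resolve step selection: {query!r}")
    prefix, key, suffix = match.groups()
    up_depth = None if prefix == "*" else len(prefix)
    down_depth = None if suffix == "*" else len(suffix)
    return {key} | _walk(key, _invert(graph), up_depth) | _walk(key, graph, down_depth)


if __name__ == "__main__":
    print(sorted(select_steps("*solid_a.compute++")))
```

On this toy graph, *solid_a.compute++ selects load.compute (the only ancestor), solid_a.compute itself, and two layers of descendants (solid_b.compute and solid_c.compute), leaving solid_d.compute out.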
Community contributions
- [dagster-databricks] Improved setup of Databricks environment (Thanks @sd2k!)
- Enabled frozenlist pickling (Thanks @kinghuang!)
Bugfixes
- Fixed a bug where pipeline-level hooks were not correctly applied on a pipeline subset.
- Improved error messages when execute command can't load a code pointer.
- Fixed a bug that prevented serializing Spark intermediates with configured intermediate storages.
Dagit
- Enabled subset reexecution via Dagit when part of the pipeline is still running.
- Made Schedules clickable and link to View All page in the schedule section.
- Various Dagit UI improvements.
Experimental
- [lakehouse] Added CLI command for building and executing a pipeline that updates a given set of assets:
house update --module package.module --assets my_asset*
Documentation
- Fixes and improvements.