Changelog#

1.6.11 (core) / 0.22.11 (libraries)#

Bugfixes#

  • Fixed an issue where dagster dev or the Dagster UI would display an error when loading jobs created with op or asset selections.

1.6.10 (core) / 0.22.10 (libraries)#

New#

  • Latency improvements to the scheduler when running many simultaneous schedules.

Bugfixes#

  • The performance of loading the Definitions snapshot from a code server when large @multi_assets are in use has been drastically improved.
  • The snowflake quickstart example project now renames the “by” column to avoid reserved Snowflake names. Thanks @jcampbell!
  • The existing group name (if any) for an asset is now retained if the_asset.with_attributes is called without providing a group name. Previously, the existing group name was erroneously dropped. Thanks @ion-elgreco!
  • [dagster-dbt] Fixed an issue where Dagster events could not be streamed from dbt source freshness.
  • [dagster university] Removed redundant use of MetadataValue in Essentials course. Thanks @stianthaulow!
  • [ui] Increased the max number of plots on the asset plots page to 100.

Breaking Changes#

  • The tag_keys argument on DagsterInstance.get_run_tags is no longer optional. This removes an easy way of accidentally executing an extremely expensive database operation.

Dagster Cloud#

  • The maximum number of concurrent runs across all branch deployments is now configurable. This setting can now be set via GraphQL or the CLI.
  • [ui] In Insights, fixed display of table rows with zero change in value from the previous time period.
  • [ui] Added deployment-level Insights.
  • [ui] Fixed an issue causing void invoices to show up as “overdue” on the billing page.
  • [experimental] Branch deployments can now indicate the new and modified assets in the branch deployment as compared to the main deployment. To enable this feature, turn on the “Enable experimental branch deployment asset graph diffing” user setting.

1.6.9 (core) / 0.22.9 (libraries)#

New#

  • [ui] When viewing logs for a run, the date for a single log row is now shown in the tooltip on the timestamp. This helps when viewing a run that takes place over more than one date.
  • Added suggestions to the error message when selecting asset keys that do not exist as an upstream asset or in an AssetSelection.
  • Improved error messages when trying to materialize a subset of a multi-asset which cannot be subset.
  • [dagster-snowflake] dagster-snowflake now requires snowflake-connector-python>=3.4.0
  • [embedded-elt] @sling_assets accepts an optional name parameter for the underlying op
  • [dagster-openai] dagster-openai library is now available.
  • [dagster-dbt] Added a new setting on DagsterDbtTranslatorSettings called enable_duplicate_source_asset_keys that allows users to set duplicate asset keys for their dbt sources. Thanks @hello-world-bfree!
  • Log messages in the Dagster daemon for unloadable sensors and schedules have been removed.
  • [ui] Search now uses a cache that persists across pageloads which should greatly improve search performance for very large orgs.
  • [ui] Groups/code locations in the asset graph’s sidebar are now sorted alphabetically.

Bugfixes#

  • Fixed issue where the input/output schemas of configurable IOManagers could be ignored when providing explicit input / output run config.
  • Fixed an issue where enum values could not properly have a default value set in a ConfigurableResource.
  • Fixed an issue where graph-backed assets would sometimes lose user-provided descriptions due to a bug in internal copying.
  • [auto-materialize] Fixed an issue introduced in 1.6.7 where updates to ExternalAssets would be ignored when using AutoMaterializePolicies which depended on parent updates.
  • [asset checks] Fixed a bug with asset checks in step launchers.
  • [embedded-elt] Fixed a bug when creating a SlingConnectionResource where a blank keyword argument would be emitted as an environment variable.
  • [dagster-dbt] Fixed a bug where emitting events from dbt source freshness would cause an error.
  • [ui] Fixed a bug where using the “Terminate all runs” button with filters selected would not apply the filters to the action.
  • [ui] Fixed an issue where typing a search query into the search box before the search data was fetched would yield “No results” even after the data was fetched.

Community Contributions#

  • [docs] fixed typo in embedded-elt.mdx (thanks @cameronmartin)!
  • [dagster-databricks] log the url for the run of a databricks job (thanks @smats0n)!
  • Fix missing partition property (thanks christeefy)!
  • Add op_tags to @observable_source_asset decorator (thanks @maxfirman)!
  • [docs] typo in MultiPartitionMapping docs (thanks @dschafer)
  • Allow GitHub Actions to check out a branch from a forked repo for docs changes (ci fix) (thanks hainenber)!

Experimental#

  • [asset checks] UI performance of asset checks related pages has been improved.
  • [dagster-dbt] The class DbtArtifacts has been added for managing the behavior of rebuilding the manifest during development but expecting a pre-built one in production.

Documentation#

  • Added example of writing compute logs to AWS S3 when customizing agent configuration.
  • "Hello, Dagster" is now "Dagster Quickstart" with the option to use a GitHub Codespace to explore Dagster.
  • Improved the guides and reference for running multiple isolated agents with separate queues on ECS.

Dagster Cloud#

  • Microsoft Teams is now supported for alerts. See the documentation for more information.
  • A send sample alert button now exists on both the alert policies page and in the alert policies editor to make it easier to debug and configure alerts without having to wait for an event to kick them off.

1.6.8 (core) / 0.22.8 (libraries)#

Bugfixes#

  • [dagster-embedded-elt] Fixed a bug in the SlingConnectionResource that raised an error when connecting to a database.

Experimental#

  • [asset checks] graph_multi_assets with check_specs now support subsetting.

1.6.7 (core) / 0.22.7 (libraries)#

New#

  • Added a new run_retries.retry_on_op_or_asset_failures setting that can be set to false to make run retries only occur when there is an unexpected failure that crashes the run, allowing run-level retries to co-exist more naturally with op or asset retries. See the docs for more information.
  • dagster dev now sets the environment variable DAGSTER_IS_DEV_CLI allowing subprocesses to know that they were launched in a development context.
  • [ui] The Asset Checks page has been updated to show more information on the page itself rather than in a dialog.
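A subprocess can look for the DAGSTER_IS_DEV_CLI variable mentioned above to adapt its behavior in development. The following is a minimal sketch, not part of Dagster: the helper name is hypothetical, and it checks only for the variable's presence because these notes do not state the value that dagster dev assigns.

```python
import os
from typing import Mapping, Optional


def launched_from_dagster_dev(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if DAGSTER_IS_DEV_CLI is present in the environment.

    Hypothetical helper: only presence is checked, because the value set
    by `dagster dev` is not specified in the release notes.
    """
    env = os.environ if env is None else env
    return "DAGSTER_IS_DEV_CLI" in env


if __name__ == "__main__":
    mode = "development" if launched_from_dagster_dev() else "non-development"
    print(f"Running in a {mode} context")
```

Passing the environment as an argument keeps the check easy to test without modifying os.environ.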

Bugfixes#

  • [ui] Fixed an issue where the UI disallowed creating a dynamic partition if its name contained the “|” pipe character.
  • AssetSpec previously dropped the metadata and code_version fields, resulting in them not being attached to the corresponding asset. This has been fixed.

Experimental#

  • The new @multi_observable_source_asset decorator enables defining a set of assets that can be observed together with the same function.
  • [dagster-embedded-elt] New asset decorator @sling_assets and resource SlingConnectionResource have been added for the dagster-embedded-elt.sling package. Deprecated build_sling_asset, SlingSourceConnection and SlingTargetConnection.
  • Added support for op-concurrency aware run dequeuing for the QueuedRunCoordinator.

Documentation#

  • Fixed reference documentation for isolated agents in ECS.
  • Corrected an example in the Airbyte Cloud documentation.
  • Added API links to OSS Helm deployment guide.
  • Fixed in-line pragmas showing up in the documentation.

Dagster Cloud#

  • Alerts now support Microsoft Teams.
  • [ECS] Fixed an issue where code locations could be left undeleted.
  • [ECS] ECS agents now support setting multiple replicas per code server.
  • [Insights] You can now toggle the visibility of a row in the chart by clicking on the dot for the row in the table.
  • [Users] Added a new column “Licensed role” that shows the user's most permissive role.

1.6.6 (core) / 0.22.6 (libraries)#

New#

  • Dagster officially supports Python 3.12.
  • dagster-polars has been added as an integration. Thanks @danielgafni!
  • [dagster-dbt] @dbt_assets now supports loading projects with semantic models.
  • [dagster-dbt] @dbt_assets now supports loading projects with model versions.
  • [dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
  • [dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
  • [UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.

Bugfixes#

  • Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
  • Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
  • [ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
  • [ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
  • [dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!

Experimental#

  • Observable source assets can now yield ObserveResults with no data_version.
  • You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
  • [ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.
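The “Overdue” condition for observable source assets described above can be pictured as a comparison between the observed data time and the allowed lag. The sketch below is a simplified plain-Python illustration, not Dagster's implementation: it assumes a policy defined only by a maximum lag in minutes, ignores any cron schedule, and uses a hypothetical function name.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(data_time: datetime, now: datetime, maximum_lag_minutes: float) -> bool:
    """Return True if data_time lags `now` by more than the allowed minutes.

    Simplified assumption: the policy has no cron schedule, so the data
    must always be within maximum_lag_minutes of the current time.
    """
    return now - data_time > timedelta(minutes=maximum_lag_minutes)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stale = datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc)  # 90 minutes old
fresh = datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc)  # 30 minutes old

print(is_overdue(stale, now, maximum_lag_minutes=60))  # True
print(is_overdue(fresh, now, maximum_lag_minutes=60))  # False
```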

Documentation#

  • Updated docs to reflect newly-added support for Python 3.12.

Dagster Cloud#

  • [kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling kubernetes services if the agent was interrupted during the middle of being terminated.

1.6.5 (core) / 0.22.5 (libraries)#

New#

  • Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
  • [dagster-k8s] Include k8s pod debug info in run worker failure messages.
  • [dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.
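Lexicographical ordering of partition keys, as mentioned above for run submission, compares keys as strings rather than as dates or numbers. The sketch below only illustrates that ordering in plain Python. It is not Dagster's submission code, and the function name is hypothetical.

```python
def submission_order(partition_keys):
    """Return partition keys sorted lexicographically (as strings)."""
    return sorted(partition_keys)


daily = ["2024-01-10", "2024-01-02", "2024-01-09"]
numeric = ["10", "9", "2"]

# Zero-padded ISO dates sort chronologically under string comparison.
print(submission_order(daily))    # ['2024-01-02', '2024-01-09', '2024-01-10']

# Unpadded numeric keys do not sort numerically: "10" comes before "2".
print(submission_order(numeric))  # ['10', '2', '9']
```

If the submission order matters for non-date keys, zero-padding them makes the lexicographical order match the numeric order.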

Bugfixes#

  • A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
  • [dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
  • [instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
  • [asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
  • [dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.

Experimental#

  • @observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
  • [auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron class allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
  • [Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.
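The idea behind AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron, as described above, is to hold back an asset until every parent has been updated after the latest cron tick. The sketch below illustrates that decision in plain Python under stated assumptions: the tick time is passed in directly rather than computed from a cron string, an update exactly at the tick counts as updated, and both function names are hypothetical. It is not Dagster's implementation.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_parents(latest_tick: datetime,
                  parent_update_times: Dict[str, Optional[datetime]]) -> List[str]:
    """Return the parents that have not been updated since latest_tick.

    A parent with no recorded update (None) is treated as stale.
    """
    return sorted(
        name
        for name, updated in parent_update_times.items()
        if updated is None or updated < latest_tick
    )


def should_skip(latest_tick: datetime,
                parent_update_times: Dict[str, Optional[datetime]]) -> bool:
    """Skip materialization while any parent is still stale."""
    return bool(stale_parents(latest_tick, parent_update_times))


tick = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
parents = {
    "orders": tick + timedelta(hours=1),  # updated after the tick
    "users": tick - timedelta(hours=1),   # last updated before the tick
    "events": None,                       # never updated
}

print(stale_parents(tick, parents))  # ['events', 'users']
print(should_skip(tick, parents))    # True
```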

Documentation#

  • Fixed an error in our asset checks docs. Thanks @vaharoni!
  • Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
  • Fixed an issue on the Hello Dagster! guide that prevented it from loading.
  • Added specific capabilities of the Airflow integration to the Airflow integration page.
  • Re-arranged sections in the I/O manager concept page to make info about using I/O managers versus resources more prominent.

1.3.12 (core) / 0.19.12 (libraries)#

New#

  • The --name argument is now optional when running dagster project from-example.
  • An asset key can now be directly specified via the asset decorator: @asset(key=...).
  • AssetKey now has a with_prefix method.
  • Significant performance improvements when using AutoMaterializePolicys with large numbers of partitions.
  • dagster instance migrate now prints information about changes to the instance database schema.
  • The dagster-cloud-agent helm chart now supports setting K8s labels on the agent deployment.
  • [ui] Step compute logs are shown under “Last Materialization” in the asset sidebar.
  • [ui] Truncated asset names now show a tooltip when hovered in the asset graph.
  • [ui] The “Propagate changes” button has been removed and replaced with “Materialize Stale and Missing” (which was the “Propagate changes” predecessor).

Bugfixes#

  • [ui] Fixed an issue that prevented filtering by date on the job-specific runs tab.

  • [ui] “F” key with modifiers (alt, ctrl, cmd, shift) no longer toggles the filter menu on pages that support filtering.

  • [ui] Fixed empty states on the Runs table view for individual jobs so that they link to materializing an asset or launching a run for the specific job, instead of linking to global pages.

  • [ui] When a run is launched from the Launchpad editor while an editor hint popover is open, the popover remained on the page even after navigation. This has been fixed.

  • [ui] Fixed an issue where clicking on the zoom controls on a DAG view would close the right detail panel for selected nodes.

  • [ui] Fixed an issue with shift-selecting assets with multi-component asset keys.

  • [ui] Fixed an issue with the truncation of the asset stale causes popover.

  • When using a TimeWindowPartitionMapping with a start_offset or end_offset specified, requesting the downstream partitions of a given upstream partition would yield incorrect results. This has been fixed.

  • When using AutoMaterializePolicys with observable source assets, in rare cases, a second run could be launched in response to the same version being observed twice. This has been fixed.

  • When passing in hook_defs to define_asset_job, if any of those hooks had required resource keys, a missing resource error would surface when the hook was executed. This has been fixed.

  • Fixed a typo in a documentation URL in dagster-duckdb-polars tests. The URL now works correctly.
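The TimeWindowPartitionMapping fix above concerns the inverse direction of an offset mapping: given an upstream partition, which downstream partitions depend on it. The sketch below works through both directions in plain Python. It assumes daily partitions shared by both assets, and the function names are hypothetical. It illustrates the arithmetic only and is not Dagster's implementation.

```python
from datetime import date, timedelta
from typing import List


def upstream_partitions(downstream: date, start_offset: int, end_offset: int) -> List[date]:
    """Upstream days that a downstream day depends on.

    The upstream window runs from downstream + start_offset
    to downstream + end_offset, inclusive.
    """
    return [downstream + timedelta(days=o) for o in range(start_offset, end_offset + 1)]


def downstream_partitions(upstream: date, start_offset: int, end_offset: int) -> List[date]:
    """Downstream days whose upstream window contains the given upstream day.

    Solving downstream + o == upstream for each offset o gives
    downstream == upstream - o. Iterating from end_offset down to
    start_offset returns the results in ascending order.
    """
    return [upstream - timedelta(days=o) for o in range(end_offset, start_offset - 1, -1)]


# With start_offset=-1 and end_offset=0, each downstream day depends on
# the same day and the previous day upstream.
print(upstream_partitions(date(2022, 7, 4), -1, 0))
# [datetime.date(2022, 7, 3), datetime.date(2022, 7, 4)]

# Conversely, upstream 2022-07-03 feeds downstream 2022-07-03 and 2022-07-04.
print(downstream_partitions(date(2022, 7, 3), -1, 0))
# [datetime.date(2022, 7, 3), datetime.date(2022, 7, 4)]
```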

Experimental#

  • [dagster-dbt] Added methods to DbtManifest to fetch asset keys of sources and models: DbtManifest.get_asset_key_for_model, DbtManifest.get_asset_key_for_source. These methods are utilities for defining python assets as dependencies of dbt assets via @asset(key=manifest.get_asset_key_for_model(...)).
  • [dagster-dbt] The use of the state_path parameter with DbtManifestAssetSelection has been deprecated, and will be removed in the next minor release.
  • Added experimental support for limiting global op/asset concurrency across runs.

Dependencies#

  • Upper bound on the grpcio package (for dagster) has been removed.

Breaking Changes#

  • Legacy methods of PartitionMapping have been removed. Defining custom partition mappings has been unsupported since 1.1.7.

Community Contributions#

  • [dagster-airbyte] Added the ability to specify asset groups to build_airbyte_assets. Thanks @guy-rvvup!

Documentation#

  • For Dagster Cloud Serverless users, we’ve added our static IP addresses to the Serverless docs.

1.3.11 (core) / 0.19.11 (libraries)#

New#

  • Assets with lazy auto-materialize policies are no longer auto-materialized if they are missing but don’t need to be materialized in order to help downstream assets meet their freshness policies.
  • [ui] The descriptions of auto-materialize policies in the UI now include their skip conditions along with their materialization conditions.
  • [dagster-dbt] Customized asset keys can now be specified for nodes in the dbt project, using meta.dagster.asset_key. This field takes in a list of strings that are used as the components of the generated AssetKey.

    version: 2

    models:
      - name: users
        config:
          meta:
            dagster:
              asset_key: ["my", "custom", "asset_key"]

  • [dagster-dbt] Customized groups can now be specified for models in the dbt project, using meta.dagster.group. This field takes in a string that is used as the Dagster group for the generated software-defined asset corresponding to the dbt model.

    version: 2

    models:
      - name: users
        config:
          meta:
            dagster:
              group: "my_group"

Bugfixes#

  • Fixed an issue where the dagster-msteams and dagster-mlflow packages could be installed with incompatible versions of the dagster package due to a missing pin.
  • Fixed an issue where the dagster-daemon run command sometimes kept code server subprocesses open longer than it needed to, making the process use more memory.
  • Previously, when using @observable_source_assets with AutoMaterializePolicies, it was possible for downstream assets to get “stuck”, not getting materialized when other upstream assets changed, or for multiple downstream materializations to be kicked off in response to the same version being observed multiple times. This has been fixed.
  • Fixed a case where the materialization count for partitioned assets could be wrong.
  • Fixed an error which arose when trying to request resources within run failure sensors.
  • [dagster-wandb] Fixed handling for multi-dimensional partitions. Thanks @chrishiste!

Experimental#

  • [dagster-dbt] Improvements to @dbt_assets:
    • project_dir and target_path in DbtCliTask are converted from type str to type pathlib.Path.
    • In the case that dbt logs are not emitted as json, the log will still be redirected to be printed in the Dagster compute logs, under stdout.

Documentation#

  • Fixed a typo in dagster_aws S3 resources. Thanks @akan72!
  • Fixed a typo in a link on the Dagster Instance page. Thanks @PeterJCLaw!

1.3.10 (core) / 0.19.10 (libraries)#

New#

  • [dagster-dbt] By default, freshness policies and auto materialize policies on dbt assets can now be specified using the dagster field under +meta configuration. The following are equivalent:

Before:

    version: 2

    models:
      - name: users
        config:
          dagster_freshness_policy:
            maximum_lag_minutes: 60
            cron_schedule: "0 9 * * *"
          dagster_auto_materialize_policy:
            type: "lazy"

After:

    version: 2

    models:
      - name: users
        config:
          meta:
            dagster:
              freshness_policy:
                maximum_lag_minutes: 60
                cron_schedule: "0 9 * * *"
              auto_materialize_policy:
                type: "lazy"

  • Added support for Pythonic Config classes to the @configured API, which makes reusing op and asset definitions easier:

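    from dagster import Config, configured, op

    # Config schema accepted by the underlying op.
    class GreetingConfig(Config):
        message: str

    @op
    def greeting_op(config: GreetingConfig):
        print(config.message)

    # Narrower config schema exposed by the configured variant.
    class HelloConfig(Config):
        name: str

    # hello_op reuses greeting_op by translating HelloConfig into GreetingConfig.
    @configured(greeting_op)
    def hello_op(config: HelloConfig):
        return GreetingConfig(message=f"Hello, {config.name}!")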
    
  • Added AssetExecutionContext to replace OpExecutionContext as the context object passed in to @asset functions.

  • TimeWindowPartitionMapping now contains an allow_nonexistent_upstream_partitions argument that, when set to True, allows a downstream partition subset to have nonexistent upstream parents.

  • Unpinned the alembic dependency in the dagster package.

  • [ui] A new “Assets” tab is available from the Overview page.

  • [ui] The Backfills table now includes links to the assets that were targeted by the backfill.

Bugfixes#

  • Dagster is now compatible with a breaking change introduced in croniter==1.4.0. Users of earlier versions of Dagster can pin croniter<1.4.
  • Fixed an issue introduced in 1.3.8 which prevented resources from being bound to sensors when the specified job required late-bound resources.
  • Fixed an issue which prevented specifying resource requirements on a @run_failure_sensor.
  • Fixed an issue where the asset reconciliation sensor failed with an “invalid upstream partitions” error when evaluating time partitions definitions with different start times.
  • [dagster-k8s] Fixed an issue where annotations are not included in the Dagster Helm chart for the pod that is created when configuring the Helm chart to run database migrations.
  • [ui] Fixed an issue with filtering runs by created date on the Runs page.
  • [ui] The “upstream partitions missing” warning no longer appears in the asset backfill dialog if the upstream partitioned asset is a source asset.
  • [dagster-dbt] Fixed an issue where asset dependencies for dbt models with ephemeral models in between them would sometimes be improperly rendered.

Community Contributions#

  • Added support for setting resources in asset and multi_asset sensors. Thanks @plaflamme!
  • Fixed an issue where py.typed was missing in the dagster-graphql package. Thanks @Tanguy-LeFloch!

Experimental#

  • Evaluation history for AutoMaterializePolicys will now be cleared after 1 week.
  • [dagster-dbt] Several improvements to @dbt_assets:
    • profile and target can now be customized on the DbtCli resource.
    • If a partial_parse.msgpack is detected in the target directory of your dbt project, it is now copied into the target directories created by DbtCli to take advantage of partial parsing.
    • The metadata of assets generated by @dbt_assets can now be customized by overriding DbtManifest.node_info_to_metadata.
    • Execution duration of dbt models is now added as default metadata to AssetMaterializations.

Dagster Cloud#

  • Fixed an issue where overriding the container name of a code server pod using serverK8sConfig.containerConfig.name did not actually change the container name.

1.3.9 (core) / 0.19.9 (libraries)#

Dagster Cloud#

  • Fixed an issue in the 1.3.8 release where the Dagster Cloud agent would sometimes fail to start up with an import error.

1.3.8 (core) / 0.19.8 (libraries)#

New#

  • Multipartitioned assets with one time dimension can now depend on earlier partitions of themselves.
  • define_asset_job now accepts a hooks argument.
  • Added support for sqlalchemy==2.x.
  • [ui] The Runs page has been revamped with better filtering support.
  • [ui] The auto-materialize policy page for SDAs using the experimental AutoMaterializePolicy feature now indicates time periods where no materializations happened because no materialization conditions were met.
  • [dagster-k8s] The Dagster Helm chart now includes an additionalInstanceConfig key that allows you to supply additional configuration to the Dagster instance.
  • [dagster-aws] The EcsRunLauncher now uses a different task definition family for each job, instead of registering a new task definition revision each time a different job is launched.
  • [dagster-aws] The EcsRunLauncher now includes a run_ecs_tags config key that lets you configure tags on the launched ECS task for each run.

Bugfixes#

  • When a sensor had a yield statement and also returned a SkipReason, the SkipReason would be ignored. This has been fixed.
  • [dagster-cloud] Fixed a bug in the docker user code launcher that was preventing code location containers from being properly cleaned up.
  • Fixed an issue where the Dagster UI would sometimes raise a “RuntimeError: dictionary changed size during iteration” exception while code servers were being reloaded.
  • Fixed an issue where the Dagster daemon reloaded your code server every 60 seconds when using the new experimental dagster code-server start CLI, instead of only reloading your code when you initiate a reload from the Dagster UI.
  • Fixed a GraphQL error which would occur when loading the default config for jobs without config.
  • [dagster-dbt] Fixed an error which would arise when trying to load assets from a dbt Cloud instance using the Pythonic-style resource.

Community Contributions#

  • Added the ability to specify metadata on asset jobs, by adding the metadata parameter to define_asset_job (Thanks Elliot2718!)
  • [dagster-databricks] Connected databricks stdout to local stdout, to be handled by the compute log manager (Thanks loerinczy!)
  • [dagster-census] Fixed poll_sync_run to handle the “preparing” status from the Census API (Thanks ldnicolasmay!)

Experimental#

  • @observable_source_asset-decorated functions can now return a DataVersionsByPartition to record versions for partitions.
  • @dbt_assets
    • DbtCliTask objects created by invoking DbtCli.cli(...) now have a method .is_successful(), which returns a boolean representing whether the underlying CLI process executed the dbt command successfully.
    • Descriptions of assets generated by @dbt_assets can now be customized by overriding DbtManifest.node_info_to_description.
    • IO Managers can now be configured on @dbt_assets.

Documentation#

  • New guide on using Dagster to manage machine learning pipelines

Dagster Cloud#

  • Added support for streaming upload of compute logs to Dagster Cloud
  • The ECS agent now supports setting server_ecs_tags and run_ecs_tags that apply to each service or task created by the agent. See the docs for more information.
  • Fixed run filtering for calls to instance.get_run_partition_data in Dagster Cloud.