Dagster allows you to define and execute checks on your software-defined assets. Each asset check verifies some property of a data asset, e.g. that it has no null values in a particular column.
(experimental) This API may break in future versions, even between dot releases.
Create a definition for how to execute an asset check.
asset (Union[AssetKey, Sequence[str], str, AssetsDefinition, SourceAsset]) – The asset that the check applies to.
name (Optional[str]) – The name of the check. If not specified, the name of the decorated function will be used. Checks for the same asset must have unique names.
description (Optional[str]) – The description of the check.
required_resource_keys (Optional[Set[str]]) – A set of keys for resources that are required by the function that executes the check. These can alternatively be specified by including resource-typed parameters in the function signature.
config_schema (Optional[ConfigSchema]) – The configuration schema for the check’s underlying op. If set, Dagster will check that config provided for the op matches this schema and fail if it does not. If not set, Dagster will accept any config provided for the op.
op_tags (Optional[Dict[str, Any]]) – A dictionary of tags for the op that executes the check. Frameworks may expect and require certain metadata to be attached to an op. Values that are not strings will be JSON-encoded and must meet the criterion that json.loads(json.dumps(value)) == value.
compute_kind (Optional[str]) – A string to represent the kind of computation that executes the check, e.g. “dbt” or “spark”.
retry_policy (Optional[RetryPolicy]) – The retry policy for the op that executes the check.
Produces an AssetChecksDefinition object.
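The round-trip criterion stated for op_tags can be tried out with the standard library alone. The sketch below is illustrative and not part of Dagster: the helper names and the sample tag keys are hypothetical, and it only shows which kinds of values would satisfy json.loads(json.dumps(value)) == value.

```python
import json
from typing import Any, Dict, List


def survives_json_round_trip(value: Any) -> bool:
    """Return True if json.loads(json.dumps(value)) == value."""
    try:
        return json.loads(json.dumps(value)) == value
    except (TypeError, ValueError):
        # The value could not be serialized at all (for example, a set).
        return False


def invalid_tag_keys(op_tags: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: keys whose non-string values fail the criterion."""
    return [
        key
        for key, value in op_tags.items()
        if not isinstance(value, str) and not survives_json_round_trip(value)
    ]


# Sample tags (all names are made up for illustration).
tags = {
    "team": "data-platform",  # plain string, no encoding needed
    "limits": {"cpu": 2},     # dict with string keys round-trips unchanged
    "shape": (3, 4),          # tuple is decoded as a list, so it fails
    "by_id": {1: "a"},        # int key is decoded as "1", so it fails
}

print(invalid_tag_keys(tags))  # prints ['shape', 'by_id']
```

Tuples and dictionaries with non-string keys are the usual surprises here, because JSON has no tuple type and only allows string keys.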
Example
from dagster import asset, asset_check, AssetCheckResult

@asset
def my_asset() -> None:
    ...

@asset_check(asset=my_asset, description="Check that my asset has enough rows")
def my_asset_has_enough_rows() -> AssetCheckResult:
    num_rows = ...  # placeholder: compute the row count here
    return AssetCheckResult(passed=num_rows > 5, metadata={"num_rows": num_rows})
Example with a DataFrame output
from dagster import asset, asset_check, AssetCheckResult
from pandas import DataFrame

@asset
def my_asset() -> DataFrame:
    ...

# The check receives the asset's DataFrame output as its input.
@asset_check(asset=my_asset, description="Check that my asset has enough rows")
def my_asset_has_enough_rows(my_asset: DataFrame) -> AssetCheckResult:
    num_rows = my_asset.shape[0]
    return AssetCheckResult(passed=num_rows > 5, metadata={"num_rows": num_rows})
(experimental) This API may break in future versions, even between dot releases.
The result of an asset check.
The name of the check.
Type: Optional[str]
The pass/fail result of the check.
Type: bool
Arbitrary metadata about the asset. Keys are displayed string labels, and values are one of the following: string, float, int, JSON-serializable dict, JSON-serializable list, or one of the data classes returned by a MetadataValue static method.
Type: Optional[Dict[str, RawMetadataValue]]
Severity of the check. Defaults to ERROR.
(experimental) This API may break in future versions, even between dot releases.
Defines information about an asset check, except how to execute it.
AssetCheckSpec is often used as an argument to decorators that decorate a function that can execute multiple checks, e.g. @asset and @multi_asset. It defines one of the checks that will be executed inside that function.
name (str) – Name of the check.
asset (Union[AssetKey, Sequence[str], str, AssetsDefinition, SourceAsset]) – The asset that the check applies to.
description (Optional[str]) – Description for the check.
(experimental) This API may break in future versions, even between dot releases.
Severity level for an asset check.
Severities:
WARN: If the check fails, don’t fail the step.
ERROR: If the check fails, fail the step and, within the run, skip materialization of any assets that are downstream of the asset being checked.