Datasets

Created: January 26, 2024, Updated: April 17, 2024

Datasets are one of the possible ways to schedule an orchestration. You will learn how to create datasets.

Datasets are virtual objects used only for scheduling orchestrations. For more information, see the Airflow documentation on datasets.

When datasets are defined in the configuration, Bizzflow creates a DAG with a single task for each of them. That task refreshes the dataset. The dataset task can also be added to an orchestration as one of its steps, which makes it possible to trigger another, dependent orchestration.

Datasets Configuration

datasets.json (or datasets.yaml) is a simple list of dataset names. Names may contain only alphanumeric characters, hyphens, and underscores.

datasets.json

[
  "sales-data",
  "marketing-data"
]

Again, everything works the same with YAML:

datasets.yaml

---
- sales-data
- marketing-data
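
The naming rule above can be checked before deploying a configuration. The following is a minimal sketch, not part of Bizzflow itself: the function name `invalid_dataset_names` is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits and that empty names are not allowed.

```python
import json
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits;
# hyphens and underscores are allowed, and a name must not be empty.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def invalid_dataset_names(raw_json):
    """Return the entries of a datasets.json document that break the naming rule.

    This is a hypothetical helper for illustration, not a Bizzflow API.
    """
    names = json.loads(raw_json)
    if not isinstance(names, list):
        raise ValueError("datasets.json must contain a list of names")
    # An entry is invalid if it is not a string or contains other characters.
    return [
        name
        for name in names
        if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name)
    ]


if __name__ == "__main__":
    config = '["sales-data", "marketing-data", "bad name!"]'
    print(invalid_dataset_names(config))  # ['bad name!']
```

An empty result means every name in the list satisfies the rule as interpreted here. The same check could be applied to datasets.yaml after loading it with a YAML parser.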