Datamarts

Created: May 20, 2021, Updated: June 20, 2022

Datamarts are one of the ways to get your data outside of Bizzflow. You will learn how to create a datamart and how to connect to one you created.

Datamarts are isolated parts of storage (analytical warehouse) that may be made accessible from outside of your Bizzflow project. Physically they are represented as schemas (or datasets, in BigQuery) in the warehouse. Each datamart has a user assigned that is able to access the datamart and nothing else in the warehouse.

Datamart Configuration

Datamarts are managed in file datamarts.json (or datamarts.yaml) and are made to be basically copies of output kexes.

Datamart Reference

KeyTypeDescription
idstringrequired, a name for your datamart (max 38 characters)
out_kexstringrequired, an output stage kex that will be used to copy data from
dm_kexstringrequred, a datamart kex name
allowed_tablesarray of stringsan optional list of table names allowed to be copied from out_kex

Example: A simple datamart with table whitelisting

Let’s pretend we have following tables in our kex out_main:

  • accounts
  • deals
  • invoices
  • timesheets
  • customers

We want to connect Tableau to our data but we only want the Tableau workbook to be used by sales reps and we would like to be able not to share invoices and timesheets with them.

datamarts.json

[
  {
    "id": "sales",
    "out_kex": "out_main",
    "dm_kex": "dm_sales",
    "allowed_tables": ["accounts", "deals", "customers"]
  }
]

Again, everything works the same with YAML:

datamarts.yaml

- id: sales
  out_kex: out_main
  dm_kex: dm_sales
  allowed_tables:
    - accounts
    - deals
    - customers

After running the datamart DAG in Airflow, dm_sales kex will be created (if it doesn’t exist already) and tables accounts, deals and customers will be copied to it from kex out_main.