Datamarts
Created: May 20, 2021, Updated: June 20, 2022
Datamarts are one of the ways to get your data outside of Bizzflow. You will learn how to create a datamart and how to connect to one you created.
Datamarts are isolated parts of storage (analytical warehouse) that may be made accessible from outside of your Bizzflow project. Physically they are represented as schemas (or datasets, in BigQuery) in the warehouse. Each datamart has a user assigned that is able to access the datamart and nothing else in the warehouse.
Datamart Configuration
Datamarts are managed in file datamarts.json
(or datamarts.yaml
) and are made to be basically copies
of output kexes.
Datamart Reference
Key | Type | Description |
---|---|---|
id | string | required, a name for your datamart (max 38 characters) |
out_kex | string | required, an output stage kex that will be used to copy data from |
dm_kex | string | requred, a datamart kex name |
allowed_tables | array of strings | an optional list of table names allowed to be copied from out_kex |
Example: A simple datamart with table whitelisting
Let’s pretend we have following tables in our kex out_main
:
accounts
deals
invoices
timesheets
customers
We want to connect Tableau to our data but we only want the Tableau workbook to be used by sales reps and we would
like to be able not to share invoices
and timesheets
with them.
datamarts.json
[
{
"id": "sales",
"out_kex": "out_main",
"dm_kex": "dm_sales",
"allowed_tables": ["accounts", "deals", "customers"]
}
]
Again, everything works the same with YAML
:
datamarts.yaml
- id: sales
out_kex: out_main
dm_kex: dm_sales
allowed_tables:
- accounts
- deals
- customers
After running the datamart DAG in Airflow, dm_sales
kex will be created (if it doesn’t exist already)
and tables accounts
, deals
and customers
will be copied to it from kex out_main
.