Data requirements
Created: May 10, 2023, Updated: September 14, 2023
This guideline explains the requirements on your data and their formats.
Let’s make it clear from the beginning: Bizzflow is not a data warehouse. It is a tool that helps you orchestrate your data warehouse. Therefore, it is not Bizzflow’s responsibility to make sure that the data you load into your warehouse are valid. You need to make sure that the data are valid and well-formatted before you attempt to load them.
Data encoding
Each of Bizzflow’s extractors handles input data in its own way. To check whether your input data conform to the extractor’s requirements, please refer to the documentation of the extractor you are using.
A good rule of thumb is to always use either characters that can be transcoded to UTF-8 or ASCII characters only.
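As a rough illustration of this rule of thumb, the following Python sketch classifies raw input bytes before you hand them to an extractor. The function name `check_encoding` is hypothetical and not part of Bizzflow; the sketch only tells you whether the bytes decode cleanly as ASCII or UTF-8, which may or may not match what a particular extractor accepts.

```python
def check_encoding(raw: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'invalid'.

    Hypothetical helper for illustration only; it is not a Bizzflow API.
    """
    try:
        # Pure ASCII input is the safest case.
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Not ASCII, but still valid UTF-8.
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Neither ASCII nor UTF-8, e.g. a Latin-1 or Windows-1250 export.
        return "invalid"


if __name__ == "__main__":
    print(check_encoding(b"plain text"))
    print(check_encoding("Привет".encode("utf-8")))
    print(check_encoding(b"caf\xe9"))  # Latin-1 encoded "café"
```

Input reported as `invalid` would typically need to be re-encoded at the source before loading.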
Please note that if you are using Azure SQL as your data warehouse, you may find it difficult to load non-Latin characters. If you need to use Cyrillic or other non-Latin characters, you may set the default_column_type parameter in your project.json or project.yaml file to nvarchar(max). This will, however, limit the maximum size of all columns to 4000 characters.
Data length
Each warehouse has its own limit on the maximum column length. Please refer to the documentation of your data warehouse. At the time of writing, the limits are as follows:
Warehouse | default_column_type | Max length |
---|---|---|
Azure SQL | nvarchar(max) | 4,000 |
Azure SQL | varchar(max) | 8,000 |
BigQuery | string | 10,485,760 |
Snowflake | varchar | 16,777,216 |
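To catch oversized values before a load fails, you could pre-check your input against the limits in the table above. The following Python sketch does this for CSV text. The names `MAX_LENGTH` and `find_oversized_values` are hypothetical and not part of Bizzflow, and the sketch assumes the limits are counted in characters as measured by Python's `len`; your warehouse may count bytes instead, so treat the result as an approximation.

```python
import csv
import io

# Limits copied from the table above, keyed by (warehouse, default_column_type).
# Assumption: limits are compared against the character count of each value.
MAX_LENGTH = {
    ("Azure SQL", "nvarchar(max)"): 4_000,
    ("Azure SQL", "varchar(max)"): 8_000,
    ("BigQuery", "string"): 10_485_760,
    ("Snowflake", "varchar"): 16_777_216,
}


def find_oversized_values(csv_text, warehouse, column_type):
    """Return (row_number, column_name, length) for every value over the limit.

    Hypothetical helper for illustration only; row numbers start at 1
    and do not count the header row.
    """
    limit = MAX_LENGTH[(warehouse, column_type)]
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            # Skip missing fields and overflow lists produced by ragged rows.
            if isinstance(value, str) and len(value) > limit:
                problems.append((row_number, column, len(value)))
    return problems


if __name__ == "__main__":
    sample = "id,note\n1,short\n2," + "x" * 4001 + "\n"
    print(find_oversized_values(sample, "Azure SQL", "nvarchar(max)"))
```

An empty list means no value exceeded the limit for the chosen warehouse and column type under the character-count assumption.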