Skip to main content

Custom schemas

By default, all dbt models are built in the schema specified in your environment (dbt Cloud) or profile's target (dbt Core). This default schema is called your target schema.

For dbt projects with lots of models, it's common to build models across multiple schemas and group similar models together. For example, you might want to:

  • Group models based on the business unit using the model, creating schemas such as core, marketing, finance and support.
  • Hide intermediate models in a staging schema, and only present models that should be queried by an end user in an analytics schema.

To do this, specify a custom schema. dbt generates the schema name for a model by appending the custom schema to the target schema. For example, <target_schema>_<custom_schema>.

Target schemaCustom schemaResulting schema
analytics_prodNoneanalytics_prod
alice_devNonealice_dev
dbt_cloud_pr_123_456Nonedbt_cloud_pr_123_456
analytics_prodmarketinganalytics_prod_marketing
alice_devmarketingalice_dev_marketing
dbt_cloud_pr_123_456marketingdbt_cloud_pr_123_456_marketing

How do I use custom schemas?

To specify a custom schema for a model, use the schema configuration key. As with any configuration, you can do one of the following:

  • apply this configuration to a specific model by using a config block within a model
  • apply it to a subdirectory of models by specifying it in your dbt_project.yml file
orders.sql
{{ config(schema='marketing') }}

select ...
dbt_project.yml
# models in `models/marketing/ will be built in the "*_marketing" schema
models:
my_project:
marketing:
+schema: marketing

Understanding custom schemas

When first using custom schemas, it's a common misunderstanding to assume that a model only uses the new schema configuration; for example, a model that has the configuration schema: marketing would be built in the marketing schema. However, dbt puts it in a schema like <target_schema>_marketing.

There's a good reason for this deviation. Each dbt user has their own target schema for development (refer to Managing Environments). If dbt ignored the target schema and only used the model's custom schema, every dbt user would create models in the same schema and would overwrite each other's work.

By combining the target schema and the custom schema, dbt ensures that objects it creates in your data warehouse don't collide with one another.

If you prefer to use different logic for generating a schema name, you can change the way dbt generates a schema name (see below).

How does dbt generate a model's schema name?

dbt uses a default macro called generate_schema_name to determine the name of the schema that a model should be built in.

The following code represents the default macro's logic:

{% macro generate_schema_name(custom_schema_name, node) -%}

{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}

{{ default_schema }}

{%- else -%}

{{ default_schema }}_{{ custom_schema_name | trim }}

{%- endif -%}

{%- endmacro %}

💡 Use Jinja's whitespace control to tidy your macros!

When you're modifying macros in your project, you might notice extra white space in your code in the target/compiled folder.

You can remove unwanted spaces and lines with Jinja's whitespace control by using a minus sign. For example, use {{- ... -}} or {%- ... %} around your macro definitions (such as {%- macro generate_schema_name(...) -%} ... {%- endmacro -%}).

Changing the way dbt generates a schema name

If your dbt project has a custom macro called generate_schema_name, dbt will use it instead of the default macro. This allows you to customize the name generation according to your needs.

To customize this macro, copy the example code in the section How does dbt generate a model's schema name into a file named macros/generate_schema_name.sql and make changes as necessary.

Be careful. dbt will ignore any custom generate_schema_name macros included in installed packages.

 Warning: Don't replace `default_schema` in the macro

generate_schema_name arguments

ArgumentDescriptionExample
custom_schema_nameThe configured value of schema in the specified node, or none if a value is not suppliedmarketing
nodeThe node that is currently being processed by dbt{"name": "my_model", "resource_type": "model",...}

Jinja context available in generate_schema_name

If you choose to write custom logic to generate a schema name, it's worth noting that not all variables and methods are available to you when defining this logic. In other words: the generate_schema_name macro is compiled with a limited Jinja context.

The following context methods are available in the generate_schema_name macro:

Jinja contextTypeAvailable
targetVariable
env_varVariable
varVariableLimited, see below
exceptionsMacro
logMacro
Other macros in your projectMacro
Other macros in your packagesMacro

Which vars are available in generate_schema_name?

Globally-scoped variables and variables defined on the command line with --vars are accessible in the generate_schema_name context.

Managing different behaviors across packages

See docs on macro dispatch: "Managing different global overrides across packages"

A built-in alternative pattern for generating schema names

A common customization is to ignore the target schema in production environments, and ignore the custom schema configurations in other environments (such as development and CI).

Production Environment (target.name == 'prod')

Target schemaCustom schemaResulting schema
analytics_prodNoneanalytics_prod
analytics_prodmarketingmarketing

Development/CI Environment (target.name != 'prod')

Target schemaCustom schemaResulting schema
alice_devNonealice_dev
alice_devmarketingalice_dev
dbt_cloud_pr_123_456Nonedbt_cloud_pr_123_456
dbt_cloud_pr_123_456marketingdbt_cloud_pr_123_456

Similar to the regular macro, this approach guarantees that schemas from different environments will not collide.

dbt ships with a macro for this use case — called generate_schema_name_for_env — which is disabled by default. To enable it, add a custom generate_schema_name macro to your project that contains the following code:

macros/get_custom_schema.sql
-- put this in macros/get_custom_schema.sql

{% macro generate_schema_name(custom_schema_name, node) -%}
{{ generate_schema_name_for_env(custom_schema_name, node) }}
{%- endmacro %}

When using this macro, you'll need to set the target name in your production job to prod.

Managing environments

In the generate_schema_name macro examples shown in the built-in alternative pattern section, the target.name context variable is used to change the schema name that dbt generates for models. If the generate_schema_name macro in your project uses the target.name context variable, you must ensure that your different dbt environments are configured accordingly. While you can use any naming scheme you'd like, we typically recommend:

  • dev — Your local development environment; configured in a profiles.yml file on your computer.
  • ci — A continuous integration environment running on pull requests in GitHub, GitLab, and so on.
  • prod — The production deployment of your dbt project, like in dbt Cloud, Airflow, or similar.

If your schema names are being generated incorrectly, double-check your target name in the relevant environment.

For more information, consult the managing environments in dbt Core guide.

0
Loading