Design patterns for dynamic tables¶
This page collects common recipes for structuring dynamic table pipelines. Use the decision table below to find the right starting point. For per-table configuration (target lag, refresh mode, warehouse selection), see Quick-start best practices for dynamic tables.
Choose a pattern¶
| Pattern | Use when | How it works |
|---|---|---|
| Multi-table pipeline | Your definition has more than one logical step | TARGET_LAG = DOWNSTREAM on intermediates; medallion layers |
| Controller table | You need coordinated refreshes across independent pipelines | Zero-row coordination table with independent leaf lags |
| SCD Type 1 deduplication | Source appends CDC records; you need current state | QUALIFY ROW_NUMBER; handles out-of-order arrival |
| Stream-static join | CDC enrichment with dimension lookups and delete propagation | MERGE INTO SELF with CHANGES and LEFT OUTER JOIN |
| Top-K leaderboard | Maintain a bounded result set that shrinks or grows | MERGE INTO SELF reading current state through FROM SELF |
| Daily snapshot | Accumulate periodic snapshots without CHANGES | INSERT INTO SELF with ROW_TIMESTAMP |
Decompose a definition into a multi-table pipeline¶
Break a monolithic definition into a pipeline of two or more dynamic tables, each handling one logical step. Many teams organize pipeline layers using medallion vocabulary (bronze, silver, gold). In dynamic table terms: bronze is the raw landing table, silver is a cleaned or conformed dynamic table with TARGET_LAG = DOWNSTREAM, and gold is an aggregated dynamic table with a time-based freshness goal.
Decomposing joins and aggregations across multiple dynamic tables enables each step to refresh incrementally. Place joins in the first dynamic table and aggregations in the next.
Important
Avoid combining joins, aggregations, and window functions in a single dynamic table. This forces the entire computation to re-run on every refresh. Separating these operators into different pipeline stages enables incremental refresh for each step.
The silver table (dt_orders_enriched) uses TARGET_LAG = DOWNSTREAM so it refreshes only
when the gold table needs fresh data. The gold table (dt_orders_daily) owns the
time-based freshness goal and drives the entire pipeline. A common variation adds a
second gold table that rolls up hourly data into daily totals as an additional aggregation
layer.
Synchronize multiple pipelines with a controller table¶
A controller dynamic table merges independent pipelines into a single coordinated refresh. When triggered, the controller reads all inputs at the same timestamp. Between controller refreshes, leaf tables refresh on their own schedules. Use this pattern when a dashboard or downstream consumer joins data from independent domains that must be temporally consistent.
The controller is a zero-row dynamic table that depends on multiple leaf tables. Leaves have their own time-based target lag and refresh independently between controller refreshes.
If the controller breaks, leaves continue refreshing independently on their own schedules. If one leaf dynamic table fails, the controller also fails. Monitor all tables in the controller group, not the controller alone.
Important
Avoid listing too many leaf tables in the FROM clause. The cross join in the controller is never run (because of LIMIT 0), but compilation time grows with the number of tables. For pipelines with more than five or six leaves, consider grouping leaves into intermediate controllers.
Latest row deduplication with SCD Type 1¶
When your base table receives append-only change data capture (CDC) records, use QUALIFY ROW_NUMBER to keep only the latest row per business key. The window function picks the correct row regardless of ingestion order, handling out-of-order arrival without additional logic.
Using SELECT * EXCLUDE supports incremental schema evolution: columns added to or dropped from the
base table automatically propagate without changing the dynamic table definition.
For more on schema evolution, see Modify dynamic tables.
To handle soft deletes, add a filter to the QUALIFY expression:
This expresses soft-delete logic as a single window expression with no additional orchestration.
For best incremental performance, use a monotonic ORDER BY column (a sequence number or timestamp that only increases). Keep the QUALIFY expression as the top-level construct in the dynamic table. If you need further transformations, place them in a downstream dynamic table rather than combining them in the same definition.
Tip
Use ROW_TIMESTAMP as the built-in mechanism for tracking when rows were last refreshed.
This is cleaner than adding CURRENT_TIMESTAMP() as a column in the definition, which
forces full refresh mode.
Stream-static join with full CDC (MERGE)¶
A stream-static join reads only the rows that changed since the last refresh (from a source
using CHANGES()) and joins them with a dimension table that provides lookup values. The
dimension table is read in full on each refresh but doesn’t drive which rows are processed.
For an append-only variant using INSERT INTO SELF, see Append-only enrichment.
This pattern enriches CDC changes from stock with dimension lookups before merging:
stockchanges are enriched with dimension tables (products,warehouses) through LEFT OUTER JOIN.- ROW_NUMBER deduplication ensures one source row per key for the MERGE.
- When a key has both DELETE and INSERT in the same interval (an update), the INSERT is kept.
- Deletes propagate, inserts create new rows, and updates modify existing rows.
Top-K leaderboard (MERGE reading SELF)¶
This pattern uses custom incrementalization to maintain a bounded top-K result set that shrinks or grows as scores change.
This pattern maintains a bounded top-K result set that shrinks or grows as scores change:
FROM SELF AS curreads the dynamic table’s current contents. SELF can be both the MERGE target and a read source.- FULL OUTER JOIN merges the current state with incoming changes from
player_scores. - Players dropping below rank 3 are deleted; new top-3 players are inserted.
- Custom incremental dynamic tables can maintain bounded-size result sets that standard refresh modes can’t express, and these result sets can be reused as memoized state for the next round of incrementalization. In the example above, the previous top-K results are combined with current changes to derive fresh results, instead of having to scan all historical data.
Daily snapshot (no CHANGES clause)¶
Not every custom incremental dynamic table needs CHANGES(). You can re-read source data on each refresh and accumulate snapshots over time using ROW_TIMESTAMP.
Each refresh appends a full snapshot of src into the dynamic table. ROW_TIMESTAMP = TRUE automatically records when each row was inserted into the dynamic table, making it easy to query point-in-time state. Because there’s no CHANGES() clause, every refresh reads the full source. Keep the source small or use filtering to control cost.
Anti-patterns¶
These patterns cause failures or unexpected behavior in production.
Setting all tables to TARGET_ LAG = DOWNSTREAM¶
Why it fails: If no dynamic table in the pipeline has a time-based target lag, nothing triggers a refresh. The pipeline produces tables that don’t change and no error is raised.
What to do instead: At least one dynamic table (the terminal table) must have a time-based target lag. Intermediate tables use DOWNSTREAM.
Short target lag relative to retention time¶
Why it fails: If lag exceeds the retention window due to transient errors, the table goes stale and requires reinitialization.
What to do instead: Set target lag several times shorter than retention time to leave a buffer for recovery. This applies to both incremental and full refresh modes.
What’s next¶
- Quick-start best practices for dynamic tables
- Optimize queries for incremental refresh
- Set the target lag for a dynamic table
- Modify dynamic tables
- Understanding costs for dynamic tables
- For workloads that require imperative logic or don’t fit these patterns, see Decision guide for dynamic tables.