Informatica PowerCenter - dbt mappings and transformations¶

This page shows how each Informatica PowerCenter Mapping transformation is converted into a dbt model, with a before (Informatica) and after (Snowflake) example for each. For the dbt project layout and the three-tier model architecture, see the dbt output overview. For the concept map and supported-component matrix, see the Informatica PowerCenter overview.

Each transformation becomes a SQL model: source reads become stg_ staging models, intermediate transformations become int_ models, and Targets become marts models. The examples below are taken from the test suite.

Sources and staging¶

Source Qualifier and Source Definition¶

A Source Qualifier (or Source Definition) becomes a staging model that reads from a table declared in sources.yml. One staging model per source is generated along with a sources.yml that lists the source tables.

Informatica (m_different_source_qualifiers.XML):

<SOURCE NAME="DimCurrency" DATABASETYPE="Microsoft SQL Server" OWNERNAME="dbo">
   <!-- CurrencyKey (primary key), CurrencyAlternateKey, CurrencyName source fields omitted -->
</SOURCE>
<TRANSFORMATION NAME="SQ_DimCurrency" TYPE="Source Qualifier">
   <!-- CurrencyKey, CurrencyAlternateKey, CurrencyName ports omitted -->
   <TABLEATTRIBUTE NAME="Sql Query" VALUE=""/>
   <TABLEATTRIBUTE NAME="Source Filter" VALUE=""/>
   <TABLEATTRIBUTE NAME="Select Distinct" VALUE="NO"/>
</TRANSFORMATION>

sources.yml:

version: 2
sources:
  - name: raw
    schema: YOUR_SCHEMA
    database: YOUR_DB
    tables:
      - name: DimCustomer
      - name: FactInternetSales

Snowflake (stg_raw__SQ_DimCurrency.sql):

SELECT
   DimCurrency.CurrencyKey AS CurrencyKey,
   DimCurrency.CurrencyAlternateKey AS CurrencyAlternateKey,
   DimCurrency.CurrencyName AS CurrencyName
FROM
   {{ source('raw', 'DimCurrency') }} AS DimCurrency

Important

Replace the YOUR_SCHEMA and YOUR_DB placeholders in sources.yml with your actual Snowflake schema and database before you run the project.

Row-level transformations¶

Expression¶

An Expression transformation becomes an int_ model whose SELECT list carries each output port’s formula. Informatica operators and functions convert to their Snowflake equivalents (for example, || stays string concatenation). For the full list, see the expression functions reference (coming soon).

Informatica (MappingForTest.XML):

<TRANSFORMATION NAME="EXPTRANS" TYPE="Expression">
   <TRANSFORMFIELD NAME="NAME" PORTTYPE="INPUT/OUTPUT" EXPRESSION="NAME"/>
   <TRANSFORMFIELD NAME="Description" PORTTYPE="INPUT/OUTPUT" EXPRESSION="Description" DEFAULTVALUE="&apos;&lt;No description&gt;&apos;"/>
   <TRANSFORMFIELD NAME="NEW_NAME" PORTTYPE="OUTPUT" EXPRESSION="NAME || &apos;!!!&apos;"/>
</TRANSFORMATION>

Snowflake (int_EXPTRANS.sql):

WITH source_data AS
(
   SELECT
      NAME1 AS NAME
   FROM
      {{ ref('int_SALARY_RTRTRANS__HIGH_SALARY') }}
)
SELECT
   NAME AS NAME,
   '<No description>' AS Description,
   NAME || '!!!' AS NEW_NAME
FROM
   source_data

Filter¶

A Filter transformation becomes an int_ model with a WHERE clause that carries the Filter condition.

Informatica (m_dummy_filter_test.XML):

<TRANSFORMATION NAME="t_dummy_filter_1" TYPE="Filter">
   <!-- EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, SALARY input/output ports omitted -->
   <TABLEATTRIBUTE NAME="Filter Condition" VALUE="SALARY &gt; 10 AND SALARY &lt;= 100000"/>
</TRANSFORMATION>

Snowflake (int_t_dummy_filter_1.sql):

SELECT
    SQ_employee_data.EMPLOYEE_ID AS EMPLOYEE_ID,
    null AS FIRST_NAME,
    SQ_employee_data.LAST_NAME AS LAST_NAME,
    SQ_employee_data.EMAIL AS EMAIL,
    SQ_employee_data.SALARY AS SALARY
FROM
    {{ ref('stg_raw__SQ_employee_data') }} AS SQ_employee_data
WHERE
    SALARY > 10
  AND SALARY <= 100000

Combining data¶

Joiner¶

A Joiner becomes an int_ model with a SQL JOIN. The Informatica join type maps to the matching SQL join: Normal Join to INNER JOIN, Master Outer to RIGHT JOIN, Detail Outer to LEFT JOIN, and Full Outer to FULL OUTER JOIN. The Join condition becomes the ON clause.

Informatica (m_dummy_joiner_test.XML):

<TRANSFORMATION NAME="t_join_DimProduct_FactInternetSales" TYPE="Joiner">
   <!-- master (DimProduct) and detail (FactInternetSales) ports omitted -->
   <TABLEATTRIBUTE NAME="Join Condition" VALUE="ProductKey1 = ProductKey"/>
   <TABLEATTRIBUTE NAME="Join Type" VALUE="Normal Join"/>
</TRANSFORMATION>

Snowflake (int_t_join_DimProduct_FactInternetSales.sql):

SELECT
   SQ_FactInternetSales.UnitPrice AS UnitPrice,
   SQ_FactInternetSales.OrderQuantity AS OrderQuantity,
   SQ_FactInternetSales.CurrencyKey AS CurrencyKey,
   SQ_FactInternetSales.ProductKey AS ProductKey1,
   SQ_DimProduct.ProductKey AS ProductKey,
   SQ_DimProduct.EnglishProductName AS EnglishProductName,
   SQ_DimProduct.SpanishProductName AS SpanishProductName
FROM
   {{ ref('stg_raw__SQ_FactInternetSales') }} AS SQ_FactInternetSales
   INNER JOIN
      {{ ref('stg_raw__SQ_DimProduct') }} AS SQ_DimProduct
      ON SQ_FactInternetSales.ProductKey = SQ_DimProduct.ProductKey

Union¶

A Union transformation becomes an int_ model that combines its inputs with UNION ALL. Each input is a CTE that aligns its columns to the Union’s output ports.

Informatica (union_simple_two_inputs.xml):

<TRANSFORMATION NAME="UNION" TYPE="Custom Transformation" TEMPLATENAME="Union Transformation">
   <GROUP NAME="OUTPUT" TYPE="OUTPUT"/>
   <GROUP NAME="A" TYPE="INPUT"/>
   <GROUP NAME="B" TYPE="INPUT"/>
   <!-- vala (output), vala1 (group A), vala2 (group B) ports omitted -->
   <FIELDDEPENDENCY INPUTFIELD="vala1" OUTPUTFIELD="vala"/>
   <FIELDDEPENDENCY INPUTFIELD="vala2" OUTPUTFIELD="vala"/>
</TRANSFORMATION>

Snowflake (int_union.sql):

WITH a AS
(
   SELECT
      vala vala
   FROM
      {{ ref('stg_raw__SQ_tableA') }}
),
b AS
(
   SELECT
      valb vala
   FROM
      {{ ref('stg_raw__SQ_tableB') }}
)
SELECT
   a.vala vala
FROM
   a
UNION ALL
SELECT
   b.vala vala
FROM
   b

Lookup¶

A Lookup becomes an int_ model with a LEFT JOIN to the lookup source. The lookup source is a CTE that keeps one row per lookup key with QUALIFY ROW_NUMBER(), matching Informatica’s single-row lookup behavior.

Informatica (lookups_sample.XML):

<TRANSFORMATION NAME="t_lookup_currency" TYPE="Lookup Procedure">
   <!-- CurrencyKey, CurrencyAlternateKey, CurrencyName (lookup/output), CurrencyKey1 (input) ports omitted -->
   <TABLEATTRIBUTE NAME="Lookup table name" VALUE="DimCurrency"/>
   <TABLEATTRIBUTE NAME="Lookup condition" VALUE="CurrencyKey = CurrencyKey1"/>
   <TABLEATTRIBUTE NAME="Lookup policy on multiple match" VALUE="Use Any Value"/>
</TRANSFORMATION>

Snowflake (int_t_lookup_currency.sql):

WITH lookup_reference AS
(
   SELECT
      CurrencyKey ,
      CurrencyAlternateKey ,
      CurrencyName
   FROM
      {{ source('raw', 'DimCurrency') }}
   QUALIFY
      ROW_NUMBER() OVER (
      PARTITION BY
         CurrencyKey
      ORDER BY
         (
            SELECT
               --** SSC-FDM-INF0070 - Use Any Value RETURNS AN ARBITRARY ROW WHEN MULTIPLE ROWS MATCH; IN SNOWFLAKE THE SELECTED ROW MAY VARY PER RUN. ADD AN ORDER BY TO THE LOOKUP SQL OVERRIDE ONLY IF A SPECIFIC ROW IS REQUIRED. **
               null
         )) = 1
),
input_data AS
(
   SELECT
      CurrencyKey CurrencyKey1,
      TotalProductCost SQ_FactInternetSales__TotalProductCost,
      OrderDate SQ_FactInternetSales__OrderDate,
      ProductKey SQ_FactInternetSales__ProductKey1,
      CustomerKey SQ_FactInternetSales__CustomerKey1
   FROM
      {{ ref('stg_raw__SQ_FactInternetSales') }}
)
SELECT
   input_data.CurrencyKey1 ,
   input_data.SQ_FactInternetSales__TotalProductCost ,
   input_data.SQ_FactInternetSales__OrderDate ,
   input_data.SQ_FactInternetSales__ProductKey1 ,
   input_data.SQ_FactInternetSales__CustomerKey1 ,
   lookup_reference.CurrencyKey,
   lookup_reference.CurrencyAlternateKey,
   lookup_reference.CurrencyName
FROM
   input_data
   LEFT JOIN
      lookup_reference
      ON lookup_reference.CurrencyKey = input_data.CurrencyKey1

Note

When a lookup can match multiple rows, Snowflake’s row selection is not guaranteed to be stable between runs, so the functional-difference marker SSC-FDM-INF0070 is emitted. Add an ORDER BY to the lookup if you need a specific row. An unconnected Lookup is converted the same way, with the lookup result returned to the calling expression.

Aggregation and ranking¶

Aggregator¶

An Aggregator becomes an int_ model that computes each aggregate with a window function partitioned by the group-by ports, then keeps one row per group with QUALIFY ROW_NUMBER() = 1. This reproduces Informatica’s one-row-per-group output.

Informatica (AggregatorMapping.XML):

<TRANSFORMATION NAME="AGGTRANS" TYPE="Aggregator">
   <TRANSFORMFIELD NAME="NAME" PORTTYPE="INPUT/OUTPUT" EXPRESSION="NAME"/>
   <TRANSFORMFIELD NAME="deptsalary" PORTTYPE="OUTPUT" EXPRESSION="SUM(SALARY)"/>
   <TRANSFORMFIELD NAME="SALARY" PORTTYPE="INPUT/OUTPUT" EXPRESSION="SALARY"/>
   <TRANSFORMFIELD NAME="DEPARTMENT" PORTTYPE="INPUT/OUTPUT" EXPRESSIONTYPE="GROUPBY" EXPRESSION="DEPARTMENT"/>
</TRANSFORMATION>

Snowflake (int_AGGTRANS.sql):

WITH source_data AS
(
   SELECT
      NAME,
      SALARY,
      DEPARTMENT
   FROM
      {{ ref('stg_raw__SQ_EMPLOYEE') }}
),
aggregation AS
(
   SELECT
      LAST_VALUE(NAME)
      OVER (
      PARTITION BY (
         DEPARTMENT)
      ORDER BY
         NAME,
         SALARY,
         DEPARTMENT) AS NAME,
      SUM(SALARY)
      OVER (
      PARTITION BY (
         DEPARTMENT)) AS deptsalary,
      LAST_VALUE(SALARY)
      OVER (
      PARTITION BY (
         DEPARTMENT)
      ORDER BY
         NAME,
         SALARY,
         DEPARTMENT) AS SALARY,
      LAST_VALUE(DEPARTMENT)
      OVER (
      PARTITION BY (
         DEPARTMENT)
      ORDER BY
         NAME,
         SALARY,
         DEPARTMENT) AS DEPARTMENT
   FROM
      source_data
   QUALIFY
      --** SSC-FDM-INF0002 - ALL INPUT COLUMNS WILL BE USED TO DETERMINE GROUP ORDER. **
      ROW_NUMBER()
      OVER (
      PARTITION BY (
         DEPARTMENT)
      ORDER BY
         NAME,
         SALARY,
         DEPARTMENT) = 1
)
SELECT
    NAME,
    deptsalary,
    SALARY,
    DEPARTMENT
FROM
    aggregation

Rank¶

A Rank transformation becomes an int_ model that assigns ROW_NUMBER() over the rank port and keeps the top or bottom N rows with QUALIFY. The example below selects the bottom 10 rows by price.

Informatica (rank_bottom_n.xml):

<TRANSFORMATION NAME="RNK_BottomProducts" TYPE="Rank">
   <!-- product_id (input/output), RANKINDEX (output) ports omitted -->
   <TRANSFORMFIELD NAME="price" PORTTYPE="INPUT/OUTPUT" EXPRESSIONTYPE="RANKPORT"/>
   <TABLEATTRIBUTE NAME="Top/Bottom" VALUE="BOTTOM"/>
   <TABLEATTRIBUTE NAME="Number of Ranks" VALUE="10"/>
</TRANSFORMATION>

Snowflake (int_rank.sql):

WITH source_data AS
(
   SELECT
      product_id,
      price
   FROM
      {{ ref('stg_raw__SQ_Products') }}
),
ranked_data AS
(
   SELECT
      product_id,
      price,
      ROW_NUMBER()
      OVER (
      ORDER BY
         price ASC) AS RANKINDEX
   FROM
      source_data
   QUALIFY
      RANKINDEX <= 10
)
SELECT
   RANKINDEX,
   product_id,
   price
FROM
   ranked_data

Sorting¶

Sorter¶

A Sorter becomes an ORDER BY on the model that consumes its output, using the Sorter’s sort keys and their direction. When the Sorter has the distinct option enabled, duplicate rows are also removed. Because dbt models are unordered relations, the ordering is applied where the data is consumed rather than stored.

Routing and normalizing¶

Router¶

A Router produces one int_ model per output group, each named int_{router_name}__{group_name} and filtered by that group’s condition. The default group keeps the rows that match no other group.

Informatica (routerexample.XML):

<TRANSFORMATION NAME="RTRTRANS" TYPE="Router">
   <GROUP NAME="INPUT" TYPE="INPUT"/>
   <GROUP NAME="high_credits" TYPE="OUTPUT" EXPRESSION="Credits &gt; 70"/>
   <GROUP NAME="low_credits" TYPE="OUTPUT" EXPRESSION="Credits &gt; 5"/>
   <GROUP NAME="DEFAULT1" TYPE="OUTPUT/DEFAULT"/>
   <!-- FirstName, LastName, Department, Credits ports omitted -->
</TRANSFORMATION>

Snowflake (int_RTRTRANS__high_credits.sql):

WITH source_data AS
(
   SELECT
      FirstName,
      LastName,
      Department,
      Credits
   FROM
      {{ ref('stg_raw__SQ_Students') }}
)
SELECT
   FirstName FirstName1,
   LastName LastName1,
   Department Department1,
   Credits Credits1
FROM
   source_data
WHERE
   Credits > 70

Normalizer¶

A Normalizer becomes an int_ model that pivots repeating groups from columns into rows. Each occurrence becomes a CTE, the occurrences are combined with UNION ALL, and a generated-column ID (gcid_) and generated key (gk_) reproduce Informatica’s occurrence numbering.

Informatica (normalizer_simple_wide_to_tall.xml):

<TRANSFORMATION NAME="NRM_SimpleExpense" TYPE="Normalizer">
   <!-- EmpId, EmpName source fields omitted -->
   <SOURCEFIELD NAME="Expense" OCCURS="5"/>
   <!-- Expense_in1..in5 (inputs mapped from food/rent/transport/medical/misc_expense),
        Expense (output), GK_Expense, GCID_Expense ports omitted -->
</TRANSFORMATION>

Snowflake (int_normalizer.sql):

WITH source_data AS
(
   SELECT
      empid,
      empname,
      food_expense,
      rent_expense,
      transport_expense,
      medical_expense,
      misc_expense
   FROM
      {{ ref('stg_raw__SQ_NRM_SIMPLE_SRC') }}
),
occurrence_1 AS
(
   SELECT
      empid,
      empname,
      food_expense AS expense,
      1 AS gcid_expense
   FROM
      source_data
),
occurrence_2 AS
(
   SELECT
      empid,
      empname,
      rent_expense AS expense,
      2 AS gcid_expense
   FROM
      source_data
),
normalized AS
(
   SELECT * FROM occurrence_1
   UNION ALL
   SELECT * FROM occurrence_2
),
with_gk AS
(
   SELECT
      empid,
      empname,
      expense,
      gcid_expense,
      ROW_NUMBER()
      OVER (
      ORDER BY
         empid,
         gcid_expense) AS gk_expense
   FROM
      normalized
)
SELECT
   *
FROM
   with_gk

Note

The example is trimmed to two occurrences for brevity. One occurrence_n CTE per repeating column is generated. VSAM Normalizers are not supported and generate an EWI.

Keys and load strategy¶

Sequence Generator¶

A Sequence Generator becomes a ROW_NUMBER() expression, offset by the Sequence’s start value. Because a dbt model is reprocessed on each run and rows are not inherently ordered, the generated numbers are not stable across runs, so SSC-EWI-INF0043 is emitted.

Informatica (seqgen_start_value_offset.xml):

<TRANSFORMATION NAME="SEQ_RowKey" TYPE="Sequence">
   <!-- NEXTVAL, CURRVAL output ports omitted -->
   <TABLEATTRIBUTE NAME="Start Value" VALUE="10"/>
   <TABLEATTRIBUTE NAME="Increment By" VALUE="1"/>
   <TABLEATTRIBUTE NAME="Cycle" VALUE="NO"/>
</TRANSFORMATION>

Snowflake (int_seqgen.sql):

!!!RESOLVE EWI!!! /*** SSC-EWI-INF0043 - THE SEQUENCE GENERATOR IS TRANSLATED TO ROW_NUMBER() OVER (ORDER BY 1). ROW ORDERING IS NON-DETERMINISTIC (VALUES MAY BE ASSIGNED TO DIFFERENT ROWS BETWEEN RUNS) AND SEQUENCE STATE IS NOT PERSISTED ACROSS DBT EXECUTIONS (ALWAYS RESTARTS FROM START VALUE). ADD A DETERMINISTIC ORDER BY IF STABLE SEQUENCE ASSIGNMENT IS REQUIRED. ***/!!!
WITH source_data AS
(
   SELECT
      *
   FROM
      {{ ref('stg_raw__SQ_Products') }}
),
sequence_data AS
(
   SELECT
      *,
      ROW_NUMBER()
      OVER (
      ORDER BY
         1) + 9 AS NEXTVAL
   FROM
      source_data
)
SELECT
   *
FROM
   sequence_data

Note

Add a deterministic ORDER BY to the ROW_NUMBER() window if you need stable surrogate keys, or generate keys with a Snowflake sequence for state that persists across runs.

Update Strategy¶

An Update Strategy directs how rows reach the Target. When a Mapping uses one, the Target mart model is materialized as incremental with the merge strategy, so inserts and updates are applied with a Snowflake MERGE.

Informatica (update_strategy_multiple_targets_same_mapping.xml):

<TRANSFORMATION NAME="UPDTRANS" TYPE="Update Strategy">
   <!-- input/output ports omitted -->
   <TABLEATTRIBUTE NAME="Update Strategy Expression" VALUE="DD_UPDATE"/>
</TRANSFORMATION>
<TARGET NAME="After_Students_Clear" DATABASETYPE="Microsoft SQL Server">
   <TARGETFIELD NAME="Id" KEYTYPE="PRIMARY KEY" DATATYPE="int"/>
   <TARGETFIELD NAME="Name" KEYTYPE="NOT A KEY" DATATYPE="nvarchar"/>
   <!-- remaining target fields omitted -->
</TARGET>

Snowflake (mart model):

{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='Id',
    merge_update_columns=['Name'],
    alias='After_Students_Clear'
) }}
SELECT
   SALARY AS Id,
   NAME AS Name
FROM
   {{ ref('int_UPDTRANS') }}

Reuse and external logic¶

Mapplet¶

A Mapplet is a reusable group of transformations. A Mapplet is converted into a dbt macro (macros/{mapplet_name}.sql) and called from each Mapping that uses the Mapplet, so the shared logic is defined once.

Stored Procedure¶

A Stored Procedure transformation calls an external procedure. The call is preserved so you can point it at the migrated Snowflake stored procedure. A connected Stored Procedure participates in the data flow; a disconnected (pre- or post-session) Stored Procedure becomes a dbt hook. When the transformation references a named database connection, an EWI is emitted so you can confirm the target.

Targets¶

Target Definition¶

A Target Definition becomes a mart model named after the target table. It reads from the last intermediate model in the Mapping and aliases the columns to the target’s column names.

Informatica (m_dummy_source_target_test.XML):

<TARGET NAME="TargetTable" DATABASETYPE="Microsoft SQL Server">
   <TARGETFIELD NAME="TotalPrice" DATATYPE="decimal" PRECISION="19" SCALE="4"/>
   <TARGETFIELD NAME="gender" DATATYPE="char" PRECISION="1"/>
   <TARGETFIELD NAME="fullName" DATATYPE="varchar" PRECISION="101"/>
   <TARGETFIELD NAME="description" DATATYPE="varchar" PRECISION="500"/>
</TARGET>

Snowflake (TargetTable.sql):

WITH source_data AS
(
   SELECT
       t_dummy_filter_1__TotalPrice,
      GENDER,
       t_dummy_filter_1__description
   FROM
      {{ ref('int_t_dummy_gender') }}
)
SELECT
    sd.t_dummy_filter_1__TotalPrice AS TotalPrice,
    sd.GENDER AS gender,
    sd.t_dummy_filter_1__description AS description
FROM
    source_data AS sd

When the Mapping includes an Update Strategy, the mart is materialized incrementally with a merge, as shown in Update Strategy.