Configuration File Examples¶

This document provides ready-to-use configuration examples for various validation scenarios. Copy and adapt these examples for your specific use case.

Table of Contents¶


SQL Server Examples¶

Example 1: Minimal SQL Server Configuration¶

Perfect for quick testing or simple validations.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: ./validation_output

source_connection:
  mode: credentials
  host: localhost
  port: 1433
  username: sa
  password: YourPassword123
  database: TestDB

target_connection:
  mode: default

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: false

tables:
  - fully_qualified_name: TestDB.dbo.Customers
    use_column_selection_as_exclude_list: false
    column_selection_list: []
Copy

Example 2: Production SQL Server with SSL/TLS¶

Secure production setup with proper encryption settings.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: /data/validation/production
max_threads: 16

source_connection:
  mode: credentials
  host: sqlserver-prod.company.com
  port: 1433
  username: validation_user
  password: SecurePassword123!
  database: PRODUCTION_DB
  trust_server_certificate: "no"
  encrypt: "yes"

target_connection:
  mode: name
  name: snowflake_production

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 250

comparison_configuration:
  tolerance: 0.01

logging_configuration:
  level: INFO
  console_level: WARNING
  file_level: DEBUG

database_mappings:
  PRODUCTION_DB: PROD_SNOWFLAKE

schema_mappings:
  dbo: PUBLIC

tables:
  - fully_qualified_name: PRODUCTION_DB.dbo.Orders
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - order_id
    chunk_number: 20

  - fully_qualified_name: PRODUCTION_DB.dbo.Customers
    use_column_selection_as_exclude_list: true
    column_selection_list:
      - ssn
      - credit_card_number
    index_column_list:
      - customer_id

  - fully_qualified_name: PRODUCTION_DB.dbo.Products
    use_column_selection_as_exclude_list: false
    column_selection_list:
      - product_id
      - product_name
      - price
      - category
    where_clause: "is_active = 1"
    target_where_clause: "is_active = 1"
Copy

Example 3: SQL Server Incremental Validation¶

Validate only recent changes using date filters.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: ./incremental_validation
max_threads: auto

source_connection:
  mode: credentials
  host: sqlserver.company.com
  port: 1433
  username: etl_user
  password: EtlPassword123
  database: DataWarehouse

target_connection:
  mode: name
  name: snowflake_dw

validation_configuration:
  schema_validation: false
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 100

comparison_configuration:
  tolerance: 0.001

tables:
  - fully_qualified_name: DataWarehouse.dbo.FactSales
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - transaction_id
    where_clause: "transaction_date >= DATEADD(day, -7, GETDATE())"
    target_where_clause: "transaction_date >= DATEADD(day, -7, CURRENT_TIMESTAMP)"
    chunk_number: 10

  - fully_qualified_name: DataWarehouse.dbo.DimCustomer
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    where_clause: "modified_date >= DATEADD(day, -7, GETDATE())"
    target_where_clause: "modified_date >= DATEADD(day, -7, CURRENT_TIMESTAMP)"
Copy

Example 4: SQL Server with Column Mappings¶

Handle renamed columns during migration.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: ./validation_with_mappings
max_threads: 8

source_connection:
  mode: credentials
  host: legacy-sql.company.com
  port: 1433
  username: migration_user
  password: MigrationPass123
  database: LegacyDB

target_connection:
  mode: name
  name: snowflake_modernized

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true

tables:
  - fully_qualified_name: LegacyDB.dbo.CustomerMaster
    use_column_selection_as_exclude_list: false
    column_selection_list:
      - cust_id
      - cust_name
      - cust_email
      - cust_phone
      - addr_line1
      - addr_line2
      - addr_city
      - addr_state
      - addr_zip
    index_column_list:
      - cust_id
    column_mappings:
      cust_id: customer_id
      cust_name: customer_name
      cust_email: email_address
      cust_phone: phone_number
      addr_line1: address_line_1
      addr_line2: address_line_2
      addr_city: city
      addr_state: state
      addr_zip: postal_code
Copy

Teradata Examples¶

Example 5: Basic Teradata Configuration¶

Simple Teradata to Snowflake validation.

source_platform: Teradata
target_platform: Snowflake
output_directory_path: ./teradata_validation
target_database: SNOWFLAKE_DB
max_threads: auto

source_connection:
  mode: credentials
  host: teradata.company.com
  username: td_user
  password: TeradataPass123
  database: PROD_DB

target_connection:
  mode: default

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: false

tables:
  - fully_qualified_name: PROD_DB.sales_data
    use_column_selection_as_exclude_list: false
    column_selection_list: []
Copy

Example 6: Teradata Large-Scale Migration¶

Enterprise-scale Teradata migration validation.

source_platform: Teradata
target_platform: Snowflake
output_directory_path: /opt/validation/teradata_migration
target_database: ENTERPRISE_DW
max_threads: 32

source_connection:
  mode: credentials
  host: teradata-prod.company.com
  username: validation_service
  password: SecureTdPassword!123
  database: ENTERPRISE_TD

target_connection:
  mode: name
  name: snowflake_enterprise

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 500
  exclude_metrics: false

comparison_configuration:
  tolerance: 0.005

logging_configuration:
  level: INFO
  console_level: ERROR
  file_level: DEBUG

schema_mappings:
  ENTERPRISE_TD: PUBLIC

tables:
  # Large fact table - high chunking
  - fully_qualified_name: ENTERPRISE_TD.fact_transactions
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - transaction_key
    chunk_number: 100
    max_failed_rows_number: 1000

  # Dimension table with exclusions
  - fully_qualified_name: ENTERPRISE_TD.dim_customer
    use_column_selection_as_exclude_list: true
    column_selection_list:
      - ssn
      - tax_id
      - bank_account
    index_column_list:
      - customer_key
    chunk_number: 20

  # Filtered validation for current year only
  - fully_qualified_name: ENTERPRISE_TD.fact_sales
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - sale_key
    where_clause: "sale_date >= DATE '2024-01-01'"
    target_where_clause: "sale_date >= DATE '2024-01-01'"
    chunk_number: 50
Copy

Example 7: Teradata Multi-Schema Validation¶

Validate multiple schemas with different settings.

source_platform: Teradata
target_platform: Snowflake
output_directory_path: ./multi_schema_validation
target_database: MULTI_SCHEMA_DW
max_threads: 16

source_connection:
  mode: credentials
  host: teradata.company.com
  username: schema_validator
  password: ValidatorPass123
  database: DBC

target_connection:
  mode: name
  name: snowflake_multi_schema

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 200

comparison_configuration:
  tolerance: 0.01

schema_mappings:
  SALES_SCHEMA: SALES
  FINANCE_SCHEMA: FINANCE
  HR_SCHEMA: HUMAN_RESOURCES

tables:
  # Sales schema tables
  - fully_qualified_name: SALES_SCHEMA.orders
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - order_id

  - fully_qualified_name: SALES_SCHEMA.order_details
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - order_id
      - line_number

  # Finance schema tables
  - fully_qualified_name: FINANCE_SCHEMA.invoices
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - invoice_id
    chunk_number: 30

  # HR schema tables - exclude sensitive data
  - fully_qualified_name: HR_SCHEMA.employees
    use_column_selection_as_exclude_list: true
    column_selection_list:
      - ssn
      - salary
      - bank_account
      - emergency_contact
    index_column_list:
      - employee_id
Copy

Redshift Examples¶

Example 8: Basic Redshift Configuration¶

Simple Redshift to Snowflake validation.

source_platform: Redshift
target_platform: Snowflake
output_directory_path: ./redshift_validation
max_threads: auto

source_connection:
  mode: credentials
  host: redshift-cluster.us-east-1.redshift.amazonaws.com
  port: 5439
  username: redshift_user
  password: RedshiftPass123
  database: analytics

target_connection:
  mode: default

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: false

tables:
  - fully_qualified_name: public.events
    use_column_selection_as_exclude_list: false
    column_selection_list: []
Copy

Example 9: Redshift Data Lake Migration¶

Validate Redshift data lake migration to Snowflake.

source_platform: Redshift
target_platform: Snowflake
output_directory_path: /data/validation/redshift_datalake
max_threads: 24

source_connection:
  mode: credentials
  host: datalake-cluster.us-west-2.redshift.amazonaws.com
  port: 5439
  username: datalake_validator
  password: SecureRedshiftPass!123
  database: datalake

target_connection:
  mode: name
  name: snowflake_datalake

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 500

comparison_configuration:
  tolerance: 0.02

logging_configuration:
  level: INFO
  console_level: WARNING
  file_level: DEBUG

database_mappings:
  datalake: DATALAKE_PROD

schema_mappings:
  public: PUBLIC
  staging: STAGING
  analytics: ANALYTICS

tables:
  # Raw data staging
  - fully_qualified_name: staging.raw_events
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - event_id
    chunk_number: 80
    max_failed_rows_number: 1000

  # Analytics tables
  - fully_qualified_name: analytics.user_sessions
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - session_id
    where_clause: "session_date >= CURRENT_DATE - 30"
    target_where_clause: "session_date >= CURRENT_DATE - 30"
    chunk_number: 40

  - fully_qualified_name: analytics.aggregated_metrics
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - metric_id
      - date_key
    chunk_number: 20

  # Public schema - exclude system columns
  - fully_qualified_name: public.customer_360
    use_column_selection_as_exclude_list: true
    column_selection_list:
      - _sys_created_at
      - _sys_modified_at
      - _sys_user_id
    index_column_list:
      - customer_id
    chunk_number: 50
Copy

Example 10: Redshift with Complex Filtering¶

Advanced filtering and column selection for Redshift.

source_platform: Redshift
target_platform: Snowflake
output_directory_path: ./complex_validation
max_threads: 16

source_connection:
  mode: credentials
  host: analytics-cluster.region.redshift.amazonaws.com
  port: 5439
  username: validator
  password: ComplexPass123!
  database: analytics_db

target_connection:
  mode: name
  name: snowflake_analytics

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 100

comparison_configuration:
  tolerance: 0.01

tables:
  # Complex WHERE clause with multiple conditions
  - fully_qualified_name: public.transactions
    use_column_selection_as_exclude_list: false
    column_selection_list:
      - transaction_id
      - customer_id
      - amount
      - transaction_date
      - status
      - payment_method
    index_column_list:
      - transaction_id
    where_clause: "status IN ('completed', 'settled') AND amount > 100 AND transaction_date >= '2024-01-01' AND payment_method != 'test'"
    target_where_clause: "status IN ('completed', 'settled') AND amount > 100 AND transaction_date >= '2024-01-01' AND payment_method != 'test'"
    chunk_number: 30

  # Date-based partitioned validation
  - fully_qualified_name: public.daily_metrics
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - metric_date
      - metric_id
    where_clause: "metric_date >= DATE_TRUNC('month', CURRENT_DATE)"
    target_where_clause: "metric_date >= DATE_TRUNC('month', CURRENT_DATE)"
    chunk_number: 10

  # Selective column validation with mappings
  - fully_qualified_name: public.legacy_customers
    use_column_selection_as_exclude_list: false
    column_selection_list:
      - cust_no
      - full_name
      - email_addr
      - phone_num
      - signup_dt
    index_column_list:
      - cust_no
    column_mappings:
      cust_no: customer_number
      full_name: customer_name
      email_addr: email
      phone_num: phone
      signup_dt: signup_date
Copy

Scenario-Based Examples¶

Example 11: Development Environment - Fast Validation¶

Quick validation for development with minimal overhead.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: ./dev_validation
max_threads: 4

source_connection:
  mode: credentials
  host: localhost
  port: 1433
  username: dev_user
  password: DevPass123
  database: DevDB
  trust_server_certificate: "yes"
  encrypt: "no"

target_connection:
  mode: default

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: false  # Skip for speed

comparison_configuration:
  tolerance: 0.05  # More lenient

logging_configuration:
  level: WARNING  # Less verbose

tables:
  - fully_qualified_name: DevDB.dbo.TestTable1
    use_column_selection_as_exclude_list: false
    column_selection_list: []

  - fully_qualified_name: DevDB.dbo.TestTable2
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    where_clause: "id <= 1000"  # Limit rows for speed
    target_where_clause: "id <= 1000"
Copy

Example 12: Staging Environment - Comprehensive Testing¶

Thorough validation for staging environment.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: /staging/validation
max_threads: 12

source_connection:
  mode: credentials
  host: sqlserver-staging.company.com
  port: 1433
  username: staging_validator
  password: StagingPass123!
  database: STAGING_DB
  trust_server_certificate: "no"
  encrypt: "yes"

target_connection:
  mode: name
  name: snowflake_staging

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 200

comparison_configuration:
  tolerance: 0.01

logging_configuration:
  level: INFO
  console_level: INFO
  file_level: DEBUG

tables:
  - fully_qualified_name: STAGING_DB.dbo.Orders
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - order_id
    chunk_number: 15

  - fully_qualified_name: STAGING_DB.dbo.Customers
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - customer_id
    chunk_number: 10
Copy

Example 13: Production - Maximum Performance¶

Optimized for large-scale production validation.

source_platform: Teradata
target_platform: Snowflake
output_directory_path: /prod/validation
target_database: PROD_SNOWFLAKE
max_threads: 32  # Maximum parallelization

source_connection:
  mode: credentials
  host: teradata-prod.company.com
  username: prod_validator
  password: SecureProdPass!123
  database: PROD_TD

target_connection:
  mode: name
  name: snowflake_prod

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 1000
  exclude_metrics: false

comparison_configuration:
  tolerance: 0.001  # Strict tolerance

logging_configuration:
  level: INFO
  console_level: ERROR  # Minimal console output
  file_level: DEBUG  # Detailed file logging

tables:
  # Massive fact table - heavy chunking
  - fully_qualified_name: PROD_TD.fact_transactions
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - transaction_key
    chunk_number: 200  # Maximum chunking
    max_failed_rows_number: 5000

  # Other tables...
  - fully_qualified_name: PROD_TD.fact_sales
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - sale_key
    chunk_number: 150
Copy

Example 14: PII-Compliant Validation¶

Exclude sensitive personally identifiable information.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: ./pii_compliant_validation
max_threads: auto

source_connection:
  mode: credentials
  host: sqlserver.company.com
  port: 1433
  username: compliance_validator
  password: CompliancePass123!
  database: CustomerDB

target_connection:
  mode: name
  name: snowflake_customer

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 100

comparison_configuration:
  tolerance: 0.01

tables:
  - fully_qualified_name: CustomerDB.dbo.Customers
    use_column_selection_as_exclude_list: true
    column_selection_list:
      # Exclude all PII columns
      - ssn
      - tax_id
      - date_of_birth
      - drivers_license
      - passport_number
      - credit_card_number
      - bank_account_number
      - email_address
      - phone_number
      - home_address
      - mailing_address
    index_column_list:
      - customer_id

  - fully_qualified_name: CustomerDB.dbo.Transactions
    use_column_selection_as_exclude_list: true
    column_selection_list:
      - credit_card_last4
      - account_number
      - routing_number
    index_column_list:
      - transaction_id
Copy

Example 15: Migration Cutover Validation¶

Final validation before production cutover.

source_platform: Redshift
target_platform: Snowflake
output_directory_path: /cutover/validation
max_threads: 32

source_connection:
  mode: credentials
  host: redshift-prod.amazonaws.com
  port: 5439
  username: cutover_validator
  password: CutoverPass123!
  database: production

target_connection:
  mode: name
  name: snowflake_production_new

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 0  # Zero tolerance for cutover

comparison_configuration:
  tolerance: 0.0001  # Extremely strict

logging_configuration:
  level: DEBUG  # Maximum detail
  console_level: INFO
  file_level: DEBUG

# Validate ALL tables
tables:
  - fully_qualified_name: public.customers
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - customer_id
    chunk_number: 50

  - fully_qualified_name: public.orders
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - order_id
    chunk_number: 100

  - fully_qualified_name: public.order_items
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - order_id
      - item_id
    chunk_number: 150

  - fully_qualified_name: public.products
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - product_id
    chunk_number: 20

  - fully_qualified_name: public.inventory
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - inventory_id
    chunk_number: 30
Copy

Example 16: Continuous Validation - Daily Incremental¶

Daily validation of incremental loads.

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: /daily/validation
max_threads: 16

source_connection:
  mode: credentials
  host: sqlserver.company.com
  port: 1433
  username: daily_validator
  password: DailyPass123!
  database: ETL_DB

target_connection:
  mode: name
  name: snowflake_daily

validation_configuration:
  schema_validation: false  # Skip schema check for daily runs
  metrics_validation: true
  row_validation: true
  max_failed_rows_number: 100

comparison_configuration:
  tolerance: 0.01

logging_configuration:
  level: INFO
  console_level: WARNING
  file_level: INFO

tables:
  # Validate only yesterday's data
  - fully_qualified_name: ETL_DB.dbo.DailyTransactions
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - transaction_id
    where_clause: "CAST(created_date AS DATE) = CAST(DATEADD(day, -1, GETDATE()) AS DATE)"
    target_where_clause: "CAST(created_date AS DATE) = CAST(DATEADD(day, -1, CURRENT_TIMESTAMP) AS DATE)"
    chunk_number: 10

  - fully_qualified_name: ETL_DB.dbo.DailyOrders
    use_column_selection_as_exclude_list: false
    column_selection_list: []
    index_column_list:
      - order_id
    where_clause: "CAST(order_date AS DATE) = CAST(DATEADD(day, -1, GETDATE()) AS DATE)"
    target_where_clause: "CAST(order_date AS DATE) = CAST(DATEADD(day, -1, CURRENT_TIMESTAMP) AS DATE)"
    chunk_number: 5
Copy

Tips for Adapting These Examples¶

  1. Replace connection details with your actual database credentials

  2. Update table names to match your schema

  3. Adjust max_threads based on your system resources

  4. Modify chunk_number based on table sizes

  5. Set appropriate tolerance based on your data characteristics

  6. Customize where_clause for your filtering needs

  7. Add/remove columns in column_selection_list as needed

  8. Update column_mappings if column names differ

Security Best Practices¶

  • Never commit configuration files with real passwords to version control

  • Use environment variables for sensitive data

  • Consider secret management tools (AWS Secrets Manager, Azure Key Vault, etc.)

  • Use least privilege database accounts for validation

  • Encrypt configuration files containing sensitive information