AI code conversion with source-system verification
AI code conversion with source-system verification improves the accuracy of the conversion process. It runs the generated test case against both the converted code in Snowflake and the original source code in the source database. It then checks that both the source code and the converted code produce equivalent results.
Never use your production account for AI code conversion with source-system verification. Use a testing account instead, as AI code conversion may introduce unpredictable modifications.
Compared to the default AI code conversion where SnowConvert only verifies that the converted code is running successfully on Snowflake, AI code conversion with source-system verification ensures functional parity, higher confidence, and overall better conversion quality.
Prerequisites
Select your source database platform to learn more about the prerequisites for running AI code conversion with source-system verification.
```{tab} SQL Server
AI code conversion with source-system verification currently requires an instance of SQL Server hosted in Snowpark Container Services (SPCS). This SQL Server instance acts as a host for the test cases that will be executed by the source-system verification process. The test results from this instance will be used as a baseline to compare against the test results generated by executing the test cases on the converted Snowflake SQL code.
The prerequisites are:
- An instance of SQL Server should be running in your SPCS environment. Download and run this [shell script](https://snowconvert.snowflake.com/storage/linux/prod/scripts/push_mssql_server.sh) to deploy an instance of SQL Server in the SPCS environment.
- To obtain your `dns_name` you can run on your account `SHOW SERVICES LIKE 'mssql_server_demo_service';`
- Understand and agree to the legal responsibilities of using AI code conversion with source-system verification on your source data platform code inside the SPCS environment.
- Create a custom specification file (for example, `spec.yaml`). This file contains the connection parameters for the SQL Server instance running inside SPCS.
Example:
```yaml
mode: "TWO_SIDED"
n_tests: 3
repair: true
num_workers: 2
source_test_database:
connection_params:
hostname: <dns_name> # Example: "mssql-server-demo-service.n4yw.svc.spcs.internal"
port: 1433
username: "user_name"
password: "password"
connection_metadata:
type: "SPCS"
spcs_service:
name: "MSSQL_SERVER_DEMO_SERVICE"
database: "SNOWCONVERT_AI"
schema: "PUBLIC"
project:
custom_instructions:
- "Preserve NULL handling semantics from source"
- "Use ANSI SQL where possible"
exclude: "test/.*\\.sql|backup/.*"
extra_target_prerequisites: CREATE SCHEMA IF NOT EXISTS test_schema;
verified objects:
- "[dbo].[AlreadyVerifiedProc]"
- "[dbo].[VerifiedView]"
use_custom_database: true
extra_file_dependencies:
salesdb/summary.sql:
- salesdb/customers.sql
- salesdb/orders.sql
additional_options:
--n-tests: 5
--project.custom-instructions:
- "Additional instruction via additional_options"
--project.extra-file-dependencies:
additional_salesdb/summary.sql:
- additional_salesdb/customers.sql
- additional_salesdb/orders.sql
--project.use-custom-database: false
--project.verified-objects:
- "[dbo].[additional_AlreadyVerifiedProc]"
- "[dbo].[additional_VerifiedView]"
```
- The `mode` "TWO_SIDED" indicates that the AI code conversion with source-system verification process will run on both source database code and target database code.
- The host name, port number, and credentials for the SQL Server database running in SPCS are specified under `source_test_database`.
- The name of the container service, database name, and schema name where the test cases for the source will be executed are specified under `connection_metadata`.
```
```{tab} Redshift
AI code conversion with source-system verification requires a Redshift instance running in your AWS environment. The instance should be accessible from the Snowpark Container Services (SPCS) container, which acts as a host for the test cases that will be executed by the source-system verification process. The test results from this instance will be used as a baseline to compare against the test results generated by executing the same test cases on the converted Snowflake SQL code.
The prerequisites are:
- Create a network rule and an External Access Integration (EAI) to allow incoming traffic on port 5439 (the default Redshift port) from the IP addresses used by your Snowflake environment and establish a connection with the Redshift instance.
Example:
```sql
-- Create a network rule allowing egress to Redshift
CREATE OR REPLACE NETWORK RULE SNOWCONVERT_AI.PUBLIC.AI_MIGRATIONS_REDSHIFT_NETWORK_RULE
MODE = EGRESS
TYPE = HOST_PORT
VALUE_LIST = ('your-testing-redshift-account.us-west-2.redshift-serverless.amazonaws.com:5439');
-- Create the External Access Integration
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION AI_MIGRATIONS_REDSHIFT_EAI
ALLOWED_NETWORK_RULES = (SNOWCONVERT_AI.PUBLIC.AI_MIGRATIONS_REDSHIFT_NETWORK_RULE)
ENABLED = TRUE;
```
- Create a custom specification file (for example, `spec.yaml`). This file contains the connection parameters for the Redshift instance to be used for source-system verification.
Example:
```yaml
mode: "TWO_SIDED"
n_tests: 3
repair: true
num_workers: 2
source_test_database:
connection_params:
hostname: <dns_name> # Example: your-testing-redshift-account.us-west-2.redshift-serverless.amazonaws.com
port: 5439
database: "database_name"
username: "user_name"
password: "password"
connection_metadata:
type: "EXTERNAL"
spcs_service:
eai: "AI_MIGRATIONS_REDSHIFT_EAI"
project:
custom_instructions:
- "Preserve NULL handling semantics from source"
- "Use ANSI SQL where possible"
exclude: "test/.*\\.sql|backup/.*"
extra_target_prerequisites: CREATE SCHEMA IF NOT EXISTS test_schema;
verified objects:
- "[dbo].[AlreadyVerifiedProc]"
- "[dbo].[VerifiedView]"
use_custom_database: true
extra_file_dependencies:
salesdb/summary.sql:
- salesdb/customers.sql
- salesdb/orders.sql
additional_options:
--n-tests: 5
--project.custom-instructions:
- "Additional instruction via additional_options"
--project.extra-file-dependencies:
additional_salesdb/summary.sql:
- additional_salesdb/customers.sql
- additional_salesdb/orders.sql
--project.use-custom-database: false
--project.verified-objects:
- "[dbo].[additional_AlreadyVerifiedProc]"
- "[dbo].[additional_VerifiedView]"
```
- The mode “TWO_SIDED” indicates that the AI code conversion with source-system verification process will run on both source database code and target database code.
- The host name, port number, and credentials for the Redshift database are specified under `source_test_database`.
- The name of the container service, database name, and schema name where the test cases for the source will be executed are specified under `connection_metadata`.
```
Configuration parameters
This section describes the components of the YAML file used to control the AI code conversion with source-system verification process.
Project options
The project section configures how the conversion and verification process handles your source objects. The following options are available:
Additional options
The additional_options section accepts arbitrary key-value pairs that are passed directly to the CLI as flags. This means new CLI flags can be used immediately through the spec file without requiring code changes.
For example, the following spec entries:
additional_options:
--some-new-flag: 42
--experimental.feature: true
Are passed through to the API as:
{
"--some-new-flag": 42,
"--experimental.feature": ""
}
Boolean values set to true are passed as flag-only options (empty string value). Non-boolean values are passed with their specified value.
YAML to API payload mapping
The project options and additional options from the YAML spec are converted into a flat dictionary of CLI flags in the API payload. The following example shows how the full spec maps:
{
"--project.custom-instructions": "'[\"Use MERGE instead of INSERT for upserts\",\"Preserve original column aliases\"]'",
"--project.exclude": "'\"staging_*\"'",
"--project.extra-target-prerequisites": "'\"CREATE SCHEMA IF NOT EXISTS analytics;\"'",
"--project.path-replacements": "'{\"//old/source/path\":\"//new/source/path\",\"//legacy/scripts\":\"//migrated/scripts\"}'",
"--project.verified-objects": "'[\"DBO.CUSTOMERS\",\"DBO.ORDERS\"]'",
"--project.use-custom-database": "",
"--project.extra-file-dependencies": "'{\"main_procedure.sql\":[\"helper_functions.sql\",\"common_types.sql\"],\"etl_load.sql\":[\"staging_tables.sql\"]}'",
"--n-tests": 5,
"--some-new-flag": 42,
"--experimental.feature": ""
}
Note the following mapping behaviors:
- Project options are prefixed with
--project. and use kebab-case (for example, use_custom_database becomes --project.use-custom-database).
- List values are serialized as JSON arrays.
- Dictionary values are serialized as JSON objects.
- Boolean flags (such as
use_custom_database: true) are passed with an empty string value, indicating a flag-only option.
- Top-level spec fields like
n_tests are mapped as --n-tests.
- Entries in
additional_options are passed through as-is.