Agentic ML with Cortex Code

Cortex Code powers agentic ML by autonomously planning, executing, and iterating on machine learning workflows, with native awareness of your data, models, notebooks, and features on a unified platform. You can ask it to complete a single task or pursue a broader goal that requires multi-step reasoning and action. From a natural language prompt, Cortex Code can explore data, engineer features, train and evaluate models, debug issues, and prepare models for deployment, all within Snowflake’s governed environment.

Rather than requiring you to manually orchestrate every step, Cortex Code can determine what needs to happen next, write and run code, evaluate results, recover from errors, and continue iterating. You define the objective, approve key decisions, and step in when you want to steer. Cortex Code handles the execution.

Cortex Code is available in the Cortex Code CLI, Cortex Code Desktop, and Cortex Code in Snowsight, which is directly integrated with Snowflake Notebooks. Data scientists can use Cortex Code to automate the full ML lifecycle, including data exploration, feature engineering, model training, evaluation, deployment, monitoring, and debugging.

Using the machine-learning skill

Cortex Code comes embedded with a rich set of ML-specific capabilities that streamline design, implementation, and optimization of end-to-end workflows in Snowflake ML. The ML skill activates automatically based on your intent. When you describe an ML task in natural language, Cortex Code detects the intent, loads the ML skill, and routes to the appropriate subskill to handle your request. To ensure that the skill is always triggered, you can also explicitly invoke it by typing /machine-learning before your prompt.

/machine-learning Train a model on the table CUSTOMER_FEATURES to predict churn

Subskills

The ML skill is composed of specialized subskills. Cortex Code selects and routes between them autonomously based on your goal. For example, if you train a model and then say “deploy it”, Cortex Code switches from the training subskill to the deployment subskill automatically, preserving context across the full workflow. A single conversation can span many subskills without you needing to invoke each one explicitly.

The following table lists the available subskills.

| Subskill | Description | Example prompt |
| --- | --- | --- |
| Model Training | Train classification, regression, forecasting, and clustering models with scikit-learn, XGBoost, LightGBM, PyTorch, or AutoGluon | Train an XGBoost model to predict churn on CUSTOMER_FEATURES |
| Feature Store | Create entities, feature views, pipelines, training datasets with point-in-time correctness, and online serving | Create a feature store with 7-day rolling spend windows |
| Model Registry | Log, version, and deploy trained models for inference | Register this model in the registry and deploy it |
| Batch Inference | Score tables at scale using mv.run() on warehouses or mv.run_batch() on compute pools | Run batch predictions on my test data using CHURN_MODEL V2 |
| Online Inference | Deploy low-latency REST endpoints on Snowpark Container Services | Deploy my model as a real-time inference service |
| Distributed Training | Multi-node XGBoost, LightGBM, and PyTorch training with hyperparameter tuning | Train XGBoost distributed across 4 GPU nodes with 20 HPO trials |
| ML Jobs | Submit Python scripts to Snowflake compute pools for GPU or high-memory workloads | Submit this training script to run on a GPU compute pool |
| Pipeline Orchestration | Schedule and chain ML tasks using Snowflake Task Graphs (DAGs) | Create a weekly pipeline: refresh features, retrain, deploy if better |
| Experiment Tracking | Log parameters, metrics, and artifacts across training runs for comparison | Track this experiment and log accuracy, F1, and training time |
| Preprocessing | Scale, encode, impute, and transform features before training | Normalize numeric features and one-hot encode categoricals |
| Model Monitoring | Track drift, performance degradation, and set alerts on production models | Set up drift monitoring for my fraud detection model |
| Datasets | Create versioned, immutable data snapshots for reproducible training | Create a versioned dataset from my training query results |
| ML Lineage | Trace the flow from source tables to features, datasets, models, and services | What data trained this model? Show the full lineage. |
| Inference Logs | Query auto-captured inference data from model services | Show me the last 100 inference requests for FRAUD_MODEL |
| Debug Inference | Diagnose dtype errors, OOM issues, container crashes, and service failures | My model service is returning errors. Help me debug. |

Prepare data

Use Cortex Code to explore, clean, and transform raw data into ML-ready inputs. This is useful when you need to understand your data before modeling or prepare features for training.

Transform and preprocess features

Use Cortex Code to scale, encode, impute, and engineer features to prepare data for training.

Normalize numeric features and one-hot encode categoricals
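
The transformations this prompt asks for can be sketched in plain Python. This is an illustrative sketch, not the code Cortex Code generates: the sample columns `monthly_spend` and `plan` are hypothetical, and generated code would more likely rely on a preprocessing library than on hand-written helpers.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score scale a numeric column; a constant column maps to zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """One-hot encode a categorical column using a stable, sorted category order."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


# Hypothetical sample columns standing in for a feature table.
monthly_spend = [10.0, 20.0, 30.0]
plan = ["basic", "pro", "basic"]

scaled_spend = standardize(monthly_spend)
plan_categories, plan_encoded = one_hot(plan)
```

After scaling, the numeric column has zero mean and unit variance, and each categorical value becomes one indicator column per category.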

Save as datasets

Use Cortex Code to create versioned, immutable data snapshots for reproducible training with Snowflake Datasets.

Create a versioned dataset from my training query results as Parquet files

Manage and serve features

Use Cortex Code to set up your Feature Store. Define entities, build feature views, generate training datasets with point-in-time correctness, and enable online serving. This is useful when you need production-ready features that multiple models and teams can share or serve in real time.

Create and manage feature views

Create a feature store with 7-day rolling spend windows for each customer
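
To make the feature concrete, a 7-day rolling spend window can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions: the transaction tuples and the `rolling_spend` helper are hypothetical, and a real feature view would express the same logic as a query over your tables.

```python
from datetime import date, timedelta


def rolling_spend(transactions, as_of, window_days=7):
    """Sum each customer's spend over the window_days ending on as_of (inclusive)."""
    window_start = as_of - timedelta(days=window_days - 1)
    totals = {}
    for customer_id, day, amount in transactions:
        if window_start <= day <= as_of:
            totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals


# Hypothetical transactions: (customer_id, transaction_date, amount).
transactions = [
    ("C1", date(2024, 1, 1), 50.0),  # falls outside the window ending Jan 10
    ("C1", date(2024, 1, 5), 20.0),
    ("C1", date(2024, 1, 10), 5.0),
    ("C2", date(2024, 1, 9), 12.5),
]

spend_7d = rolling_spend(transactions, as_of=date(2024, 1, 10))
```

The window here covers January 4 through January 10, so the January 1 transaction is excluded from the total for `C1`.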

Generate training datasets

Generate a training dataset with point-in-time correctness using my Feature Views
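
Point-in-time correctness means each training row only sees feature values that were known at or before the label's timestamp, which avoids leaking future information. The idea can be sketched as an as-of lookup; the data layout and the `point_in_time_join` helper are assumptions for illustration, not the Feature Store API.

```python
from bisect import bisect_right


def point_in_time_join(labels, feature_history):
    """Attach to each label the latest feature value known at or before its timestamp."""
    rows = []
    for entity, label_ts, label in labels:
        history = feature_history.get(entity, [])  # list of (ts, value), sorted by ts
        timestamps = [ts for ts, _ in history]
        idx = bisect_right(timestamps, label_ts)
        value = history[idx - 1][1] if idx > 0 else None
        rows.append((entity, label_ts, value, label))
    return rows


# Hypothetical data; timestamps are day numbers to keep the example short.
feature_history = {"C1": [(1, 100.0), (5, 140.0), (9, 90.0)]}
labels = [("C1", 6, 1), ("C1", 0, 0), ("C2", 3, 0)]

training_rows = point_in_time_join(labels, feature_history)
```

The label at day 6 receives the value recorded on day 5, not the later value from day 9. Labels with no earlier feature value get `None`.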

Serve features online

Enable online serving on CUSTOMER_SPEND_FEATURES for real-time inference

Train and tune models

Use Cortex Code to train classification, regression, forecasting, and clustering models using scikit-learn, XGBoost, LightGBM, PyTorch, or AutoGluon. This is useful when you want to build a predictive model from your data, run distributed training at scale, or let the Auto ML skill handle the full workflow autonomously.

For more information, see Train models and Distributed training.

Train a model

Use Cortex Code to train models on your data with the framework and configuration you specify.

Train an XGBoost model to predict churn on CUSTOMER_FEATURES

Distribute training

Use Cortex Code to distribute training across multiple nodes for large datasets or parallel hyperparameter search.

Train XGBoost distributed across 4 GPU nodes with 20 HPO trials
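
The shape of a parallel hyperparameter search can be sketched with a random search over 20 trials spread across 4 workers. This is a toy illustration only: local threads stand in for compute nodes, `toy_validation_loss` is a hypothetical stand-in for training and scoring a model, and the search strategy Cortex Code actually uses is not specified here.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_validation_loss(params):
    """Stand-in objective; a real trial would train a model and return a validation metric."""
    return (params["learning_rate"] - 0.1) ** 2 + 0.01 * (params["max_depth"] - 6) ** 2


def sample_params(rng):
    """Draw one candidate configuration from a hypothetical search space."""
    return {
        "learning_rate": rng.uniform(0.01, 0.3),
        "max_depth": rng.randint(3, 10),
    }


rng = random.Random(0)  # seeded so the sampled trials are reproducible
trials = [sample_params(rng) for _ in range(20)]

# Four workers evaluate trials concurrently; map() returns results in trial order.
with ThreadPoolExecutor(max_workers=4) as pool:
    losses = list(pool.map(toy_validation_loss, trials))

best_loss, best_params = min(zip(losses, trials), key=lambda pair: pair[0])
```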

Submit training jobs to compute pools

Use Cortex Code to submit custom training scripts to Snowflake compute pools for GPU or high-memory workloads.

Submit this training script to run on a GPU compute pool

For more information, see Container Runtime for ML.

Experiment tracking

Use Cortex Code to log parameters, metrics, and artifacts for every training run so you can compare trials systematically.

Track this experiment and log accuracy, F1, and training time
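
The metrics named in this prompt can be computed and recorded as in the following sketch. The in-memory `runs` list is a hypothetical stand-in for an experiment tracking backend, and the labels and predictions are made-up sample values.

```python
import time


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1 score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


runs = []  # hypothetical in-memory run log

start = time.perf_counter()
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # pretend these came from a freshly trained model
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
training_seconds = time.perf_counter() - start

runs.append({
    "params": {"model": "xgboost", "max_depth": 6},
    "metrics": {"accuracy": accuracy, "f1": f1, "training_seconds": training_seconds},
})
```

Recording parameters and metrics together for every run is what makes later comparison across trials possible.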

For more information, see Experiment Tracking.

Manage and deploy models

Use Cortex Code to version, register, and deploy trained models for batch or real-time inference. This is useful when you have a trained model and need to move it to production.

For more information, see Model Registry and Inference overview.

Register and version models

Use Cortex Code to log models to the Snowflake Model Registry with metadata, lineage, and deployment-ready packaging.

Register the best model in the registry as CHURN_MODEL V2

For more information, see Model management.

Run batch predictions

Use Cortex Code to score entire tables or datasets at scale using warehouses or compute pools.

Run batch predictions on my test data using CHURN_MODEL V2

For more information, see Batch inference.

Deploy real-time inference endpoints

Use Cortex Code to deploy models as low-latency REST endpoints on Snowpark Container Services.

Deploy my fraud model as a real-time inference service on SPCS

For more information, see Real-time inference.

Operationalize ML workflows

Use Cortex Code to automate recurring ML workflows and manage the full pipeline lifecycle. This is useful when you need to schedule retraining, chain tasks, or trace how models were produced.

Build and schedule pipelines

Use Cortex Code to build Snowflake Task Graphs (DAGs) that chain feature refresh, training, evaluation, and deployment.

Create a weekly pipeline: refresh features, retrain, deploy if better
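
The dependency structure behind this prompt can be sketched as a small DAG with a promotion gate. This is a plain-Python illustration using the standard library, not the Snowflake Task Graph API; the task names and the `should_deploy` rule are assumptions.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
pipeline = {
    "refresh_features": set(),
    "retrain": {"refresh_features"},
    "evaluate": {"retrain"},
    "deploy_if_better": {"evaluate"},
}


def should_deploy(candidate_score, production_score, min_gain=0.0):
    """Promote the candidate only when it beats production by more than min_gain."""
    return candidate_score - production_score > min_gain


# A valid execution order that respects every dependency.
run_order = list(TopologicalSorter(pipeline).static_order())
```

Setting `min_gain` above zero is one way to avoid redeploying for improvements that are within noise.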

For more information, see Create pipelines.

Trace data and model lineage

Use Cortex Code to trace the full provenance graph from source tables through features, datasets, models, and deployed services.

What data trained CHURN_MODEL V2? Show the full lineage.
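
Answering a lineage question amounts to walking a provenance graph upstream from the model. The following sketch shows that traversal on a hypothetical graph; apart from CHURN_MODEL V2 and CUSTOMER_SPEND_FEATURES, the object names are invented for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each object maps to the objects it was built from.
lineage = {
    "CHURN_SERVICE": ["CHURN_MODEL V2"],
    "CHURN_MODEL V2": ["CHURN_TRAINING_DATASET"],
    "CHURN_TRAINING_DATASET": ["CUSTOMER_SPEND_FEATURES"],
    "CUSTOMER_SPEND_FEATURES": ["TRANSACTIONS", "CUSTOMERS"],
}


def upstream(graph, node):
    """Return every ancestor of node, nearest first, using breadth-first traversal."""
    seen, order, queue = {node}, [], deque([node])
    while queue:
        current = queue.popleft()
        for parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


model_sources = upstream(lineage, "CHURN_MODEL V2")
```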

For more information, see ML Lineage.

Monitor and observe models

Use Cortex Code to track production model health, analyze inference patterns, and debug failures. This is useful when you have models in production and need to ensure they continue performing well.

For more information, see Model Observability.

Monitor drift and performance

Use Cortex Code to track drift and performance degradation, and to set up alerts before issues affect users.

Set up drift monitoring for my fraud detection model
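
One common way to quantify drift is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below is an illustration of that metric, not necessarily the one Snowflake model monitoring computes; the sample values and bin edges are hypothetical.

```python
import math


def psi(expected, actual, edges):
    """Population Stability Index between two samples over shared bin edges."""

    def bin_fractions(values):
        counts = [0] * (len(edges) - 1)
        for v in values:
            for i in range(len(counts)):
                is_last = i == len(counts) - 1
                # Bins are half-open, except the last one, which includes its upper edge.
                if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                    counts[i] += 1
                    break
        # Floor each fraction so empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


edges = [0.0, 0.5, 1.0]
training_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
serving_scores = [0.1, 0.2, 0.6, 0.7, 0.8, 0.9, 0.6, 0.7]  # shifted toward high values

drift = psi(training_scores, serving_scores, edges)
```

A widely used rule of thumb treats a PSI above roughly 0.2 as a shift worth investigating, which is the kind of threshold an alert could be attached to.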

Analyze inference logs

Use Cortex Code to query auto-captured request and response data from model services and analyze usage patterns.

Show me the last 100 inference requests for FRAUD_MODEL

For more information, see Inference logs.

Debug inference failures

Use Cortex Code to diagnose dtype mismatches, OOM issues, container crashes, and service failures.

My model service is returning TypeErrors. Help me debug.
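
A typical first debugging step for dtype errors is to compare a failing request against the input types the model expects. The following is a minimal sketch of that check; the schema, column names, and `find_dtype_mismatches` helper are hypothetical.

```python
def find_dtype_mismatches(expected_schema, row):
    """Compare one inference input row against the types the model expects."""
    problems = []
    for column, expected_type in expected_schema.items():
        if column not in row:
            problems.append(f"{column}: missing")
        elif not isinstance(row[column], expected_type):
            actual = type(row[column]).__name__
            problems.append(f"{column}: expected {expected_type.__name__}, got {actual}")
    return problems


# Hypothetical model signature and a request that triggers a type error.
expected_schema = {"AGE": int, "MONTHLY_SPEND": float, "PLAN": str}
bad_row = {"AGE": "42", "MONTHLY_SPEND": 19.99}

problems = find_dtype_mismatches(expected_schema, bad_row)
```

Here the check reports that `AGE` arrived as a string and that `PLAN` is missing, which narrows the failure to the request payload rather than the service itself.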

Auto ML: End-to-end model exploration and optimization

The Auto ML skill builds a complete machine learning model from scratch. It explores the model space across multiple frameworks (XGBoost, LightGBM, AutoGluon, and others), runs hyperparameter optimization for each candidate, and engineers features to maximize performance. The full workflow runs autonomously: data exploration, quality gates, feature engineering, model training, hyperparameter tuning, and evaluation. You describe the machine learning task you want to perform, and Cortex Code handles the rest.

Build a model predicting whether a customer will churn in the next 30 days using the CUSTOMERS table with the highest accuracy

Forecast daily order volume for the next 2 weeks from ORDERS table

Supported task types:

  • Classification — predict a category (churn yes/no, fraud detection, lead scoring)
  • Regression — predict a number (revenue forecast, price estimation, demand prediction)
  • Time-series forecasting — predict future values over time (sales forecast, capacity planning)
  • Clustering — segment data into groups (customer segmentation, anomaly grouping)

All you need to provide is:

  • Data source — a table, view, or DataFrame
  • What to predict or analyze — for example, “predict churn”, “forecast sales”, “segment customers”
  • Time budget — how long you want to invest in building the model (default is 30 minutes)

Cortex Code infers everything else (task type, target column, evaluation metric, train/test split strategy) and asks you to confirm before proceeding.

How it works:

The skill follows a 4-step workflow with built-in checkpoints:

  1. Understand and configure — Cortex Code explores your data, proposes a configuration (task type, metric, time budget, trial plan), and waits for your approval.
  2. Quality gates — Mandatory data quality checks (leakage detection, baseline scoring, EDA) before any modeling begins.
  3. Build — Cortex Code works autonomously on feature engineering, model training, and evaluation. It reports after feature engineering and after each training trial with metrics, leaderboard, and overfitting checks. You can interject at any time.
  4. Deliver — Cortex Code updates the experiment manifest and presents next steps (deeper research, training notebook, batch inference, Feature Store, monitoring, or retraining).
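
Two of the quality gates in step 2, baseline scoring and leakage detection, can be illustrated with simple checks. This is a rough sketch under stated assumptions, not the checks the skill actually runs: the column names are hypothetical, and the leakage heuristic is deliberately crude (it would also flag any unique identifier column).

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def looks_leaky(feature_values, labels):
    """Flag a feature when every distinct value maps to exactly one label."""
    mapping = {}
    for value, label in zip(feature_values, labels):
        mapping.setdefault(value, set()).add(label)
    return all(len(seen) == 1 for seen in mapping.values())


# Hypothetical churn labels and two candidate features.
churned = [0, 0, 0, 1, 0, 1, 0, 0]
account_closed_flag = [0, 0, 0, 1, 0, 1, 0, 0]  # recorded after churn, so it leaks the target
tenure_bucket = ["a", "b", "a", "a", "b", "b", "a", "b"]

baseline = majority_baseline_accuracy(churned)
leaky_columns = [
    name
    for name, values in [("account_closed_flag", account_closed_flag), ("tenure_bucket", tenure_bucket)]
    if looks_leaky(values, churned)
]
```

A trained model is only worth keeping if it clearly beats the baseline, and a feature that predicts the target perfectly is usually a sign of leakage rather than signal.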

Cortex Code asks how long you want to invest (default 30 minutes). It proposes a trial plan, tracks elapsed time, and adjusts if the budget runs low.
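
Budget tracking of this kind can be sketched as a loop that stops before starting a trial it is unlikely to finish. The stopping rule, the `run_trials` helper, and the simulated clock are assumptions made for illustration; the actual scheduling logic is not described here.

```python
import time


def run_trials(trial_fns, budget_seconds, clock=time.monotonic):
    """Run trials in order, stopping before one that would likely exceed the budget."""
    start = clock()
    results, longest = [], 0.0
    for fn in trial_fns:
        elapsed = clock() - start
        if elapsed + longest > budget_seconds:
            break  # not enough time left for another trial as long as the longest so far
        trial_start = clock()
        results.append(fn())
        longest = max(longest, clock() - trial_start)
    return results


# Simulated clock so the example is deterministic: each trial "takes" 10 minutes.
now = [0.0]


def fake_clock():
    return now[0]


def make_trial(name, minutes=10):
    def trial():
        now[0] += minutes * 60
        return name
    return trial


completed = run_trials(
    [make_trial(n) for n in ("xgboost", "lightgbm", "autogluon", "extra")],
    budget_seconds=30 * 60,
    clock=fake_clock,
)
```

With a 30-minute budget and 10-minute trials, three trials complete and the fourth is skipped.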

What you get:

| Artifact | Location |
| --- | --- |
| Working notebook | Your Workspace notebook (or a local .ipynb in CLI). Contains EDA and all experimentation cells with the full decision trail. |
| Experiment tracking | Snowflake Experiment Tracking: queryable metrics, parameters, and artifacts for every trial. |
| Manifest | <experiment_name>_manifest.json in the notebook’s directory. Lists all Snowflake objects created. |
| Diagnostic plots | Logged as experiment artifacts: feature importance, SHAP, confusion matrix, forecast components, and others. |