Snowpark Submit Python API¶
In addition to the Snowpark Submit CLI, you can submit and manage Spark workloads programmatically from Python scripts using the Snowpark Submit Python API. The API uses your existing Snowpark session for authentication, making it a natural fit for Python-based pipelines and notebooks that already work with Snowflake.
Like the CLI, the Python API supports Spark applications written in Python, Java, and Scala.
Use the Python API instead of the CLI when you want to:
- Embed job submission directly in a Python script or notebook.
- Capture job status or logs as structured data and act on them programmatically.
- Build a custom orchestration loop around submission, status polling, and cancellation.
For CLI-based job submission, see Using Snowpark Submit.
Prerequisites¶
- Python 3.10 or later (earlier than 3.13)
- snowpark-submit installed: pip install snowpark-submit
- A Snowflake connection in connections.toml: A valid Snowflake connection with a warehouse and a compute pool specified. For more information, see Manage Snowflake connections.
Quick start¶
The following example submits a PySpark script and waits for it to finish:
Examples¶
client.submit()¶
Submits a workload to Snowflake and optionally waits for it to complete.
Fire-and-forget: Submit without blocking, then retrieve the workload name for later status checks.
Blocking with logs: Wait for the job to finish and print logs in real time. Raise an exception if the job fails.
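The fire-and-forget pattern can be sketched as a small helper that submits a workload and returns the name to use for later status checks. This is a minimal sketch, not the library's own code. It assumes that `client.submit(config)` accepts the config as its first argument, does not block by default, and returns an object with the `workload_name` and `error` fields listed in the StatusInfo table. `FakeClient` and the workload name are hypothetical stand-ins so the example runs without a Snowflake connection.

```python
from types import SimpleNamespace


def submit_and_get_name(client, config):
    """Submit a workload and return its full name for later status checks.

    Assumption: client.submit(config) returns a StatusInfo-like object
    exposing `workload_name` and `error`.
    """
    info = client.submit(config)
    if info.error:
        raise RuntimeError(f"Submission failed: {info.error}")
    if not info.workload_name:
        raise RuntimeError("Submission returned no workload name")
    return info.workload_name


class FakeClient:
    """Hypothetical stand-in for SnowparkSubmit so the sketch runs offline."""

    def submit(self, config):
        # The name below is made up; the real suffix format may differ.
        return SimpleNamespace(workload_name="demo_job_20250101_000000", error=None)


name = submit_and_get_name(FakeClient(), config={"file": "app.py"})
print(name)
```

With a real client, store the returned name; client.status() and client.kill() both expect it in full.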
client.status()¶
Returns the current status and logs for a workload that is running or has already completed. Use the full workload name
(with the timestamp suffix) that client.submit() returns.
Note
Logs can take from a few seconds to about a minute to become available. If no event table is configured to store log data, logs are retained only for a short period (for example, five minutes or less).
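Status polling can be built on top of client.status(). The following is a minimal sketch under stated assumptions: `client.status` is assumed to take the full workload name as its first argument and to return an object with the `terminated` and `workload_status` fields from the StatusInfo table. The helper name, its parameters, and `FakeClient` are hypothetical and exist only for illustration.

```python
import time
from types import SimpleNamespace


def wait_until_terminated(client, workload_name, poll_seconds=10.0,
                          timeout_seconds=3600.0, sleep=time.sleep):
    """Poll client.status() until the workload reports terminated=True."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        info = client.status(workload_name)
        if info.terminated:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{workload_name} did not finish in time")
        sleep(poll_seconds)


class FakeClient:
    """Hypothetical stand-in that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.calls = 0

    def status(self, workload_name):
        self.calls += 1
        state = next(self._statuses)
        return SimpleNamespace(
            workload_name=workload_name,
            workload_status=state,
            terminated=state in ("DONE", "FAILED"),
        )


fake = FakeClient(["PENDING", "RUNNING", "DONE"])
final = wait_until_terminated(fake, "demo_job_20250101_000000",
                              poll_seconds=0, sleep=lambda seconds: None)
print(final.workload_status, fake.calls)
```

Because logs can lag behind the status, a poll interval of several seconds or more is usually a reasonable choice when you also read the logs field.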
client.kill()¶
Terminates a running workload. Pass the full workload name (with the timestamp suffix).
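One way to use client.kill() in an orchestration script is to make sure a submitted workload is terminated when the script itself fails. The sketch below wraps that in a context manager. It assumes only that `client.kill` accepts the full workload name as its first argument; `kill_on_error` and `FakeClient` are hypothetical names introduced for this example.

```python
from contextlib import contextmanager


@contextmanager
def kill_on_error(client, workload_name):
    """Kill the workload if the body of the `with` block raises."""
    try:
        yield
    except BaseException:
        # BaseException also covers KeyboardInterrupt (Ctrl+C).
        client.kill(workload_name)
        raise


class FakeClient:
    """Hypothetical stand-in that records which workloads were killed."""

    def __init__(self):
        self.killed = []

    def kill(self, workload_name):
        self.killed.append(workload_name)


fake = FakeClient()
try:
    with kill_on_error(fake, "demo_job_20250101_000000"):
        raise RuntimeError("orchestrator aborted")
except RuntimeError:
    pass
print(fake.killed)
```

If the body finishes normally, the workload is left running, which keeps the helper compatible with fire-and-forget submission.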
client.list_workloads()¶
Lists workloads in a compute pool, optionally filtered by name prefix. Output is printed to the console.
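Because the listing is printed rather than returned as structured data, a script that needs the text can capture standard output around the call. The sketch below uses `contextlib.redirect_stdout` from the standard library. It only works if the library prints through Python's `sys.stdout`, which is an assumption here. `capture_printed_lines` and `fake_list_workloads` are hypothetical names; with a real client you would pass `client.list_workloads` and its arguments instead.

```python
import io
from contextlib import redirect_stdout


def capture_printed_lines(fn, *args, **kwargs):
    """Call fn and return (its result, the lines it printed to stdout)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = fn(*args, **kwargs)
    return result, buffer.getvalue().splitlines()


def fake_list_workloads():
    """Hypothetical stand-in that prints two made-up workload names."""
    print("demo_job_a")
    print("demo_job_b")


_, lines = capture_printed_lines(fake_list_workloads)
print(lines)
```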
Submitting Java and Scala applications¶
The Python API can submit Java and Scala JAR files, not just Python scripts. Set file to the path of a fat (uber) JAR
and main_class to the fully qualified class name.
For instructions on building fat JARs for Java and Scala, see the Java and Scala tabs in Using Snowpark Submit.
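Before submitting a JAR, a quick local check of the two values can catch common mistakes, such as passing a source file path or a slash-separated class name. This is an illustrative pre-flight check and not part of the API. The function name and the rules it applies (a `.jar` suffix and a dot-separated identifier sequence) are assumptions chosen for this sketch.

```python
import re
from pathlib import Path

# Dot-separated identifiers, for example com.example.Main.
_CLASS_NAME = re.compile(r"[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*")


def check_jar_submission(file, main_class):
    """Return a list of problems found in the file and main_class values."""
    problems = []
    if Path(file).suffix.lower() != ".jar":
        problems.append(f"file should point to a .jar, got {file!r}")
    if not _CLASS_NAME.fullmatch(main_class or ""):
        problems.append(f"main_class is not a dotted class name: {main_class!r}")
    return problems


print(check_jar_submission("target/app-assembly.jar", "com.example.Main"))
print(check_jar_submission("app.py", "com/example/Main"))
```

An empty list means both values look plausible; it does not confirm that the class actually exists inside the JAR.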
API Reference¶
SnowparkSubmit¶
The main client class. Accepts a Snowpark Session and exposes methods for submitting and managing workloads.
WorkloadConfig¶
A dataclass that describes the Spark workload to run. Pass a WorkloadConfig instance to client.submit().
StatusInfo¶
The return type for all API methods. Fields are None for operations that don’t produce that value.
| Field | Type | Description |
|---|---|---|
| exit_code | int | 0 = success; non-zero = failure |
| terminated | bool \| None | Whether the workload has finished |
| workload_name | str \| None | Full workload name, including the appended timestamp |
| service_status | str \| None | SPCS service status |
| workload_status | str \| None | PENDING, RUNNING, DONE, or FAILED |
| created_on | str \| None | Creation timestamp (UTC) |
| started_at | str \| None | Start timestamp (UTC) |
| terminated_at | str \| None | Termination timestamp (UTC) |
| job_exit_code | int \| None | Exit code of the Spark job itself |
| logs | list[str] | Application log lines |
| error | str \| None | Error message, if any |
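To act on these fields programmatically, a caller can combine them into a single pass/fail decision. The sketch below defines a local stand-in dataclass that mirrors the table so the example is self-contained. `StatusInfoSketch` and `job_failed` are hypothetical names, and the rule that any non-zero exit code, a FAILED status, or a non-empty error counts as a failure is an assumption about how the fields relate.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StatusInfoSketch:
    """Local stand-in mirroring the StatusInfo fields in the table above."""
    exit_code: int = 0
    terminated: Optional[bool] = None
    workload_name: Optional[str] = None
    service_status: Optional[str] = None
    workload_status: Optional[str] = None
    created_on: Optional[str] = None
    started_at: Optional[str] = None
    terminated_at: Optional[str] = None
    job_exit_code: Optional[int] = None
    logs: list[str] = field(default_factory=list)
    error: Optional[str] = None


def job_failed(info):
    """Return True if any field signals a failure (assumed interpretation)."""
    return (
        info.exit_code != 0
        or info.workload_status == "FAILED"
        or info.job_exit_code not in (None, 0)
        or bool(info.error)
    )


running = StatusInfoSketch(workload_status="RUNNING", terminated=False)
failed = StatusInfoSketch(workload_status="FAILED", terminated=True, job_exit_code=1)
print(job_failed(running), job_failed(failed))
```

The same function works on any object that exposes these attribute names, so it can be applied to the value a real API call returns if the fields match the table.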