snowflake.core.cortex.inference_service.CompleteRequest

class snowflake.core.cortex.inference_service.CompleteRequest(*, model: Annotated[str, Strict(strict=True)], anthropic: CompleteRequestAnthropic | None = None, openai: CompleteRequestOpenai | None = None, messages: Annotated[List[CompleteRequestMessagesInner], MinLen(min_length=1)], temperature: Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0)])] | Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])] | None = None, top_p: Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0), Le(le=1.0)])] | Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0), Le(le=1)])] | None = 1.0, max_tokens: Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])] | None = 4096, max_output_tokens: Annotated[int, Strict(strict=True)] | None = None, response_format: CompleteRequestResponseFormat | None = None, guardrails: GuardrailsConfig | None = None, tools: List[Tool] | None = None, tool_choice: ToolChoice | None = None, provisioned_throughput_id: Annotated[str, Strict(strict=True)] | None = None, sf_ml_xp_inflight_prompt_action: Annotated[str, Strict(strict=True)] | None = None, sf_ml_xp_inflight_prompt_client_id: Annotated[str, Strict(strict=True)] | None = None, sf_ml_xp_inflight_prompt_public_key: Annotated[str, Strict(strict=True)] | None = None, stream: Annotated[bool, Strict(strict=True)] | None = True)

Bases: BaseModel

A model object representing the CompleteRequest resource.

Constructs an object of type CompleteRequest with the provided properties.

Parameters:

anthropic : CompleteRequestAnthropic, optional

openai : CompleteRequestOpenai, optional

temperature : float, optional

Temperature controls the amount of randomness used in response generation. A higher temperature corresponds to more randomness.

top_p : float, default 1.0

Threshold probability for nucleus sampling. A higher top-p value increases the diversity of tokens that the model considers, while a lower value results in more predictable output.

max_tokens : int, default 4096

The maximum number of output tokens to produce. The default value is model-dependent.

max_output_tokens : int, optional

Deprecated in favor of "max_tokens", which has identical behavior.

response_format : CompleteRequestResponseFormat, optional

guardrails : GuardrailsConfig, optional

tools : list[Tool], optional

List of tools to be used during tool calling.

tool_choice : ToolChoice, optional

provisioned_throughput_id : str, optional

The provisioned throughput ID to be used with the request.

sf_ml_xp_inflight_prompt_action : str, optional

Reserved

sf_ml_xp_inflight_prompt_client_id : str, optional

Reserved

sf_ml_xp_inflight_prompt_public_key : str, optional

Reserved

stream : bool, default True

Reserved

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.
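A minimal construction sketch follows. The model name and the role/content shape of the message entries are assumptions taken from the Cortex REST Complete API; the exact fields of CompleteRequestMessagesInner are not documented on this page, and pydantic is assumed to coerce the plain dict into that nested model during validation.

from snowflake.core.cortex.inference_service import CompleteRequest

# Required fields: model and at least one message entry.
request = CompleteRequest(
    model="mistral-large2",          # illustrative model name
    messages=[{                      # assumed message fields (role/content)
        "role": "user",
        "content": "Summarize the benefits of columnar storage.",
    }],
    temperature=0.2,                 # lower value -> less randomness
    top_p=0.9,                       # nucleus sampling threshold
    max_tokens=512,                  # cap on generated tokens
)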

Methods

classmethod from_dict(obj: dict) → CompleteRequest

Create an instance of CompleteRequest from a dict.
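For example (a sketch; the payload mirrors the keyword arguments above, with the same assumed role/content message shape):

payload = {
    "model": "mistral-large2",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 256,
}
request = CompleteRequest.from_dict(payload)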

classmethod from_json(json_str: str) → CompleteRequest

Create an instance of CompleteRequest from a JSON string.
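For example (a sketch using the same assumed payload shape):

json_str = '{"model": "mistral-large2", "messages": [{"role": "user", "content": "Hello"}]}'
request = CompleteRequest.from_json(json_str)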

to_dict(hide_readonly_properties: bool = False) → dict[str, Any]

Returns the dictionary representation of the model using aliases.

to_dict_without_readonly_properties() → dict[str, Any]

Return the dictionary representation of the model without readonly properties.

to_json() → str

Returns the JSON representation of the model using aliases.

to_str() → str

Returns the string representation of the model using aliases.
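Taken together, the serialization helpers complement the constructors above; a short sketch, reusing the request instance built earlier:

as_dict = request.to_dict()        # dict keyed by field aliases
as_json = request.to_json()        # JSON string, also alias-keyed
as_text = request.to_str()         # human-readable string form
trimmed = request.to_dict_without_readonly_properties()  # omits read-only fields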