This page documents an older version (1.7.0).

snowflake.ml.modeling.metrics.roc_curve

snowflake.ml.modeling.metrics.roc_curve(*, df: DataFrame, y_true_col_name: str, y_score_col_name: str, pos_label: Optional[Union[str, int]] = None, sample_weight_col_name: Optional[str] = None, drop_intermediate: bool = True) -> Tuple[ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]]]

Compute the receiver operating characteristic (ROC) curve.

Note: this implementation is restricted to the binary classification task.

Parameters:
  • df – snowpark.DataFrame Input dataframe.

  • y_true_col_name – string Column name representing true binary labels. If the labels are not in {-1, 1} or {0, 1}, pos_label must be given explicitly.

  • y_score_col_name – string Column name representing target scores. These can be probability estimates of the positive class, confidence values, or non-thresholded decision measures (as returned by “decision_function” on some classifiers).

  • pos_label – string or int, default=None The label of the positive class. When pos_label=None, if y_true is in {-1, 1} or {0, 1}, pos_label is set to 1; otherwise an error is raised.

  • sample_weight_col_name – string, default=None Column name representing sample weights.

  • drop_intermediate – boolean, default=True Whether to drop some suboptimal thresholds which would not appear on a plotted ROC curve. This is useful in order to create lighter ROC curves.
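
The pos_label rule described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's code: resolve_pos_label is a hypothetical helper name, and the exception type and message are assumptions.

```python
from typing import Iterable, Optional, Union

Label = Union[str, int]


def resolve_pos_label(labels: Iterable[Label], pos_label: Optional[Label] = None) -> Label:
    """Hypothetical helper that mirrors the documented pos_label rule."""
    # An explicitly supplied positive label always wins.
    if pos_label is not None:
        return pos_label

    distinct = set(labels)
    # Documented default: labels drawn from {-1, 1} or {0, 1} make 1 the positive class.
    if distinct <= {-1, 1} or distinct <= {0, 1}:
        return 1

    # Assumption: the real function raises some error here; ValueError is a guess.
    raise ValueError(
        f"Labels {sorted(map(str, distinct))} are not in {{-1, 1}} or {{0, 1}}; "
        "pass pos_label explicitly."
    )
```

For example, labels such as "spam" and "ham" would need pos_label="spam" (or "ham") to be passed explicitly, while 0/1 labels work with the default.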

Returns:

Tuple containing the following items:
fpr - ndarray of shape (>2,)

Increasing false positive rates such that element i is the false positive rate of predictions with score >= thresholds[i].

tpr - ndarray of shape (>2,)

Increasing true positive rates such that element i is the true positive rate of predictions with score >= thresholds[i].

thresholds - ndarray of shape (n_thresholds,)

Decreasing thresholds on the decision function used to compute fpr and tpr. thresholds[0] represents no instances being predicted and is arbitrarily set to max(y_score) + 1.
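
To make the relationship between fpr, tpr, and thresholds concrete, here is a minimal pure-Python sketch of the definitions above. It is not the Snowflake implementation: roc_points is a hypothetical name, it works on in-memory lists rather than a Snowpark DataFrame, it ignores sample weights, and it keeps every threshold (roughly what drop_intermediate=False describes). It also assumes both classes are present in y_true.

```python
from typing import List, Sequence, Tuple


def roc_points(
    y_true: Sequence[int],
    y_score: Sequence[float],
    pos_label: int = 1,
) -> Tuple[List[float], List[float], List[float]]:
    """Hypothetical reference computation of (fpr, tpr, thresholds)."""
    n_pos = sum(1 for y in y_true if y == pos_label)
    n_neg = len(y_true) - n_pos  # assumes n_pos > 0 and n_neg > 0

    # thresholds[0] = max(y_score) + 1 means "no instance is predicted positive";
    # the remaining thresholds are the distinct scores in decreasing order.
    thresholds = [max(y_score) + 1] + sorted(set(y_score), reverse=True)

    fpr: List[float] = []
    tpr: List[float] = []
    for t in thresholds:
        # Everything scoring at or above the threshold is predicted positive.
        predicted_pos = [y for y, s in zip(y_true, y_score) if s >= t]
        tp = sum(1 for y in predicted_pos if y == pos_label)
        fp = len(predicted_pos) - tp
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr, thresholds


fpr, tpr, thresholds = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
# fpr == [0.0, 0.0, 0.5, 0.5, 1.0]
# tpr == [0.0, 0.5, 0.5, 1.0, 1.0]
```

Each pair (fpr[i], tpr[i]) is one point on the ROC curve, starting at (0, 0) for the artificial first threshold and ending at (1, 1) once the threshold falls to the lowest score.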