You are viewing documentation about an older version (1.7.0).

snowflake.ml.modeling.metrics.precision_recall_curve

snowflake.ml.modeling.metrics.precision_recall_curve(*, df: DataFrame, y_true_col_name: str, probas_pred_col_name: str, pos_label: Optional[Union[str, int]] = None, sample_weight_col_name: Optional[str] = None) → Tuple[ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]]]

Compute precision-recall pairs for different probability thresholds.

Note: this implementation is restricted to the binary classification task.

The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.
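As a small illustration of the two ratios defined above, the following sketch computes precision and recall from raw counts. The helper name and the counts are made up for the example and are not part of the library.

```python
def precision_recall_from_counts(tp, fp, fn):
    """Hypothetical helper: apply the two ratio definitions to raw counts."""
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that were found
    return precision, recall


# Illustrative counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall = precision_recall_from_counts(tp=8, fp=2, fn=4)
print(precision, recall)
```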

The last precision and recall values are 1.0 and 0.0, respectively, and do not have a corresponding threshold. This ensures that the graph starts on the y axis.

The first precision and recall values are precision=class balance (the fraction of positive samples) and recall=1.0, which corresponds to a classifier that always predicts the positive class.
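The first point can be checked by hand. The sketch below uses made-up labels and a stand-in "classifier" that predicts the positive class for every sample; it only illustrates why precision equals the class balance and recall equals 1.0 in that case.

```python
# Illustrative labels: 2 positives out of 5 samples.
y_true = [1, 0, 0, 1, 0]
y_pred = [1] * len(y_true)  # always predict the positive class

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

first_precision = tp / (tp + fp)
first_recall = tp / (tp + fn)
class_balance = sum(y_true) / len(y_true)  # fraction of positive samples

print(first_precision, first_recall, class_balance)
```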

Parameters:
  • df – snowpark.DataFrame Input dataframe.

  • y_true_col_name – string Column name representing true binary labels. If the labels are neither {-1, 1} nor {0, 1}, pos_label must be given explicitly.

  • probas_pred_col_name – string Column name representing target scores. These can be either probability estimates of the positive class or a non-thresholded decision measure (as returned by decision_function on some classifiers).

  • pos_label – string or int, default=None The label of the positive class. When pos_label=None, if y_true is in {-1, 1} or {0, 1}, pos_label is set to 1, otherwise an error will be raised.

  • sample_weight_col_name – string, default=None Column name representing sample weights.
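A minimal usage sketch based on the signature above. It assumes an existing Snowpark session is passed in; the helper name, the toy data, and the column names Y_TRUE and SCORE are made up for illustration. The import is placed inside the function so the sketch can be loaded without a Snowflake connection.

```python
def example_precision_recall_curve(session):
    """Hypothetical helper; `session` is an existing snowflake.snowpark.Session."""
    # Lazy import: only needed when the function is actually called.
    from snowflake.ml.modeling.metrics import precision_recall_curve

    # Toy data: true binary labels and classifier scores (illustrative values).
    df = session.create_dataframe(
        [[0, 0.1], [0, 0.4], [1, 0.35], [1, 0.8]],
        schema=["Y_TRUE", "SCORE"],
    )

    # All arguments are keyword-only, as in the signature.
    precision, recall, thresholds = precision_recall_curve(
        df=df,
        y_true_col_name="Y_TRUE",
        probas_pred_col_name="SCORE",
        pos_label=1,
    )
    return precision, recall, thresholds
```

Calling `example_precision_recall_curve(session)` with a live session should return the three NumPy arrays described under Returns.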

Returns:

Tuple containing the following items:
  • precision – ndarray of shape (n_thresholds + 1,) Precision values such that element i is the precision of predictions with score >= thresholds[i]; the last element is 1.

  • recall – ndarray of shape (n_thresholds + 1,) Decreasing recall values such that element i is the recall of predictions with score >= thresholds[i]; the last element is 0.

  • thresholds – ndarray of shape (n_thresholds,) Increasing thresholds on the decision function used to compute precision and recall.
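To make the relationship between the three arrays concrete, here is a plain NumPy sketch that follows the description above: one precision/recall pair per distinct score used as a threshold, plus a final (1, 0) point with no threshold. It is an illustration only, not the library's implementation; the function name is hypothetical, sample weights are ignored, and at least one positive label is assumed.

```python
import numpy as np


def pr_curve_sketch(y_true, scores, pos_label=1):
    """Illustrative only: precision/recall at each distinct score threshold."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    is_pos = y_true == pos_label
    n_pos = np.sum(is_pos)  # assumed to be > 0

    thresholds = np.unique(scores)  # distinct scores, sorted increasing
    precision, recall = [], []
    for t in thresholds:
        predicted_pos = scores >= t  # never empty, since t is one of the scores
        tp = np.sum(predicted_pos & is_pos)
        fp = np.sum(predicted_pos & ~is_pos)
        precision.append(tp / (tp + fp))
        recall.append(tp / n_pos)

    # Final point without a corresponding threshold.
    precision.append(1.0)
    recall.append(0.0)
    return np.array(precision), np.array(recall), thresholds


precision, recall, thresholds = pr_curve_sketch(
    y_true=[0, 0, 1, 1], scores=[0.1, 0.4, 0.35, 0.8]
)
print(precision)
print(recall)
print(thresholds)
```

With these four samples, precision and recall each have five elements and thresholds has four, matching the shapes (n_thresholds + 1,) and (n_thresholds,).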