You are viewing documentation about an older version (1.7.0).

snowflake.ml.modeling.metrics.confusion_matrix

snowflake.ml.modeling.metrics.confusion_matrix(*, df: DataFrame, y_true_col_name: str, y_pred_col_name: str, labels: Optional[Union[_SupportsArray[dtype], _NestedSequence[_SupportsArray[dtype]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]] = None, sample_weight_col_name: Optional[str] = None, normalize: Optional[str] = None) → Union[ndarray[Any, dtype[int64]], ndarray[Any, dtype[float64]]]

Compute a confusion matrix to evaluate the accuracy of a classification.

By definition a confusion matrix C is such that C_{i, j} is equal to the number of observations known to be in group i and predicted to be in group j.

Thus in binary classification, the count of true negatives is C_{0,0}, false negatives is C_{1,0}, true positives is C_{1,1} and false positives is C_{0,1}.
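The definition above can be illustrated with a minimal pure-Python sketch. It does not call the Snowflake API; `confusion_counts` is a hypothetical helper written only for this illustration, and skipping pairs whose label is not in `labels` is an assumption modelled on the "select a subset of labels" behaviour described for the `labels` parameter.

```python
def confusion_counts(y_true, y_pred, labels=None):
    """Return C as nested lists, where C[i][j] counts the observations
    whose true label is labels[i] and whose predicted label is labels[j]."""
    if labels is None:
        # Default: every label seen in either column, in sorted order.
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        # Assumption: pairs involving a label outside `labels` are skipped.
        if actual in index and predicted in index:
            matrix[index[actual]][index[predicted]] += 1
    return matrix


y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

C = confusion_counts(y_true, y_pred)
tn, fp = C[0]  # row 0: observations whose true label is 0
fn, tp = C[1]  # row 1: observations whose true label is 1
print(C)  # [[2, 1], [1, 2]]
```

Reading the binary case row by row gives the four counts in the order true negatives, false positives, false negatives, true positives, which matches the index positions listed above.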

Parameters:
  • df – snowpark.DataFrame Input dataframe.

  • y_true_col_name – string Column name representing actual values.

  • y_pred_col_name – string Column name representing predicted values.

  • labels – list of labels, default=None List of labels to index the matrix. This may be used to reorder or select a subset of labels. If None is given, those that appear at least once in the y true or y pred column are used in sorted order.

  • sample_weight_col_name – string, default=None Column name representing sample weights.

  • normalize – {‘true’, ‘pred’, ‘all’}, default=None Normalizes the confusion matrix over the true labels (rows), the predicted labels (columns), or the whole population. If None, the confusion matrix is not normalized.
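To make the three `normalize` modes concrete, here is a small pure-Python sketch applied to an already-computed count matrix. `normalize_matrix` is a hypothetical helper, not part of `snowflake.ml`, and returning 0.0 when a row, column, or the whole matrix sums to zero is an assumption of this sketch rather than documented behaviour.

```python
def normalize_matrix(matrix, normalize=None):
    """Normalize a square count matrix by rows, columns, or the grand total."""
    if normalize not in ("true", "pred", "all", None):
        raise ValueError("normalize must be one of {'true', 'pred', 'all', None}")
    if normalize is None:
        return [row[:] for row in matrix]

    n = len(matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    total = sum(row_sums)

    def ratio(value, denominator):
        # Assumption: an empty row, column, or matrix yields 0.0.
        return value / denominator if denominator else 0.0

    if normalize == "true":
        # Each row (true label) sums to 1.
        return [[ratio(matrix[i][j], row_sums[i]) for j in range(n)] for i in range(n)]
    if normalize == "pred":
        # Each column (predicted label) sums to 1.
        return [[ratio(matrix[i][j], col_sums[j]) for j in range(n)] for i in range(n)]
    # "all": the whole matrix sums to 1.
    return [[ratio(matrix[i][j], total) for j in range(n)] for i in range(n)]


counts = [[3, 1], [2, 4]]
by_true = normalize_matrix(counts, "true")
by_pred = normalize_matrix(counts, "pred")
by_all = normalize_matrix(counts, "all")
```

With `"true"` the first row becomes 3/4 and 1/4, with `"pred"` the first column becomes 3/5 and 2/5, and with `"all"` every entry is divided by the total of 10.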

Returns:

C – ndarray of shape (n_classes, n_classes)

Confusion matrix whose entry in the i-th row and j-th column is the number of samples whose true label is the i-th class and whose predicted label is the j-th class.

Return type:

Union[ndarray[Any, dtype[int64]], ndarray[Any, dtype[float64]]]

Raises:
  • ValueError – The given list of labels is empty.

  • ValueError – None of the given labels appears in the y true column.

  • ValueError – normalize is not one of {‘true’, ‘pred’, ‘all’, None}.
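A call might look like the sketch below, which uses only the keyword arguments from the signature above. The helper name `evaluate_predictions`, the toy rows, and the column names `Y_TRUE` and `Y_PRED` are all illustrative assumptions, and `session` is assumed to be an already-connected Snowpark `Session` supplied by the caller.

```python
def evaluate_predictions(session):
    """Hypothetical helper: build a toy DataFrame and compute its
    row-normalized confusion matrix.

    `session` is assumed to be an existing snowflake.snowpark.Session.
    """
    # Imported lazily so the sketch can be read without a live connection.
    from snowflake.ml.modeling.metrics import confusion_matrix

    # Toy data; the column names are illustrative, not required by the API.
    df = session.create_dataframe(
        [[0, 0], [1, 1], [0, 1], [1, 1], [1, 0], [0, 0]],
        schema=["Y_TRUE", "Y_PRED"],
    )

    try:
        return confusion_matrix(
            df=df,
            y_true_col_name="Y_TRUE",
            y_pred_col_name="Y_PRED",
            labels=[0, 1],
            normalize="true",
        )
    except ValueError as exc:
        # Per the Raises list: empty labels, labels absent from the
        # y true column, or an unsupported normalize value.
        print(f"confusion_matrix rejected the arguments: {exc}")
        raise
```

All arguments are passed by keyword because the signature marks them as keyword-only (the leading `*`).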