snowflake.ml.modeling.metrics.confusion_matrix
- snowflake.ml.modeling.metrics.confusion_matrix(*, df: DataFrame, y_true_col_name: str, y_pred_col_name: str, labels: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] | None = None, sample_weight_col_name: str | None = None, normalize: str | None = None) → ndarray[Any, dtype[int64]] | ndarray[Any, dtype[float64]]
Compute a confusion matrix to evaluate the accuracy of a classification.
By definition a confusion matrix \(C\) is such that \(C_{i, j}\) is equal to the number of observations known to be in group \(i\) and predicted to be in group \(j\).
Thus in binary classification, the count of true negatives is \(C_{0,0}\), false negatives is \(C_{1,0}\), true positives is \(C_{1,1}\) and false positives is \(C_{0,1}\).
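The definition above can be illustrated with a small NumPy-only sketch. It is not the library's implementation (which operates on a Snowpark DataFrame); the helper name `confusion_from_definition` and the sample labels are assumptions made for illustration.

```python
import numpy as np


def confusion_from_definition(y_true, y_pred, labels=None):
    """Hypothetical helper: C[i, j] counts samples whose true label is
    labels[i] and whose predicted label is labels[j]."""
    y_true = np.asarray(y_true).tolist()
    y_pred = np.asarray(y_pred).tolist()
    if labels is None:
        # Every label seen in either column, in sorted order.
        labels = sorted(set(y_true) | set(y_pred))
    else:
        labels = np.asarray(labels).tolist()
    index = {label: k for k, label in enumerate(labels)}
    C = np.zeros((len(labels), len(labels)), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        # Samples whose labels are not in `labels` are skipped.
        if t in index and p in index:
            C[index[t], index[p]] += 1
    return C


# Made-up binary example: 0 is the negative class, 1 the positive class.
y_true = [0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 0, 1]

C = confusion_from_definition(y_true, y_pred)
# Row-major order of a 2x2 matrix: C[0,0], C[0,1], C[1,0], C[1,1].
tn, fp, fn, tp = C.ravel().tolist()
print(C)
print(tn, fp, fn, tp)
```

Reading the matrix in row-major order gives true negatives, false positives, false negatives, and true positives, matching the index convention stated above.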
- Args:
df: Input dataframe.
y_true_col_name: Column name representing actual values.
y_pred_col_name: Column name representing predicted values.
labels: List of labels to index the matrix. This may be used to reorder or select a subset of labels. If None is given, the labels that appear at least once in the y true or y pred column are used in sorted order.
sample_weight_col_name: Column name representing sample weights.
normalize: {‘true’, ‘pred’, ‘all’}, default=None. Normalizes the confusion matrix over the true (rows) or predicted (columns) conditions, or over the whole population. If None, the confusion matrix is not normalized.
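The three `normalize` modes can be sketched on a plain matrix of counts. This is a NumPy-only illustration under stated assumptions, not the library's code: the helper name `normalize_counts` is hypothetical, the counts are made up, and mapping empty rows or columns to 0 is an assumption of this sketch.

```python
import numpy as np


def normalize_counts(C, normalize=None):
    """Hypothetical helper showing what each `normalize` option divides by."""
    if normalize is None:
        return np.asarray(C)  # raw counts, unchanged
    C = np.asarray(C, dtype=np.float64)
    if normalize == "true":
        denom = C.sum(axis=1, keepdims=True)  # row sums: per true label
    elif normalize == "pred":
        denom = C.sum(axis=0, keepdims=True)  # column sums: per predicted label
    elif normalize == "all":
        denom = C.sum()                       # total number of samples
    else:
        raise ValueError("normalize must be one of {'true', 'pred', 'all', None}")
    with np.errstate(divide="ignore", invalid="ignore"):
        out = C / denom
    # Assumption of this sketch: 0/0 entries are reported as 0.
    return np.nan_to_num(out)


counts = np.array([[2, 2],
                   [1, 5]])

by_true = normalize_counts(counts, "true")  # each row sums to 1
by_pred = normalize_counts(counts, "pred")  # each column sums to 1
by_all = normalize_counts(counts, "all")    # all entries sum to 1
print(by_true, by_pred, by_all, sep="\n")
```

With `'true'` the diagonal holds per-class recall, and with `'pred'` it holds per-class precision, which is often the reason to pick one mode over the other.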
- Returns:
- C: ndarray of shape (n_classes, n_classes)
Confusion matrix whose i-th row and j-th column entry indicates the number of samples with true label being i-th class and predicted label being j-th class.
- Raises:
ValueError: The given labels is empty.
ValueError: No label specified in the given labels is in the y true column.
ValueError: normalize is not one of {‘true’, ‘pred’, ‘all’, None}.
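The three documented error conditions can be mirrored in a few lines of plain Python. The function `check_arguments` below is a hypothetical stand-in written for illustration; the order of the checks and the message texts are assumptions, not taken from the library.

```python
import numpy as np


def check_arguments(y_true, labels=None, normalize=None):
    """Hypothetical pre-check mirroring the documented ValueError cases."""
    if normalize not in ("true", "pred", "all", None):
        raise ValueError("normalize must be one of {'true', 'pred', 'all', None}")
    if labels is not None:
        labels = np.asarray(labels)
        if labels.size == 0:
            raise ValueError("labels is empty")
        # At least one requested label must occur among the true values.
        if not np.isin(labels, np.asarray(y_true)).any():
            raise ValueError("no label in labels appears in the y true column")


y_true = [0, 1, 1, 0]

for kwargs in (
    {"labels": []},           # empty labels
    {"labels": [7, 8]},       # labels absent from y_true
    {"normalize": "rows"},    # unsupported normalize value
):
    try:
        check_arguments(y_true, **kwargs)
    except ValueError as exc:
        print(f"{kwargs} -> ValueError: {exc}")
```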