snowflake.ml.modeling.metrics.f1_score
- snowflake.ml.modeling.metrics.f1_score(*, df: DataFrame, y_true_col_names: Union[str, List[str]], y_pred_col_names: Union[str, List[str]], labels: Optional[Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]] = None, pos_label: Union[str, int] = 1, average: Optional[str] = 'binary', sample_weight_col_name: Optional[str] = None, zero_division: Union[str, int] = 'warn') → Union[float, ndarray[Any, dtype[float64]]]
- Compute the F1 score, also known as balanced F-score or F-measure.
- The F1 score can be interpreted as the harmonic mean of precision and recall. It reaches its best value at 1 and its worst value at 0. Precision and recall contribute equally to the F1 score. The formula for the F1 score is:
- F1 = 2 * (precision * recall) / (precision + recall)
- In the multiclass and multilabel cases, this is the average of the F1 score of each class, with the weighting determined by the average parameter.
- Parameters:
- df – snowpark.DataFrame Input dataframe. 
- y_true_col_names – string or list of strings Column name(s) representing actual values. 
- y_pred_col_names – string or list of strings Column name(s) representing predicted values. 
- labels – list of labels, default=None The set of labels to include when average != 'binary', and their order if average is None. Labels present in the data can be excluded, for example to calculate a multiclass average ignoring a majority negative class, while labels not present in the data will result in 0 components in a macro average. For multilabel targets, labels are column indices. By default, all labels in the y true and y pred columns are used in sorted order.
- pos_label – string or integer, default=1 The class to report if average='binary' and the data is binary. If the data are multiclass or multilabel, this will be ignored; setting labels=[pos_label] and average != 'binary' will report scores for that label only.
- average – {'micro', 'macro', 'samples', 'weighted', 'binary'} or None, default='binary' This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- 'binary'
- Only report results for the class specified by pos_label. This is applicable only if targets (y true, y pred) are binary.
- 'micro'
- Calculate metrics globally by counting the total true positives, false negatives and false positives. 
- 'macro'
- Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account. 
- 'weighted'
- Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall. 
- 'samples'
- Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).
 
- sample_weight_col_name – string, default=None Column name representing sample weights. 
- zero_division – "warn", 0 or 1, default="warn" Sets the value to return when there is a zero division, i.e. when all predictions and labels are negative. If set to "warn", this acts as 0, but warnings are also raised.
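The averaging modes described for the average parameter can be illustrated with a small plain-Python sketch. It is not the library's implementation, and it does not touch Snowflake at all: the helper names per_class_counts, f1_from_counts and averaged_f1 are hypothetical, the inputs are ordinary Python lists rather than DataFrame columns, sample weights are ignored, and only None, 'macro', 'weighted' and 'micro' are covered for single-label multiclass data.

```python
def per_class_counts(y_true, y_pred, labels):
    """Return {label: (tp, fp, fn)} for single-label (non-multilabel) targets."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        counts[label] = (tp, fp, fn)
    return counts


def f1_from_counts(tp, fp, fn, zero_division=0.0):
    """F1 = 2*tp / (2*tp + fp + fn), which equals 2*P*R / (P + R) when defined."""
    denominator = 2 * tp + fp + fn
    # No positive labels and no positive predictions: fall back to zero_division.
    return 2 * tp / denominator if denominator else zero_division


def averaged_f1(y_true, y_pred, average, labels=None):
    """Hypothetical helper: per-class, macro, weighted or micro F1."""
    if labels is None:
        # Default: every label seen in either column, in sorted order.
        labels = sorted(set(y_true) | set(y_pred))
    counts = per_class_counts(y_true, y_pred, labels)
    per_class = [f1_from_counts(*counts[label]) for label in labels]

    if average is None:
        return per_class  # one score per label, in label order
    if average == "macro":
        return sum(per_class) / len(per_class)  # unweighted mean
    if average == "weighted":
        # Support = number of true instances of each label (tp + fn).
        support = [counts[label][0] + counts[label][2] for label in labels]
        total = sum(support)
        if total == 0:
            return 0.0
        return sum(f * s for f, s in zip(per_class, support)) / total
    if average == "micro":
        # Pool the counts over all labels, then compute a single F1.
        tp = sum(c[0] for c in counts.values())
        fp = sum(c[1] for c in counts.values())
        fn = sum(c[2] for c in counts.values())
        return f1_from_counts(tp, fp, fn)
    raise ValueError(f"average={average!r} is not covered by this sketch")


y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]
print(averaged_f1(y_true, y_pred, None))               # [0.8, 0.0, 0.0]
print(round(averaged_f1(y_true, y_pred, "macro"), 4))  # 0.2667
print(round(averaged_f1(y_true, y_pred, "micro"), 4))  # 0.3333
```

In this toy data every label has the same support, so 'weighted' gives the same value as 'macro'; the two diverge only when the classes are imbalanced.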
 
- Returns:
- f1_score - float or array of float, shape = [n_unique_labels]
- F1 score of the positive class in binary classification or weighted average of the F1 scores of each class for the multiclass task.
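As a worked instance of the formula F1 = 2 * (precision * recall) / (precision + recall), the following plain-Python sketch computes the binary score for a chosen positive class. The function binary_f1 is a hypothetical helper for illustration, not part of snowflake.ml; it operates on Python lists instead of DataFrame columns, and it simply returns 0.0 when there are no true positives, which only approximates the documented zero_division behavior.

```python
from statistics import harmonic_mean


def binary_f1(y_true, y_pred, pos_label=1):
    """Hypothetical helper: F1 of the class pos_label from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    if tp == 0:
        # Precision or recall is zero or undefined; treat the score as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * (precision * recall) / (precision + recall)


y_true = [1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]

# tp = 3, fp = 1, fn = 2, so precision = 0.75 and recall = 0.6.
score = binary_f1(y_true, y_pred)
print(round(score, 4))                         # 0.6667
print(round(harmonic_mean([0.75, 0.6]), 4))    # 0.6667, the same value
```

Passing pos_label=0 to the same helper scores the other class instead, which is the role pos_label plays when average='binary'.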