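Python's sklearn.metrics module provides evaluation metrics for many tasks. For classification these include the confusion matrix, mean per-class accuracy, per-class accuracy, overall accuracy, and F1-score; it also has built-in functions for regression, clustering, and other tasks.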

1. Classification - Confusion Matrix

sklearn.metrics.confusion_matrix

from sklearn.metrics import confusion_matrix

Computes the confusion matrix, which is used to evaluate the accuracy of a classifier.

Let ${ C }$ denote the confusion matrix. Element ${ C_{ij} }$ is the number of samples with gt_label=i and pred_label=j, where i and j are class labels.

In binary classification, the number of true negatives is ${ C_{0,0} }$, false negatives is ${ C_{1,0} }$, true positives is ${ C_{1,1} }$, and false positives is ${ C_{0,1} }$.
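The definition of ${ C_{ij} }$ can be checked by counting (ground-truth, predicted) pairs by hand. The following is a minimal sketch, not the library's implementation: `manual_confusion_matrix` is a hypothetical helper, the sample labels are made up for illustration, and labels are assumed to be the integers 0..n_classes-1.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def manual_confusion_matrix(gt_labels, pred_labels, n_classes):
    """Hypothetical helper: count samples for each (gt, pred) pair.

    Assumes labels are integers in 0..n_classes-1.
    """
    C = np.zeros((n_classes, n_classes), dtype=int)
    for gt, pred in zip(gt_labels, pred_labels):
        C[gt, pred] += 1  # row = ground truth, column = prediction
    return C


# Illustrative binary labels (not from the original article)
gt_labels = [0, 1, 0, 1, 1, 0]
pred_labels = [0, 1, 1, 1, 0, 0]

C = manual_confusion_matrix(gt_labels, pred_labels, n_classes=2)

# The hand count agrees with sklearn's result
same_as_sklearn = np.array_equal(C, confusion_matrix(gt_labels, pred_labels))

# Binary case: read the four cells by index
tn, fp, fn, tp = C[0, 0], C[0, 1], C[1, 0], C[1, 1]
print(C)
print(tn, fp, fn, tp)
```

Rows index the ground-truth label and columns index the predicted label, which is why false positives sit at ${ C_{0,1} }$ and false negatives at ${ C_{1,0} }$.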

Usage:

C = confusion_matrix(gt_labels, pred_labels, labels=None, sample_weight=None)
# C is the n_classes x n_classes confusion matrix

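where,

[1] - gt_labels - ground-truth label values

[2] - pred_labels - label values predicted by the classifier

[3] - labels - list of labels used to index the confusion matrix; it can reorder the labels or select a subset of them. If None, the labels that appear at least once in gt_labels or pred_labels are used in sorted order.

[4] - sample_weight - optional per-sample weights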
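The effect of the labels and sample_weight arguments is easiest to see side by side. The sketch below reuses string labels of the kind used in this article; the variable names (`C_default`, `C_reordered`, `C_subset`, `C_weighted`) and the weight values are illustrative assumptions.

```python
from sklearn.metrics import confusion_matrix

gt_labels = ["cat", "ant", "cat", "cat", "ant", "bird"]
pred_labels = ["ant", "ant", "cat", "cat", "ant", "cat"]

# labels=None: rows/columns follow the sorted labels (ant, bird, cat)
C_default = confusion_matrix(gt_labels, pred_labels)

# Explicit order: rows/columns follow the list as given (cat, bird, ant)
C_reordered = confusion_matrix(gt_labels, pred_labels,
                               labels=["cat", "bird", "ant"])

# Subset: only samples whose true and predicted labels are both listed
# are counted, so the single "bird" sample is left out
C_subset = confusion_matrix(gt_labels, pred_labels, labels=["ant", "cat"])

# sample_weight: each sample contributes its weight instead of 1;
# here the last sample (bird predicted as cat) counts twice
C_weighted = confusion_matrix(gt_labels, pred_labels,
                              sample_weight=[1, 1, 1, 1, 1, 2])

print(C_default)
print(C_reordered)
print(C_subset)
print(C_weighted)
```

Passing labels explicitly is also a simple way to keep the matrix shape stable when some class happens to be absent from a particular batch.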

Example 1:

from sklearn.metrics import confusion_matrix
gt_labels = [2, 0, 2, 2, 0, 1]
pred_labels = [0, 0, 2, 2, 0, 2]
confusion_matrix(gt_labels, pred_labels)
# array([[2, 0, 0],
#        [0, 0, 1],
#        [1, 0, 2]])

Example 2:

from sklearn.metrics import confusion_matrix
gt_labels = ["cat", "ant", "cat", "cat", "ant", "bird"]
pred_labels = ["ant", "ant", "cat", "cat", "ant", "cat"]
confusion_matrix(gt_labels, pred_labels, labels=["ant", "bird", "cat"])
# array([[2, 0, 0],
#        [0, 0, 1],
#        [1, 0, 2]])

Example 3:

Binary classification:

from sklearn.metrics import confusion_matrix
tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0]).ravel()
#(tn, fp, fn, tp)
#(0, 2, 1, 1)
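The accuracy figures mentioned at the start (overall, per-class, and mean per-class accuracy) can be derived directly from the confusion matrix. This is a hedged sketch using the labels from Example 1; per-class accuracy is taken here to mean the fraction of each ground-truth class that is predicted correctly (i.e. per-class recall), which is an assumption about the intended definition. The variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             confusion_matrix, recall_score)

gt_labels = [2, 0, 2, 2, 0, 1]
pred_labels = [0, 0, 2, 2, 0, 2]

C = confusion_matrix(gt_labels, pred_labels)

# Overall accuracy: correctly classified samples / all samples
overall_acc = np.trace(C) / C.sum()

# Per-class accuracy: diagonal entry / number of ground-truth samples
# in that class (row sum); assumes every class has at least one sample
per_class_acc = np.diag(C) / C.sum(axis=1)

# Mean per-class accuracy: unweighted average over classes
mean_class_acc = per_class_acc.mean()

# Cross-check against the corresponding sklearn.metrics functions
sk_overall = accuracy_score(gt_labels, pred_labels)
sk_per_class = recall_score(gt_labels, pred_labels, average=None)
sk_mean = balanced_accuracy_score(gt_labels, pred_labels)

print(overall_acc, per_class_acc, mean_class_acc)
```

The mean per-class accuracy weights every class equally, so it can differ noticeably from the overall accuracy when classes are imbalanced.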
Last modification: April 28th, 2021 at 10:40 am