F1-Score in Keras

Precision, recall, and F1 were removed as metrics in Keras 2.0

First, the precision, recall, and F1 metrics were all removed in Keras 2.0. The reason is that when the validation set is evaluated in batches, Keras computes the average of the per-batch results, which is not a meaningful value for these metrics.

https://github.com/keras-team/keras/commit/a56b1a55182acf061b1eb2e2c86b48193a0e88f7

Losses & metrics

Several legacy metric functions have been removed, namely matthews_correlation, precision, recall, fbeta_score, and fmeasure.

https://github.com/keras-team/keras/wiki/Keras-2.0-release-notes

Using a Keras metric function is not the right way to calculate F1 or AUC or something like that. The reason for this is that the metric function is called at each batch step at validation. That way the Keras system calculates an average on the batch results. And that is not the right F1 score.

https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras
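A minimal sketch (not from the original post, using scikit-learn) showing concretely that averaging per-batch F1 scores is not the same as the F1 score computed over the whole validation set:

```python
import numpy as np
from sklearn.metrics import f1_score

# A tiny "validation set" split into two batches of 4 samples
y_true = np.array([1, 1, 0, 0, 1, 0, 0, 0])
y_pred = np.array([1, 0, 0, 0, 0, 1, 0, 0])

# F1 on each batch, then averaged -- what per-batch metrics effectively do
batch_f1 = [f1_score(y_true[i:i + 4], y_pred[i:i + 4]) for i in (0, 4)]
avg_of_batches = np.mean(batch_f1)   # 1/3

# F1 over the whole set at once -- the correct value
global_f1 = f1_score(y_true, y_pred)  # 0.4

print(avg_of_batches, global_f1)
```

Here the batch average (1/3) and the global F1 (0.4) disagree, because F1 is not decomposable into a mean of per-batch scores.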

Why the metric-function approach fails

Because the dataset is too large, the validation set cannot fit into GPU memory all at once and must be predicted in batches. So even if you pass a self-written f1-score metric to Keras at compile time, it does not help, for the reason above.

Because the validation set is processed in batches, the following code computes the average of the per-batch results, so the f1-score it reports is wrong.

metrics = ['accuracy', f1]  # f1 here is a self-written batch-level metric function

self.model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=metrics)

Solution

Use a Keras Callback: at the end of each epoch, run prediction on all validation samples and compute the f1-score over the full set.

import math

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import f1_score, precision_recall_fscore_support


class Metrics(Callback):
    def __init__(self, data_set, data_type):
        super().__init__()
        self.data_set = data_set
        self.data_type = data_type

    def on_epoch_end(self, epoch, logs=None):
        ground_truths = []
        predict_results = []
        # Predict the validation set batch by batch. Note that math.floor
        # drops the last partial batch when the validation size is not a
        # multiple of batch_size.
        for i in range(math.floor(len(self.data_set.validate) / self.data_set.batch_size)):
            X, y = self.data_set.validation_data(self.data_type, i)
            results = self.model.predict_on_batch(np.array(X))
            ground_truths.extend(np.array(y).argmax(axis=1))
            predict_results.extend(np.array(results).argmax(axis=1))
        # Per-class precision / recall / f1 and support over the whole validation set
        p_class, r_class, f_class, support = precision_recall_fscore_support(ground_truths, predict_results)
        with open("val_f1_score_callback.txt", "a") as file:
            file.write("precision:{},recall:{},f1_score:{},label_number:{}\n".format(
                p_class, r_class, f_class, support))
            file.write("avg f1-score:{}\n".format(
                f1_score(ground_truths, predict_results, average="macro", zero_division=1)))
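The callback logs lists of values because precision_recall_fscore_support, with no average argument, returns one entry per class. A small self-contained illustration (example labels are made up):

```python
from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 0, 0]

# Each return value is a per-class array: precision, recall, f1, and
# support (the number of true samples of each class)
p, r, f, support = precision_recall_fscore_support(y_true, y_pred)

print(p)        # per-class precision, e.g. class 0 -> 2/3
print(support)  # [2 2 2]: two true samples of each class
```

This is why the text file written by the callback contains arrays per line, plus a single macro-averaged f1-score.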