Mutual information (MI) is a measure of the similarity between two labelings of the same data. It is defined as:

$$\mathrm{MI}(U,V)=\sum_{i=1}^{|U|}\sum_{j=1}^{|V|}\frac{|U_i\cap V_j|}{N}\log\frac{N\,|U_i\cap V_j|}{|U_i|\,|V_j|}$$

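Here |U_i| is the number of samples in cluster U_i, |V_j| is the number of samples in cluster V_j, |U_i ∩ V_j| is the number of samples that fall in both, and N is the total number of samples.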
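The definition can be checked by hand. The following is a minimal sketch, assuming NumPy and scikit-learn are installed; the helper `mutual_information` and the example labelings `u` and `v` are illustrative names, not part of the original post. It evaluates the double sum directly and compares the result with `sklearn.metrics.mutual_info_score`, which uses the natural logarithm.

```python
import numpy as np
from sklearn.metrics import mutual_info_score


def mutual_information(labels_u, labels_v):
    """MI in nats, computed directly from the counts |U_i|, |V_j|, |U_i ∩ V_j|."""
    labels_u = np.asarray(labels_u)
    labels_v = np.asarray(labels_v)
    n = labels_u.size
    mi = 0.0
    for u_label in np.unique(labels_u):
        in_u = labels_u == u_label
        for v_label in np.unique(labels_v):
            in_v = labels_v == v_label
            n_uv = np.sum(in_u & in_v)  # |U_i ∩ V_j|
            if n_uv == 0:
                continue  # empty intersections contribute nothing to the sum
            mi += (n_uv / n) * np.log(n * n_uv / (in_u.sum() * in_v.sum()))
    return mi


# Hypothetical example labelings of six samples
u = [0, 0, 1, 1, 2, 2]
v = [0, 0, 1, 2, 2, 2]

mi_manual = mutual_information(u, v)
mi_sklearn = mutual_info_score(u, v)
print('[INFO]MI (manual): ', mi_manual)
print('[INFO]MI (sklearn):', mi_sklearn)
```

The two values should agree up to floating-point error.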

MI does not depend on the absolute label values: permuting the class or cluster label values leaves the MI score unchanged.

MI is also symmetric: MI(U, V) = MI(V, U).
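Both properties are easy to verify numerically. The snippet below is a small sketch, assuming scikit-learn is available; the labelings and the particular permutation are made up for illustration.

```python
from sklearn.metrics import mutual_info_score

# Hypothetical example labelings of six samples
u = [0, 0, 1, 1, 2, 2]
v = [0, 0, 1, 2, 2, 2]

# Rename the labels of u (0 -> 2, 1 -> 0, 2 -> 1); the grouping itself is unchanged
rename = {0: 2, 1: 0, 2: 1}
u_perm = [rename[x] for x in u]

mi_uv = mutual_info_score(u, v)
mi_perm = mutual_info_score(u_perm, v)  # permutation invariance
mi_vu = mutual_info_score(v, u)         # symmetry

print('[INFO]MI(u, v):      ', mi_uv)
print('[INFO]MI(u_perm, v): ', mi_perm)
print('[INFO]MI(v, u):      ', mi_vu)
```

All three printed values should match up to floating-point error.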

Two variants of MI are in common use: normalized mutual information (NMI) and adjusted mutual information (AMI). Of the two, NMI appears more often in papers.

1. NMI - sklearn

sklearn.metrics.normalized_mutual_info_score

    from sklearn.metrics.cluster import normalized_mutual_info_score

    # Two identical labelings of four samples
    c1 = [0, 0, 1, 1]
    c2 = [0, 0, 1, 1]
    nmi = normalized_mutual_info_score(c1, c2)
    print('[INFO]NMI: ', nmi)  # 1.0
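To see what the normalization does, NMI can be rebuilt from MI and the entropies of the two labelings. This is a sketch under one stated assumption: the normalizer is the arithmetic mean of H(U) and H(V), which is requested explicitly through `average_method="arithmetic"` (other choices such as `"geometric"`, `"min"` and `"max"` give different values). The `entropy` helper and the example labelings are illustrative.

```python
import numpy as np
from sklearn.metrics import mutual_info_score, normalized_mutual_info_score


def entropy(labels):
    """Shannon entropy (in nats) of a labeling."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))


# Hypothetical example labelings of six samples
c1 = [0, 0, 1, 1, 2, 2]
c2 = [0, 0, 1, 2, 2, 2]

mi = mutual_info_score(c1, c2)
# Assumption: normalize by the arithmetic mean of the two entropies
nmi_manual = mi / ((entropy(c1) + entropy(c2)) / 2)
nmi_sklearn = normalized_mutual_info_score(c1, c2, average_method="arithmetic")

print('[INFO]NMI (manual): ', nmi_manual)
print('[INFO]NMI (sklearn):', nmi_sklearn)
```

Because MI never exceeds either entropy, the result lies between 0 and 1, with 1 meaning the two labelings agree up to a renaming of labels.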

2. AMI - sklearn

sklearn.metrics.adjusted_mutual_info_score

$$\mathrm{AMI}(U,V)=\frac{\mathrm{MI}(U,V)-E(\mathrm{MI}(U,V))}{\operatorname{avg}(H(U),H(V))-E(\mathrm{MI}(U,V))}$$

    from sklearn.metrics.cluster import adjusted_mutual_info_score

    # Two identical labelings of four samples
    c1 = [0, 0, 1, 1]
    c2 = [0, 0, 1, 1]
    ami = adjusted_mutual_info_score(c1, c2)
    print('[INFO]AMI: ', ami)  # 1.0
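The E(MI(U,V)) term in the AMI formula removes the agreement expected by chance. A rough way to see its effect is to score two unrelated random labelings. The snippet below is only an illustration: the sample size, number of clusters, and seed are arbitrary choices, and exact values will vary with them.

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score, normalized_mutual_info_score

# Two independent random labelings: 200 samples, 10 clusters each (arbitrary choices)
rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=200)
b = rng.integers(0, 10, size=200)

nmi = normalized_mutual_info_score(a, b)
ami = adjusted_mutual_info_score(a, b)

print('[INFO]NMI on random labels: ', nmi)
print('[INFO]AMI on random labels: ', ami)
```

NMI stays clearly above zero even though the labelings are unrelated, while AMI should land close to zero. This is the usual reason to prefer AMI when the number of clusters is large relative to the number of samples.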
Last modification:April 30th, 2021 at 04:12 pm