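Mutual Information (MI) is a measure of the similarity between two labelings of the same data. Its standard definition is:
$$\mathrm{MI}(U, V) = \sum_{i=1}^{|U|} \sum_{j=1}^{|V|} \frac{|U_i \cap V_j|}{N} \log \frac{N \, |U_i \cap V_j|}{|U_i| \, |V_j|}$$
where $|U_i|$ is the number of samples in cluster $U_i$, $|V_j|$ is the number of samples in cluster $V_j$, and $N$ is the total number of samples.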
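As a rough sketch of how MI can be computed by hand, the snippet below evaluates the usual contingency-count form, MI(U, V) = sum over (i, j) of (n_ij / N) * log(N * n_ij / (|U_i| * |V_j|)), in plain Python. The function name `mutual_information` is made up for illustration, and the natural logarithm (result in nats) is an assumption chosen to match sklearn's `mutual_info_score`.

```python
from collections import Counter
from math import log


def mutual_information(labels_u, labels_v):
    """MI (in nats) between two labelings of the same samples.

    Hypothetical helper for illustration; not part of sklearn.
    """
    if len(labels_u) != len(labels_v):
        raise ValueError("both labelings must cover the same samples")
    n = len(labels_u)
    size_u = Counter(labels_u)                # |U_i|: samples per cluster in U
    size_v = Counter(labels_v)                # |V_j|: samples per cluster in V
    joint = Counter(zip(labels_u, labels_v))  # |U_i ∩ V_j|: co-occurrence counts
    mi = 0.0
    # Only non-empty intersections contribute; empty ones add zero.
    for (u, v), n_uv in joint.items():
        mi += (n_uv / n) * log(n * n_uv / (size_u[u] * size_v[v]))
    return mi


# Identical labelings: MI equals the entropy of the labeling, log(2) here.
mi_same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
# Labelings that carry no information about each other: MI is 0.
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(mi_same)   # log(2), about 0.693
print(mi_indep)  # 0.0
```

Note that the raw MI of two identical labelings is not 1 but the entropy of the labeling, which is one reason the normalized variants are usually reported instead.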
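MI does not depend on the absolute label values: permuting the class or cluster label values does not change the MI result.
MI is also symmetric: swapping the two labelings gives the same value.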
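Both properties can be checked numerically. The following is a minimal sketch using sklearn's `mutual_info_score`; the label lists and the `renaming` map are arbitrary illustrative values, not taken from any dataset.

```python
from math import isclose

from sklearn.metrics import mutual_info_score

a = [0, 0, 0, 1, 1, 2]
b = [0, 0, 1, 1, 2, 2]

# Rename the clusters of `a` (0 -> 2, 1 -> 0, 2 -> 1).
# The partition of the samples is unchanged; only the label values differ.
renaming = {0: 2, 1: 0, 2: 1}
a_renamed = [renaming[x] for x in a]

mi_ab = mutual_info_score(a, b)
mi_renamed = mutual_info_score(a_renamed, b)  # permuted label values
mi_ba = mutual_info_score(b, a)               # arguments swapped

print(isclose(mi_ab, mi_renamed, abs_tol=1e-9))  # permutation invariance
print(isclose(mi_ab, mi_ba, abs_tol=1e-9))       # symmetry
```

The comparisons use a small tolerance rather than `==` because the summation order may differ between calls.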
MI is commonly used in two forms: Normalized Mutual Information (NMI) and Adjusted Mutual Information (AMI). Of the two, NMI is the one more often reported in papers.
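To illustrate how the two forms can differ, here is a small sketch that scores two unrelated random labelings. AMI is adjusted for chance agreement, so it should land near zero, while NMI typically stays noticeably above zero when there are many clusters and few samples. The sample size, cluster count, and seed below are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score, normalized_mutual_info_score

# Two independent random labelings: 100 samples, 10 possible clusters each.
rng = np.random.default_rng(0)
u = rng.integers(0, 10, size=100)
v = rng.integers(0, 10, size=100)

nmi_random = normalized_mutual_info_score(u, v)
ami_random = adjusted_mutual_info_score(u, v)

print('[INFO]NMI on random labels: ', nmi_random)
print('[INFO]AMI on random labels: ', ami_random)
```

When comparing clusterings with different numbers of clusters, this chance correction is the main practical reason to consider AMI alongside NMI.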
1. NMI - sklearn
from sklearn.metrics.cluster import normalized_mutual_info_score
# Two identical labelings of the same four samples
c1 = [0, 0, 1, 1]
c2 = [0, 0, 1, 1]
nmi = normalized_mutual_info_score(c1, c2)
print('[INFO]NMI: ', nmi)
# 1.0
2. AMI - sklearn
from sklearn.metrics.cluster import adjusted_mutual_info_score
# Two identical labelings of the same four samples
c1 = [0, 0, 1, 1]
c2 = [0, 0, 1, 1]
ami = adjusted_mutual_info_score(c1, c2)
print('[INFO]AMI: ', ami)
# 1.0