---
id: NMI
aliases:
  - Normalized mutual information (NMI)
tags: []
---

## Normalized mutual information (NMI)

- Mutual information:
  - Quantifies the amount of shared information between $C$ and $T$:
    $I(C,T) = \sum_{i=1}^{r}\sum_{j=1}^{k} p_{ij} \log\dfrac{p_{ij}}{p_{C_i} \cdot p_{T_j}}$
  - Measures the dependency between the observed joint probability $p_{ij}$ of
    $C$ and $T$, and the expected joint probability $p_{C_i} \cdot p_{T_j}$
    under the independence assumption
  - When $C$ and $T$ are independent, $p_{ij} = p_{C_i} \cdot p_{T_j}$, so
    $I(C, T) = 0$. However, mutual information has no fixed upper bound (it is
    limited only by the entropies $H(C)$ and $H(T)$), so raw values are hard to
    compare
- **Normalized mutual information (NMI)**:
  $$\mathrm{NMI}(C, T) =
  \sqrt{\dfrac{I(C, T)}{H(C)} \cdot \dfrac{I(C, T)}{H(T)}} =
  \dfrac{I(C, T)}{\sqrt{H(C) \cdot H(T)}}$$
  - NMI takes values in $[0, 1]$. A value close to 1 indicates a good clustering
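
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes the input is an $r \times k$ contingency table of counts (rows are clusters $C_i$, columns are groups $T_j$), uses the natural logarithm, takes $H$ to be the Shannon entropy of the marginals, and returns 0 when either entropy is 0. The function names, the input layout, and that last convention are assumptions made for this example.

```python
import math


def marginals(counts):
    """Turn a table of counts into joint probabilities p_ij and both marginals."""
    n = sum(sum(row) for row in counts)
    p = [[x / n for x in row] for row in counts]   # joint probabilities p_ij
    p_c = [sum(row) for row in p]                  # row marginals p_{C_i}
    p_t = [sum(col) for col in zip(*p)]            # column marginals p_{T_j}
    return p, p_c, p_t


def entropy(probs):
    """Shannon entropy (natural log); zero-probability terms contribute 0."""
    return -sum(q * math.log(q) for q in probs if q > 0)


def mutual_information(counts):
    """I(C, T) = sum_ij p_ij * log(p_ij / (p_{C_i} * p_{T_j}))."""
    p, p_c, p_t = marginals(counts)
    return sum(
        p_ij * math.log(p_ij / (p_c[i] * p_t[j]))
        for i, row in enumerate(p)
        for j, p_ij in enumerate(row)
        if p_ij > 0                                # skip empty cells (0 * log 0 = 0)
    )


def nmi(counts):
    """NMI(C, T) = I(C, T) / sqrt(H(C) * H(T))."""
    _, p_c, p_t = marginals(counts)
    h_c, h_t = entropy(p_c), entropy(p_t)
    if h_c == 0 or h_t == 0:
        return 0.0                                 # assumed convention for a degenerate table
    return mutual_information(counts) / math.sqrt(h_c * h_t)
```

For a table such as `[[5, 0], [0, 5]]`, where each cluster matches exactly one group, `nmi` returns 1 up to floating-point rounding. For `[[2, 2], [2, 2]]`, where $C$ and $T$ are independent, both `mutual_information` and `nmi` return 0.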