Diffstat (limited to 'SI/Resource/Fundamentals of Data Mining/Content/NMI.md')
| -rw-r--r-- | SI/Resource/Fundamentals of Data Mining/Content/NMI.md | 21 |
1 files changed, 21 insertions, 0 deletions
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/NMI.md b/SI/Resource/Fundamentals of Data Mining/Content/NMI.md
new file mode 100644
index 0000000..d886599
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/NMI.md
@@ -0,0 +1,21 @@
+---
+id: NMI
+aliases:
+  - Normalized mutual information (NMI)
+tags: []
+---
+
+## Normalized mutual information (NMI)
+
+- Mutual information:
+  - Quantifies the amount of information shared between the clustering $C$ and the
+    ground-truth partitioning $T$: $I(C,T) = \sum_{i=1}^{r}\sum_{j=1}^{k}p_{ij}\log\dfrac{p_{ij}}{p_{C_i}p_{T_j}}$
+  - Measures the dependency between the observed joint probability $p_{ij}$ of
+    $C$ and $T$, and the expected joint probability $p_{C_i} p_{T_j}$ under the
+    independence assumption
+  - When $C$ and $T$ are independent, $p_{ij} = p_{C_i} p_{T_j}$, so $I(C, T) = 0$.
+    However, there is no fixed upper bound on the mutual information
+- **Normalized mutual information (NMI)** $$NMI(C, T) =
+  \sqrt{\dfrac{I(C,T)}{H(C)} \cdot \dfrac{I(C, T)}{H(T)}} = \dfrac{I(C,
+  T)}{\sqrt{H(C) \cdot H(T)}}$$
+  - Value range of NMI: [0, 1]. A value close to 1 indicates a good clustering
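
To make the two formulas in the note concrete, here is a minimal Python sketch that computes $I(C,T)$ and NMI from a contingency table. It is an illustration, not part of the commit. The function names (`entropy`, `mutual_information`, `nmi`), the contingency-table input `counts[i][j]` (number of points in cluster $C_i$ and partition $T_j$), the use of the natural logarithm, and the convention of returning 0 when either entropy is zero are all assumptions made for this example.

```python
import math


def marginals(counts):
    """Return (p_C, p_T): row and column marginal probabilities of the table."""
    n = sum(sum(row) for row in counts)
    p_c = [sum(row) / n for row in counts]
    p_t = [sum(col) / n for col in zip(*counts)]
    return p_c, p_t


def entropy(probs):
    """Shannon entropy with natural log; zero-probability terms contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mutual_information(counts):
    """I(C, T) = sum_ij p_ij * log(p_ij / (p_Ci * p_Tj)), skipping empty cells."""
    n = sum(sum(row) for row in counts)
    p_c, p_t = marginals(counts)
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            if n_ij > 0:
                p_ij = n_ij / n
                mi += p_ij * math.log(p_ij / (p_c[i] * p_t[j]))
    return mi


def nmi(counts):
    """NMI(C, T) = I(C, T) / sqrt(H(C) * H(T))."""
    p_c, p_t = marginals(counts)
    h_c, h_t = entropy(p_c), entropy(p_t)
    if h_c == 0 or h_t == 0:
        # Assumed convention: a single cluster or single partition shares no information.
        return 0.0
    return mutual_information(counts) / math.sqrt(h_c * h_t)


if __name__ == "__main__":
    perfect = [[5, 0], [0, 5]]      # each cluster matches exactly one partition
    independent = [[2, 2], [2, 2]]  # p_ij equals p_Ci * p_Tj in every cell
    print(nmi(perfect), nmi(independent))
```

In the perfect case every cluster maps to exactly one partition, so $I(C,T) = H(C) = H(T)$ and NMI comes out at (numerically close to) 1. In the independent case every log term is $\log 1 = 0$, so both $I(C,T)$ and NMI are 0.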
