From 4d53fa14ee0cd615444aca6f6ba176e0ccc1b5be Mon Sep 17 00:00:00 2001
From: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com>
Date: Mon, 29 Apr 2024 22:06:12 -0400
Subject: init

---
 .../Fundamentals of Data Mining/Content/NMI.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 SI/Resource/Fundamentals of Data Mining/Content/NMI.md

diff --git a/SI/Resource/Fundamentals of Data Mining/Content/NMI.md b/SI/Resource/Fundamentals of Data Mining/Content/NMI.md
new file mode 100644
index 0000000..d886599
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/NMI.md
@@ -0,0 +1,21 @@
+---
+id: NMI
+aliases:
+  - Normalized mutual information (NMI)
+tags: []
+---
+
+## Normalized mutual information (NMI)
+
+- Mutual information:
+  - Quantifies the amount of shared information between a clustering $C$
+    with $r$ clusters and a ground-truth partitioning $T$ with $k$ groups:
+    $I(C, T) = \sum_{i=1}^{r}\sum_{j=1}^{k} p_{ij} \log\dfrac{p_{ij}}{p_{C_i} \cdot p_{T_j}}$
+  - Measures the dependency between the observed joint probability $p_{ij}$ of
+    $C$ and $T$, and the expected joint probability $p_{C_i} \cdot p_{T_j}$
+    under the independence assumption
+  - When $C$ and $T$ are independent, $p_{ij} = p_{C_i} \cdot p_{T_j}$, so
+    $I(C, T) = 0$. However, there is no upper bound on the mutual information
+- **Normalized mutual information (NMI)**, where $H(\cdot)$ denotes entropy:
+  $$NMI(C, T) = \sqrt{\dfrac{I(C, T)}{H(C)} \cdot \dfrac{I(C, T)}{H(T)}} = \dfrac{I(C, T)}{\sqrt{H(C) \cdot H(T)}}$$
+  - Value range of NMI: $[0, 1]$. Values close to 1 indicate a good clustering
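As a quick check on the formulas in this note, the sketch below computes NMI from the contingency table of two labelings using only NumPy. The `nmi` function and the toy labels are illustrative assumptions, not part of the committed file; the choice of log base cancels between $I(C, T)$ and the entropies, so NMI comes out the same in any base.

```python
import numpy as np

def nmi(C, T):
    """Normalized mutual information between two labelings C and T."""
    C, T = np.asarray(C), np.asarray(T)
    n = len(C)
    # Map labels to 0..r-1 and 0..k-1, then build contingency counts n_ij
    _, ci = np.unique(C, return_inverse=True)
    _, tj = np.unique(T, return_inverse=True)
    counts = np.zeros((ci.max() + 1, tj.max() + 1))
    np.add.at(counts, (ci, tj), 1)
    p_ij = counts / n            # joint probabilities p_ij = n_ij / n
    p_C = p_ij.sum(axis=1)       # marginals p_{C_i}
    p_T = p_ij.sum(axis=0)       # marginals p_{T_j}
    # I(C, T) = sum_ij p_ij * log(p_ij / (p_{C_i} * p_{T_j})), skipping empty cells
    nz = p_ij > 0
    I = np.sum(p_ij[nz] * np.log(p_ij[nz] / np.outer(p_C, p_T)[nz]))
    # Entropies H(C), H(T); the marginals are strictly positive by construction
    H_C = -np.sum(p_C * np.log(p_C))
    H_T = -np.sum(p_T * np.log(p_T))
    return I / np.sqrt(H_C * H_T)

# Toy example: the same grouping under permuted labels is a perfect clustering
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pred = [2, 2, 2, 0, 0, 0, 1, 1, 1]
print(nmi(pred, truth))  # ~1.0
```

On the toy labels this prints 1.0 up to floating point, the top of the $[0, 1]$ range, since a clustering that merely relabels the ground-truth groups shares all of its information with them. A degenerate labeling that puts everything in one cluster has zero entropy and leaves the ratio undefined in this sketch.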