From 4d53fa14ee0cd615444aca6f6ba176e0ccc1b5be Mon Sep 17 00:00:00 2001
From: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com>
Date: Mon, 29 Apr 2024 22:06:12 -0400
Subject: init

---
 .../Content/K-Medians.md | 25 ++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md

(limited to 'SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md')

diff --git a/SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md b/SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md
new file mode 100644
index 0000000..91614d3
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md
@@ -0,0 +1,25 @@
+---
+id: K-Medians
+aliases:
+  - *K-Medians*: Handling Outliers by Computing Medians [(Youtube)]()
+tags: []
+---
+
+## _K-Medians_: Handling Outliers by Computing Medians [(Youtube)]()
+
+- Medians are less sensitive to outliers than means
+  - Think of the median salary vs. mean salary of a large firm when adding a few
+    top executives!
+- _**K-Medians**_: Instead of taking the **mean** value of the objects in a
+  cluster as a reference point, **medians** are used ($L_1$-norm is often used
+  as the distance measure)
+- The criterion function for the _K-Medians_ algorithm: $$ S =
+  \sum_{k=1}^{K}\sum_{x_i \in C_k}|x_{ij} - \mathit{med}_{kj}| $$
+- The _K-Medians_ clustering algorithm:
+  - Select _K_ points as the initial representative objects (i.e., as initial _K
+    medians_)
+  - **Repeat**
+    - Assign every point to its nearest median
+    - Re-compute the median of each cluster using the median of ==each
+      individual feature==
+  - **Until** convergence criterion is satisfied
-- 
cgit v1.2.3
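
The K-Medians loop described in the note above can be sketched in Python with NumPy. This is a minimal illustration, not the note author's implementation: the function name `k_medians`, its parameters, the random choice of initial medians, and the "medians stop changing" convergence test are all assumptions made for the example. The returned cost also sums the absolute differences over every feature $j$, which is one reading of the criterion $S$.

```python
import numpy as np


def k_medians(X, k, max_iter=100, seed=0):
    """Cluster the rows of X into k groups using L1 distance and per-feature medians."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Assumption: the initial medians are k distinct points sampled from the data.
    medians = X[rng.choice(len(X), size=k, replace=False)]

    for _ in range(max_iter):
        # Assign every point to its nearest median under the L1 (Manhattan) distance.
        dists = np.abs(X[:, None, :] - medians[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)

        # Re-compute each median feature by feature; an empty cluster keeps its old median.
        new_medians = medians.copy()
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:
                new_medians[c] = np.median(members, axis=0)

        # Assumption: convergence means the medians no longer change.
        converged = np.allclose(new_medians, medians)
        medians = new_medians
        if converged:
            break

    # Final assignment and criterion value for the returned medians.
    dists = np.abs(X[:, None, :] - medians[None, :, :]).sum(axis=2)
    labels = dists.argmin(axis=1)
    cost = dists.min(axis=1).sum()
    return medians, labels, cost
```

With `k=1` on the points `[[1, 2], [2, 3], [100, 4]]`, the single representative is the per-feature median `[2, 3]`, whereas the mean of the first feature would be pulled to roughly 34.3 by the outlier at 100. That is the robustness the salary example in the note refers to.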