---
id: K-Medians
aliases:
  - *K-Medians*: Handling Outliers by Computing Medians [(Youtube)]()
tags: []
---

## _K-Medians_: Handling Outliers by Computing Medians [(Youtube)]()

- Medians are less sensitive to outliers than means
  - Think of the median salary vs. mean salary of a large firm when adding a few
    top executives!
- _**K-Medians**_: Instead of taking the **mean** of the objects in a cluster
  as its reference point, the **median** is used (the $L_1$-norm, i.e.
  Manhattan distance, is often used as the distance measure)
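- The criterion function for the _K-Medians_ algorithm, where $x_{ij}$ is
  feature $j$ of point $x_i$ and $\text{med}_{kj}$ is the median of feature $j$
  in cluster $C_k$:
  $$ S = \sum_{k=1}^{K} \sum_{x_i \in C_k} \sum_{j} |x_{ij} - \text{med}_{kj}| $$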
- The _K-Medians_ clustering algorithm:
  - Select _K_ points as the initial representative objects (i.e., as initial _K
    medians_)
  - **Repeat**
    - Assign every point to its nearest median
    - Re-compute each cluster's median by taking the median of <u>==each
      individual feature==</u> over the points assigned to that cluster
  - **Until** convergence criterion is satisfied
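
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation. The names `k_medians`, `manhattan`, `cost`, `max_iter`, and `seed` are made up for this example. Three choices are assumptions rather than anything the notes specify: the initial medians are drawn at random from the data, the convergence criterion is "assignments stop changing", and a cluster that ends up empty keeps its old median.

```python
import random
from statistics import median


def manhattan(a, b):
    """L1 (Manhattan) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medians(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into `k` groups."""
    rng = random.Random(seed)
    # Select K points as the initial medians (assumed: a random sample).
    medians = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest median under the L1 distance.
        new_labels = [
            min(range(k), key=lambda c: manhattan(p, medians[c])) for p in points
        ]
        # Assumed convergence criterion: the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Re-compute each cluster's median one feature at a time.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous median
                medians[c] = tuple(median(col) for col in zip(*members))
    return medians, labels


def cost(points, medians, labels):
    """Criterion S: total L1 distance from each point to its cluster median."""
    return sum(manhattan(p, medians[lab]) for p, lab in zip(points, labels))


# Salary-style example: one extreme value barely moves the median.
salaries = [(30.0,), (35.0,), (40.0,), (45.0,), (1000.0,)]
medians, labels = k_medians(salaries, k=1)
print(medians)  # the single median is (40.0,); the mean would be 230.0
```

With `k > 1` the result depends on the initial medians, so different `seed` values can converge to different local optima of $S$.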