summaryrefslogtreecommitdiff
path: root/SI/Resource/Fundamentals of Data Mining/Content/SSE.md
blob: 007fcb7501d24c994434098682b6b6f8ecc9876d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
---
id: SSE
aliases:
  - Partitioning Algorithms: Basic Concepts
tags: []
---

## Partitioning Algorithms: Basic Concepts

- <u>Partitioning method</u>: Discovering the groupings in the data by
  optimizing a specific ==objective function== and ==iteratively== improving the
  quality of partitions
- _K-partitioning_ method: Partitioning a dataset _**D**_ of _**n**_ objects
  into a set of _**K**_ clusters so that an objective function is optimized
  (e.g., the sum of squared distances is minimized within each cluster, where
  $C_k$ is the centroid or medoid of cluster $C_k$)
  - A typical objective function: **Sum of Squared Errors (SSE)** $$ SSE(C) =
    \sum*{k=1}^{K}\sum*{x\_{i\in{C_k}}}||x_i - c_k||^2$$
- **Problem definition**: Given _K_, find a partition of _K clusters_ that
  optimizes the chosen partitioning criterion
  - Global optimal: Needs to exhaustively enumerate all partitions
  - Heuristic methods (i.e., greedy algorithms): _[[K-Means]], [[K-Medians]],
    [[K-Medoids]], etc_