init

author: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> 2024-04-29 22:06:12 -0400
committer: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> 2024-04-29 22:06:12 -0400
commit: 4d53fa14ee0cd615444aca6f6ba176e0ccc1b5be (patch)
tree: 4d9f0527d9e6db4f92736ead0aa9bb3f840a0f89 /SI/Resource/Fundamentals of Data Mining/Content/SSE.md
1 files changed, 23 insertions, 0 deletions
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/SSE.md b/SI/Resource/Fundamentals of Data Mining/Content/SSE.md
new file mode 100644
index 0000000..007fcb7
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/SSE.md
@@ -0,0 +1,23 @@
+---
+id: SSE
+aliases:
+  - "Partitioning Algorithms: Basic Concepts"
+tags: []
+---
+
+## Partitioning Algorithms: Basic Concepts
+
+- <u>Partitioning method</u>: Discovering the groupings in the data by
+  optimizing a specific ==objective function== and ==iteratively== improving the
+  quality of partitions
+- _K-partitioning_ method: Partitioning a dataset _**D**_ of _**n**_ objects
+  into a set of _**K**_ clusters so that an objective function is optimized
+  (e.g., the sum of squared distances is minimized within each cluster, where
+  $c_k$ is the centroid or medoid of cluster $C_k$)
+  - A typical objective function: **Sum of Squared Errors (SSE)** $$ SSE(C) =
+    \sum_{k=1}^{K}\sum_{x_i \in C_k} \|x_i - c_k\|^2 $$
+- **Problem definition**: Given _K_, find a partition of _K clusters_ that
+  optimizes the chosen partitioning criterion
+  - Global optimum: requires exhaustively enumerating all possible partitions
+  - Heuristic methods (i.e., greedy algorithms): _[[K-Means]], [[K-Medians]],
+    [[K-Medoids]], etc._
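
Note on the SSE definition added in `SSE.md`: the objective can be illustrated with a minimal Python sketch. It is not part of the patch. The helper names `centroid` and `sse` and the toy points are assumptions made up for illustration, and each $c_k$ is taken to be the cluster mean (the centroid case, not the medoid case).

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def sse(clusters):
    """Sum over clusters of squared Euclidean distances to the cluster centroid."""
    total = 0.0
    for members in clusters:
        c = centroid(members)
        for x in members:
            total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return total


# Toy 2-D data (hypothetical): the same four points partitioned two ways, K = 2.
good = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
bad = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

print(sse(good))  # 4.0
print(sse(bad))   # 100.0
```

The partition that groups nearby points has the lower SSE, which is the sense in which a K-partitioning method "optimizes" this objective.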