From 4d53fa14ee0cd615444aca6f6ba176e0ccc1b5be Mon Sep 17 00:00:00 2001
From: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com>
Date: Mon, 29 Apr 2024 22:06:12 -0400
Subject: init

---
 .../Fundamentals of Data Mining/Content/SSE.md     | 23 ++++++++++++++++++++++
 1 file changed, 23 insertions(+)
 create mode 100644 SI/Resource/Fundamentals of Data Mining/Content/SSE.md

(limited to 'SI/Resource/Fundamentals of Data Mining/Content/SSE.md')
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/SSE.md b/SI/Resource/Fundamentals of Data Mining/Content/SSE.md
new file mode 100644
index 0000000..007fcb7
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/SSE.md	
@@ -0,0 +1,23 @@
+---
+id: SSE
+aliases:
+  - "Partitioning Algorithms: Basic Concepts"
+tags: []
+---
+
+## Partitioning Algorithms: Basic Concepts
+
+- <u>Partitioning method</u>: Discovering the groupings in the data by
+  optimizing a specific ==objective function== and ==iteratively== improving the
+  quality of partitions
+- _K-partitioning_ method: Partitioning a dataset _**D**_ of _**n**_ objects
+  into a set of _**K**_ clusters so that an objective function is optimized
+  (e.g., the sum of squared distances is minimized within each cluster, where
+  $c_k$ is the centroid or medoid of cluster $C_k$)
+  - A typical objective function: **Sum of Squared Errors (SSE)** $$ SSE(C) =
+    \sum_{k=1}^{K}\sum_{x_i \in C_k} \lVert x_i - c_k \rVert^2 $$
+- **Problem definition**: Given _K_, find a partition of _K clusters_ that
+  optimizes the chosen partitioning criterion
+  - Global optimum: Requires exhaustively enumerating all possible partitions
+  - Heuristic methods (i.e., greedy algorithms): _[[K-Means]], [[K-Medians]],
+    [[K-Medoids]], etc._
-- 
cgit v1.2.3