Diffstat (limited to 'SI/Resource/Fundamentals of Data Mining/Content/K-Means.md')
-rw-r--r-- SI/Resource/Fundamentals of Data Mining/Content/K-Means.md 23
1 files changed, 23 insertions, 0 deletions
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/K-Means.md b/SI/Resource/Fundamentals of Data Mining/Content/K-Means.md
new file mode 100644
index 0000000..d61d82b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/K-Means.md
@@ -0,0 +1,23 @@
+---
+id: K-Means
+aliases: []
+tags:
+ - Clustering-Algorithms
+ - Compare-and-Contrast
+---
+
+- K-Means [(YouTube)](https://www.youtube.com/watch?v=KzJORp8bgqs)
+ - Each cluster is represented by the center/centroid of the cluster
+- Given K, the number of clusters, the _K-Means_ clustering algorithm is
+ outlined as follows
+ - Select _**K**_ points as initial centroids
+ - **Repeat**
+ - Form _K_ clusters by assigning each point to its **closest** centroid
+ - Re-compute the centroid (i.e., _**mean point**_) of each cluster
+ - **Until** a convergence criterion is satisfied (**e.g., cluster membership
+ no longer changes, a maximum number of iterations is reached, or the [[SSE]]
+ falls below a predefined threshold**)
+- Different kinds of distance measures can be used
+ - [[Manhattan distance]] ($L_1$ norm), [[Euclidean distance]] ($L_2$ norm),
+ [[Cosine similarity]], [[Mahalanobis distance]]
+ ![[CleanShot 2023-10-24 at 15.34.07@2x.png]]
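
The loop outlined in the note can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the names `k_means`, `mean_point`, `sse`, `max_iters`, and `seed` are assumptions introduced here, initial centroids are simply sampled from the data, and only the "no change of cluster membership" and "maximum iterations" stopping rules are wired into the loop. The `distance` parameter shows where a different measure (for example Manhattan instead of Euclidean) could be plugged in.

```python
import math
import random


def euclidean(p, q):
    """L2 distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """L1 distance between two equal-length points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sse(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its centroid."""
    return sum(euclidean(p, centroids[c]) ** 2 for p, c in zip(points, labels))


def k_means(points, k, distance=euclidean, max_iters=100, seed=0):
    """Hypothetical helper: returns (centroids, labels) for a list of points."""
    rng = random.Random(seed)
    # Select K points as initial centroids (sampled without replacement).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iters):
        # Form K clusters by assigning each point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in points
        ]
        # Stop when no point changed cluster since the previous pass.
        if new_labels == labels:
            break
        labels = new_labels
        # Re-compute each centroid as the mean point of its cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return centroids, labels
```

A possible call, using two small, well-separated groups of 2-D points:

```python
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(pts, k=2)
print(centroids, labels, sse(pts, labels, centroids))
```

Passing `distance=manhattan` changes only the assignment step. The update step still uses the mean, which minimizes squared Euclidean error rather than $L_1$ error, so that combination is a heuristic in this sketch.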