SI/Resource/Fundamentals of Data Mining/Content/Hierarchical Clustering.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54

---
id: Hierarchical Clustering
aliases:
  - Hierarchical Clustering: Basic Concepts
tags: []
---

## Hierarchical Clustering: Basic Concepts

- Hierarchical clustering
  - Generate a clustering hierarchy (drawn as a **dendrogram**)
  - Not required to specify _K_, the number of clusters
  - More deterministic
  - No iterative refinement
- Two categories of algorithms:
  - **[[#Agglomerative Clustering Algorithm|Agglomerative]]**: Start with
    sigleton clusters, continuously merge two clusters at a time to build a
    **bottom-up** hierarchy of clusters
  - **Divisive**: Start with a huge macro-cluster, split it continuously into
    two groups, generating a **top-down** hierarchy of clusters

## Dendrogram: Shows How Clusters are Merged

![[CleanShot 2023-10-24 at 22.11.21@2x.png]]

## Strengths of Hierarchical Clustering

- Do not have to assume any particular number of clusters
  - Any desired number of clusters can be obtained by 'cutting' the dendrogram
    at the proper level
- They may correspond to meaningful taxonomies
  - Example in biological sciences (e.g., animal kingdom, phylogeny
    reconstruction, ...)

## Agglomerative Clustering Algorithm

![[CleanShot 2023-10-24 at 22.14.30@2x.png]] ![[CleanShot 2023-10-24 at
22.14.45@2x.png]] ![[CleanShot 2023-10-24 at 22.15.07@2x.png]] ![[CleanShot
2023-10-24 at 22.15.23@2x.png]]

## Extensions to Hierarchical Clustering

- Major weaknesses of hierarchical clustering methods
  - Can never undo what was done previously
  - Do not scale well
    - Time complexity of at least $O(n^2)$, where $n$ is the number of total
      objects
- Other hierarchical clustering algorithms
  - BIRCH (1996): Use CF-tree and incrementally adjust the quality of
    sub-clusters
  - CURE (1998): Represent a cluster using a set of well-scattered
    representative points
  - CHAMELEON (1999): Use graph partitioning methods on the K-nearest neighbor
    graph of the data