init

author: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> 2024-04-29 22:06:12 -0400
committer: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> 2024-04-29 22:06:12 -0400
commit: 4d53fa14ee0cd615444aca6f6ba176e0ccc1b5be (patch)
tree: 4d9f0527d9e6db4f92736ead0aa9bb3f840a0f89 /SI/Resource/Fundamentals of Data Mining
96 files changed, 793 insertions, 0 deletions
diff --git a/SI/Resource/Fundamentals of Data Mining/Ch.3.md b/SI/Resource/Fundamentals of Data Mining/Ch.3.md
new file mode 100644
index 0000000..f456f74
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Ch.3.md
@@ -0,0 +1,10 @@
+---
+id: Ch.3
+aliases: []
+tags: []
+---
+
+[[convex]]: in the clustering [[concave]]: not in the clustering [[K-Means]]:
+can only detect clusters that are linearly separable
+
+- in higher dimension, it can increase the chance for having a line
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Apriori.md b/SI/Resource/Fundamentals of Data Mining/Content/Apriori.md
new file mode 100644
index 0000000..450a899
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Apriori.md
@@ -0,0 +1,10 @@
+---
+id: Apriori
+aliases: []
+tags: []
+---
+
+![[CleanShot 2023-10-25 at 16.08.52@2x.png]] ![[CleanShot 2023-10-25 at
+16.09.04@2x.png]] ![[CleanShot 2023-10-25 at 16.09.16@2x.png]] ![[CleanShot
+2023-10-25 at 16.10.06@2x.png]] ![[CleanShot 2023-10-25 at 16.10.19@2x.png]]
+![[CleanShot 2023-10-25 at 16.10.34@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Attributes of Mixed Type.md b/SI/Resource/Fundamentals of Data Mining/Content/Attributes of Mixed Type.md
new file mode 100644
index 0000000..4f3149d
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Attributes of Mixed Type.md
@@ -0,0 +1,7 @@
+---
+id: Attributes of Mixed Type
+aliases: []
+tags: []
+---
+
+![[CleanShot 2023-10-23 at 18.43.35@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Binary.md b/SI/Resource/Fundamentals of Data Mining/Content/Binary.md
new file mode 100644
index 0000000..996865c
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Binary.md
@@ -0,0 +1,12 @@
+---
+id: Binary
+aliases:
+  - Example: Dissimilarity between Asymmetric Binary Variables
+tags: []
+---
+
+![[CleanShot 2023-10-23 at 18.19.11@2x.png]]
+
+### Example: Dissimilarity between Asymmetric Binary Variables
+
+![[CleanShot 2023-10-23 at 18.19.54@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Categorical.md b/SI/Resource/Fundamentals of Data Mining/Content/Categorical.md
new file mode 100644
index 0000000..6d1ec43
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Categorical.md
@@ -0,0 +1,7 @@
+---
+id: Categorical
+aliases: []
+tags: []
+---
+
+[[Nominal]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Compare and Contrast.md b/SI/Resource/Fundamentals of Data Mining/Content/Compare and Contrast.md
new file mode 100644
index 0000000..d72dd09
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Compare and Contrast.md
@@ -0,0 +1,27 @@
+---
+id: Compare and Contrast
+aliases:
+  - clustering algorithms
+tags:
+  - Compare-and-Contrast
+---
+
+## [[clustering algorithms]]
+
+- [[K-Means]] vs [[K-Medoids]]
+  - In _K-means_ algorithm, they choose means as the centroids but in the
+    _K-medoids_, data points are chosen to be the medoids[^1].
+- [[K-Means]] vs [[K-Medians]]
+
+| K-Means                                                    | K-Medians                                     |
+| ---------------------------------------------------------- | --------------------------------------------- |
+| The center is not necessarily one of the input data points | Centers will be chosen from data points       |
+| Not flexible                                               | More flexible                                 |
+| Not immune to noise and outliers                           | More robust to noise and outliers             |
+| Minimize the sum of squared Euclidian distance             | Minimize a sum of pairwise of dissimilarities |
+
+[^1]:
+    Medoids are **representative objects of a data set or a cluster within a
+    data set whose sum of dissimilarities to all the objects in the cluster is
+    minimal**. Medoids are similar in concept to means or centroids, but medoids are
+    always restricted to be members of the data set.
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Complexity.md b/SI/Resource/Fundamentals of Data Mining/Content/Complexity.md
new file mode 100644
index 0000000..e73595a
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Complexity.md
@@ -0,0 +1,25 @@
+---
+id: Complexity
+aliases:
+  - Computational/Time Complexity
+tags: []
+---
+
+## Computational/Time Complexity
+
+- K-Medoids:
+  - PAM: $O(K(n - k)^2)$
+- Kernel K-Means:
+  - Computational complexity (time and space) is higher than K-Means
+    - Need to compute and store n x n kernel matrix generated from the kernel
+      function on the original data, where n is the number of points
+- Hierarchical Clustering:
+  - Agglomerative Clustering
+    - Time complexity: $O(n^2)$
+    - Algorithmic Complexity: $O(m^2logm)$
+- Density-based Clustering:
+  - DBSCAN:
+    - Computational complexity: $O(nlogn)$
+      - worst case: $O(n^2)$
+  - OPTICS:
+    - Complexity: $O(NlogN)$
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/DBSCAN.md b/SI/Resource/Fundamentals of Data Mining/Content/DBSCAN.md
new file mode 100644
index 0000000..00e737d
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/DBSCAN.md
@@ -0,0 +1,12 @@
+---
+id: DBSCAN
+aliases:
+  - DBSCAN: A Density-Based Spatial Clustering Algorithm
+tags: []
+---
+
+## DBSCAN: A Density-Based Spatial Clustering Algorithm
+
+![[CleanShot 2023-10-24 at 22.21.02@2x.png]] ![[CleanShot 2023-10-24 at
+22.21.23@2x.png]] ![[CleanShot 2023-10-24 at 22.21.37@2x.png]] ![[CleanShot
+2023-10-24 at 22.21.59@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Data Matrix and Dissimilarity Matrix.md b/SI/Resource/Fundamentals of Data Mining/Content/Data Matrix and Dissimilarity Matrix.md
new file mode 100644
index 0000000..7f299d1
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Data Matrix and Dissimilarity Matrix.md
@@ -0,0 +1,7 @@
+---
+id: Data Matrix and Dissimilarity Matrix
+aliases: []
+tags: []
+---
+
+![[CleanShot 2023-10-23 at 18.09.20@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Density-based Clustering.md b/SI/Resource/Fundamentals of Data Mining/Content/Density-based Clustering.md
new file mode 100644
index 0000000..d5e11fe
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Density-based Clustering.md
@@ -0,0 +1,21 @@
+---
+id: Density-based Clustering
+aliases:
+  - Density-Based Clustering Methods
+tags: []
+---
+
+## Density-Based Clustering Methods
+
+- Clustering based on density (a **local** cluster criterion), such as
+  density-connected points
+- Major features:
+  - Discover clusters of **arbitrary** shape
+  - Handle noise
+  - One scan (only examine the local region to justify density)
+  - Need density parameters as termination condition
+- Several interesting studies:
+  - <u>[[DBSCAN]]</u>: Ester, et al.
+  - <u>[[OPTICS]]</u>: Ankerst, et al.
+  - DENCLUE: Hinneburg & D. Keim
+  - CLIQUE: Agrawal, et al.
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Dissimilarity.md b/SI/Resource/Fundamentals of Data Mining/Content/Dissimilarity.md
new file mode 100644
index 0000000..be8891f
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Dissimilarity.md
@@ -0,0 +1,12 @@
+---
+id: Dissimilarity
+aliases: []
+tags: []
+---
+
+- Dissimilarity (or distance) measure
+  - [Numerical measure](app://obsidian.md/Numeric) of how different two data
+    objects are
+  - **In some sense, the inverse of similarity**: The lower, the more alike
+  - Minimum dissimilarity is often 0 (i.e., completely similar)
+  - Range [0, 1] or [0, ], depending on the definition
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Distance functions.md b/SI/Resource/Fundamentals of Data Mining/Content/Distance functions.md
new file mode 100644
index 0000000..0b1c658
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Distance functions.md
@@ -0,0 +1,24 @@
+---
+id: Distance functions
+aliases:
+  - Numeric
+tags: []
+---
+
+## Numeric
+
+### Minkowski distance
+
+![[CleanShot 2023-10-23 at 18.16.28@2x.png]]
+
+### Sepcial Cases of Minkowski Distance
+
+![[CleanShot 2023-10-23 at 18.17.24@2x.png]]
+
+- Manhattan (or city block) distance
+- Euclidean distance
+- "supremum" distance
+
+### Example: Special Cases
+
+![[CleanShot 2023-10-23 at 18.18.06@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Distance measures (mixted types of attributes).md b/SI/Resource/Fundamentals of Data Mining/Content/Distance measures (mixted types of attributes).md
new file mode 100644
index 0000000..d97338b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Distance measures (mixted types of attributes).md
@@ -0,0 +1,15 @@
+---
+id: Distance measures (mixted types of attributes)
+aliases:
+  - Distance measures (mixted types of attributes)
+tags: []
+---
+
+## Distance measures ([[mixted types of attributes]])
+
+- [[Dissimilarity]] (or distance) measure
+  - [[Distance functions|Numeric|Numerical measure]] of how different two data
+    objects are
+  - **In some sense, the inverse of similarity**: The lower, the more alike
+  - Minimum dissimilarity is often 0 (i.e., completely similar)
+  - Range [0, 1] or [0, $\infty$], depending on the definition
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/F-measure.md b/SI/Resource/Fundamentals of Data Mining/Content/F-measure.md
new file mode 100644
index 0000000..a467817
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/F-measure.md
@@ -0,0 +1,9 @@
+---
+id: F-measure
+aliases: []
+tags: []
+---
+
+- F-Measure ![[CleanShot 2023-10-25 at 15.57.51@2x.png]] ![[CleanShot 2023-10-25
+at 15.58.04@2x.png]] ![[CleanShot 2023-10-25 at 15.58.19@2x.png]] ![[CleanShot
+2023-10-25 at 15.58.40@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/FP-growth.md b/SI/Resource/Fundamentals of Data Mining/Content/FP-growth.md
new file mode 100644
index 0000000..f533719
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/FP-growth.md
@@ -0,0 +1,21 @@
+---
+id: FP-growth
+aliases: []
+tags: []
+---
+
+• You can expect to ‘draw’ the fp-tree using a text-based format as follows: A:3
+
+|
+
+B:3.   C:1
+
+|            |      \
+
+C:2      E:1   D:1
+
+![[CleanShot 2023-10-25 at 16.11.31@2x.png]] ![[CleanShot 2023-10-25 at
+16.11.42@2x.png]] ![[CleanShot 2023-10-25 at 16.11.56@2x.png]] ![[CleanShot
+2023-10-25 at 16.12.11@2x.png]] ![[CleanShot 2023-10-25 at 16.12.25@2x.png]]
+![[CleanShot 2023-10-25 at 22.44.57.png]]
+[Youtube](https://www.youtube.com/watch?v=GcgfSJAaBto)
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Hierarchical Clustering.md b/SI/Resource/Fundamentals of Data Mining/Content/Hierarchical Clustering.md
new file mode 100644
index 0000000..096fe2c
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Hierarchical Clustering.md
@@ -0,0 +1,54 @@
+---
+id: Hierarchical Clustering
+aliases:
+  - Hierarchical Clustering: Basic Concepts
+tags: []
+---
+
+## Hierarchical Clustering: Basic Concepts
+
+- Hierarchical clustering
+  - Generate a clustering hierarchy (drawn as a **dendrogram**)
+  - Not required to specify _K_, the number of clusters
+  - More deterministic
+  - No iterative refinement
+- Two categories of algorithms:
+  - **[[#Agglomerative Clustering Algorithm|Agglomerative]]**: Start with
+    sigleton clusters, continuously merge two clusters at a time to build a
+    **bottom-up** hierarchy of clusters
+  - **Divisive**: Start with a huge macro-cluster, split it continuously into
+    two groups, generating a **top-down** hierarchy of clusters
+
+## Dendrogram: Shows How Clusters are Merged
+
+![[CleanShot 2023-10-24 at 22.11.21@2x.png]]
+
+## Strengths of Hierarchical Clustering
+
+- Do not have to assume any particular number of clusters
+  - Any desired number of clusters can be obtained by 'cutting' the dendrogram
+    at the proper level
+- They may correspond to meaningful taxonomies
+  - Example in biological sciences (e.g., animal kingdom, phylogeny
+    reconstruction, ...)
+
+## Agglomerative Clustering Algorithm
+
+![[CleanShot 2023-10-24 at 22.14.30@2x.png]] ![[CleanShot 2023-10-24 at
+22.14.45@2x.png]] ![[CleanShot 2023-10-24 at 22.15.07@2x.png]] ![[CleanShot
+2023-10-24 at 22.15.23@2x.png]]
+
+## Extensions to Hierarchical Clustering
+
+- Major weaknesses of hierarchical clustering methods
+  - Can never undo what was done previously
+  - Do not scale well
+    - Time complexity of at least $O(n^2)$, where $n$ is the number of total
+      objects
+- Other hierarchical clustering algorithms
+  - BIRCH (1996): Use CF-tree and incrementally adjust the quality of
+    sub-clusters
+  - CURE (1998): Represent a cluster using a set of well-scattered
+    representative points
+  - CHAMELEON (1999): Use graph partitioning methods on the K-nearest neighbor
+    graph of the data
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/K-Means.md b/SI/Resource/Fundamentals of Data Mining/Content/K-Means.md
new file mode 100644
index 0000000..d61d82b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/K-Means.md
@@ -0,0 +1,23 @@
+---
+id: K-Means
+aliases: []
+tags:
+  - Clustering-Algorithms
+  - Compare-and-Contrast
+---
+
+- K-Means [(Youtube)](https://www.youtube.com/watch?v=KzJORp8bgqs)
+  - Each cluster is represented by the center/centroid of the cluster
+- Given K, the number of clusters, the _K-Means_ clustering algorithm is
+  outlined as follows
+  - Select _**K**_ points as initial centroids
+  - **Repeat**
+    - Form _K_ clusters by assigning each point to its **closest** centroid
+    - Re-compute the centroid (i.e., _**mean point**_) of each cluster
+  - **Until** convergence criterion is satisfied (**e.g., no change of cluster
+    membership, or a certain # of iterations have been reached, or, the [[SSE]]
+    is < a pre-defined threshold**)
+- Different kinds of distance measures can be used
+  - [[Manhattan distance]] ($L_1$ norm), [[Euclidean distance]] ($L_2$ norm),
+    [[Cosine similarity]], [[Mahalanobis distance]] ![[CleanShot 2023-10-24 at
+15.34.07@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md b/SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md
new file mode 100644
index 0000000..91614d3
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/K-Medians.md
@@ -0,0 +1,25 @@
+---
+id: K-Medians
+aliases:
+  - *K-Medians*: Handling Outliers by Computing Medians [(Youtube)]()
+tags: []
+---
+
+## _K-Medians_: Handling Outliers by Computing Medians [(Youtube)]()
+
+- Medians are less sensitive to outliers than means
+  - Think of the median salary vs. mean salary of a large firm when adding a few
+    top executives!
+- _**K-Medians**_: Instead of taking the **mean** value of the object in a
+  cluster as a reference point, **medians** are used ($L_1$-norm is often used
+  as the distance measure)
+- The criterion function for the _K-Medians_ algorithm: $$ S =
+  \sum*{k=1}^{K}\sum*{x*{i\in{C_k}}}|x*{ij} - m e d\_{kj}|$$
+- The _K-Medians_ clustering algorithm:
+  - Select _K_ points as the initial representative objects (i.e., as initial _K
+    medians_)
+  - **Repeat**
+    - Assign every point to its nearest median
+    - Re-compute the median using the median of <u>==each individual
+      feature==</u>
+  - **Until** convergence criterion is satisfied
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/K-Medoids.md b/SI/Resource/Fundamentals of Data Mining/Content/K-Medoids.md
new file mode 100644
index 0000000..6dc59fe
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/K-Medoids.md
@@ -0,0 +1,45 @@
+---
+id: K-Medoids
+aliases:
+  - Handling Outliers: From _K-Means_ to _K-Medoids_ (Youtube)(Youtube-E.g.)
+tags:
+  - Compare-and-Contrast
+  - ""
+  - Complexity
+---
+
+## Handling Outliers: From _K-Means_ to _K-Medoids_ [(Youtube)](https://www.youtube.com/watch?v=OFELCn-6r2o) [(Youtube-E.g.)](https://www.youtube.com/watch?v=ChBxx4aR-bY&t=0s)
+
+- K-Medoids: Instead of taking the **mean** value of the objects in a cluster as
+  a reference point, **medoids** can be used, which is the **most centrally
+  located** object in a cluster
+
+- The _K-Medoids_ clustering algorithm:
+  - Select _K_ points as the initial ==representative== objects (i.e., as
+    initial _K medoids_)
+  - **Repeat**
+    - Assigning each point to the cluster with the closest medoid
+    - Randomly select a ==non-representative== object $o_i$
+    - Compute ==the total cost _S_== of swapping the medoid $m$ with
+      $o_i$(_==e.g.,useSSEtomeasure==_)
+    - If $S<0$ (_==e.g., new SSE < previous SSE==_), then swap $m$ with $o_i$ to
+      form the new set of medoids
+  - **Until** convergence criterion is satisfied
+
+### Discussion on _K-Medoids_ Clustering
+
+- _K-Medoids_ Clustering: Find _representative_ objects (<u>medoids</u>) in
+  clusters
+- _PAM_ (Partitioning Around Medoids)
+  - Starts from an initial set of medoids, and
+  - Iteratively replaces one of the medoids by one of the non-mdedoids if it
+    improves the total sum of the squared errors (SSE) of the resulting
+    clustering
+  - _PAM_ works effectively for small data sets but does not scale well for
+    large data sets (due to the computational complexity)
+  - Computational [[Complexity]]: PAM: $O(K(n - k)^2)$ (quite expensive!)
+- Efficiency improvements on PAM
+  - _**CLARA**_ (Kaufmann & Rousseeuw, 1990):
+    - PAM on samples; $O(Ks^2 + K(n - K))$, $s$ is the sample size
+  - _**CLARANS**_ (Ng & Han, 1994): ==Randomised re-sampling==, ensuring
+    efficiency + quality
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/K-Modes.md b/SI/Resource/Fundamentals of Data Mining/Content/K-Modes.md
new file mode 100644
index 0000000..6dea96c
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/K-Modes.md
@@ -0,0 +1,27 @@
+---
+id: K-Modes
+aliases:
+  - K-Modes: Clustering Categorical Data (Youtube)
+tags: []
+---
+
+## K-Modes: Clustering Categorical Data [(Youtube)](https://www.youtube.com/watch?v=b39_vipRkUo)
+
+- _K-Means_ cannot directly handle non-numerical (categorical) data - ==how to
+  calculate the mean? What do they mean?==
+  - Mapping categorical value to 0/1 cannot generate quality clusters (in
+    high-dimensional space)
+- _**K-Modes**_: An extension to _K-Means_ by replacing means of clusters with
+  _**modes**_
+  - Mode: The value that appears the most often in a **set** of data values
+- <u>Dissimilarity</u> measure between object X and the center of a cluster
+  $Z_l$
+  - $\Phi(x_j, z_j) = 1 - n_j^{\dfrac{r}{n_l}}$ when $x_j = z_j = r$; 1 when
+    $x_j \ne z_j$
+    - where $z_j$ is the categorical value of attribute j in $Z_l$, $n_l$ is the
+      number of objects in cluster $l$, and $n_j^r$ is the number of objects
+      whose attribute value is r
+- This dissimilarity measure (distance function) is _**frequency-based**_
+- Algorithm is still based on iterative _object_ cluster assignment and
+  _centroid_ update
+- A mixture of categorical and numerical data: Using a _**K-Prototype**_ method
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/K-means++.md b/SI/Resource/Fundamentals of Data Mining/Content/K-means++.md
new file mode 100644
index 0000000..5cf1343
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/K-means++.md
@@ -0,0 +1,11 @@
+---
+id: K-means++
+aliases: []
+tags: []
+---
+
+- The first centroid is selected at random
+- The next centroid selected is the one that is the farthest from the currently
+  selected (selection is based on a weighted probability score)
+- The selection continues until _K_ centroids are obtained
+- [Youtube](https://www.youtube.com/watch?v=z2yncM2HE6M)
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Kernel K-Means.md b/SI/Resource/Fundamentals of Data Mining/Content/Kernel K-Means.md
new file mode 100644
index 0000000..828a053
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Kernel K-Means.md
@@ -0,0 +1,39 @@
+---
+id: Kernel K-Means
+aliases:
+  - *Kernel K-Means Clustering*
+tags: []
+---
+
+## _Kernel K-Means Clustering_
+
+- _Kernel K-Means_ can be used to detect non-convex clusters
+  - A region is **convex** if it contains all the line segments connecting any
+    pair of its points. Otherwise, it is **concave**
+  - _K-Means_ can only detect clusters that are **linearly** separable
+- <u>Idea</u>: Project data onto the high-dimensional kernel space, and then
+  perform _K-Means_ clustering
+  - Map data points in the input space onto a high-demensional feature space
+    using the kernel function ![[CleanShot 2023-10-24 at 21.42.19@2x.png]]
+  - Perform _K-Means_ on the mapped feature space
+- Computational complexity (time and space) is higher than K-Means
+  - Need to compute and store _n x n_ kernel matrix generated from the kernel
+    function on the original data, where _n_ is the number of points
+
+## Kernel Functions and Kernel K-Means Clustering
+
+- Typical kernel functions:
+  - Polynomial kernel of degree h: $K(X_i, X_j) = (X_i*X_j+1)^h$
+  - <u>Gaussian radial basis function (RBF) kernel</u>: $K(X_i, X_j) =
+    e^{-||X_i - X_j||^2 / 2\sigma^2}$
+  - Sigmoid kernel: $K(X_i, X_j) = tanh(KX_i*X_j - \delta)$
+- The formula for kernel matrix K for any two points $x_i, x_j \in C_k$ is
+  $K_{x_ix_j} = \phi(x_i)*\phi(x_j)$
+- The [[SSE]] criterion of _kernel K-means_: $$SSE(c) =
+  \sum_{k=1}^{K}\sum_{x_i\in{C_k}}||\phi(x_i) - c_k||^2$$
+  - The formula for the cluster centroid: $$c_k =
+    \dfrac{\sum_{x_i\in{C_k}}\phi(x_i)}{|C_k|}$$
+- Clustering can be performed without the actual individual projections
+  $\Phi(x_i)$ and $\Phi(x_j)$ for the data points $x_i, x_j \in{C_k}$ (use
+  K(Xi,Xj) instead) ![[CleanShot 2023-10-24 at 22.06.43@2x.png]] ![[CleanShot
+2023-10-24 at 22.07.05@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/NMI.md b/SI/Resource/Fundamentals of Data Mining/Content/NMI.md
new file mode 100644
index 0000000..d886599
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/NMI.md
@@ -0,0 +1,21 @@
+---
+id: NMI
+aliases:
+  - Normalized mutual information (NMI)
+tags: []
+---
+
+## Normalized mutual information (NMI)
+
+- Mutual information:
+  - Quantifies the amount of shared info between $I(C,T) =
+    \sum_{i=1}^{r}\sum{j=1}^{k}p_{ij}log\dfrac{p{ij}}{p_{c_i}p_{T_j}}$
+  - Measures the dependency between the observed joint probability $p_{ij}$ of
+    $C$ and $T$, and the expected joint probability $p_{Ci} * p_P{Tj}$ under the
+    independence assumption
+  - When $C$ and $T$ are independent, $p_{ij} = p_{Ci} * p_{Tj}, I(C, T) = 0$.
+    However, there is no upper bound on the mutual information
+- **Normalized mutual information (NMI)** $$N M I(C, T) =
+  \sqrt{\dfrac{I(C,T)}{H(C)}*\dfrac{I(C, T)}{H(T)}} = \dfrac{I(C,
+  T)}{\sqrt{H(C) * H(T)}}$$
+  - Value range of NMI: [0, 1]. Value close to 1 indicates a good clustering
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Nominal.md b/SI/Resource/Fundamentals of Data Mining/Content/Nominal.md
new file mode 100644
index 0000000..e4f447f
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Nominal.md
@@ -0,0 +1,9 @@
+---
+id: Nominal
+aliases: []
+tags: []
+---
+
+- Proximity Measure for Categorical Attributes ![[CleanShot 2023-10-23 at
+18.20.19@2x.png]]
+- [[Target encoding]] for Multi-Class Classification
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/OPTICS.md b/SI/Resource/Fundamentals of Data Mining/Content/OPTICS.md
new file mode 100644
index 0000000..3578883
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/OPTICS.md
@@ -0,0 +1,35 @@
+---
+id: OPTICS
+aliases:
+  - OPTICS: Ordering Points To Identify Clustering Structure
+tags: []
+---
+
+## OPTICS: Ordering Points To Identify Clustering Structure
+
+![[CleanShot 2023-10-24 at 22.22.49@2x.png]] ![[CleanShot 2023-10-24 at
+22.23.05@2x.png]]
+
+## OPTICS (cont.)
+
+- OPTICS does not explicitly produce a data set clustering.
+- It outputs a cluster ordering.
+  - It is a linear list of all objects under analysis and
+  - Represents the density-based clustering structure of the data.
+- Objects in a denser cluster are listed closer to each other in the cluster
+  ordering
+- Ordering is equivalent to density-based clustering obtained from a wide range
+  of parameter settings.
+- Thus OPTICS does not require the user to provide a specific density threshold.
+- The cluster ordering can be used to extract basic clustering information
+  (e.g., cluster centers, or arbitrary-shaped clusters), derive the intrinsic
+  clustering structure, as well as provide a visualization of the clustering.
+- It computes an ordering of all objects in a given database. And
+- It stores the core-distance and a suitable reachability-distance for **each**
+  object in the database.
+- OPTICS maintains a list called **OrderSeeds** to help generate the output
+  ordering.
+- Objects in **OrderSeeds**
+  - Are stored by the reachability-distance from their respective closet core
+    objects,
+  - That is, by the smallest reachability-distance of each object.
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Ordinal.md b/SI/Resource/Fundamentals of Data Mining/Content/Ordinal.md
new file mode 100644
index 0000000..e5bf256
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Ordinal.md
@@ -0,0 +1,7 @@
+---
+id: Ordinal
+aliases: []
+tags: []
+---
+
+![[CleanShot 2023-10-23 at 18.28.55@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/SSE.md b/SI/Resource/Fundamentals of Data Mining/Content/SSE.md
new file mode 100644
index 0000000..007fcb7
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/SSE.md
@@ -0,0 +1,23 @@
+---
+id: SSE
+aliases:
+  - Partitioning Algorithms: Basic Concepts
+tags: []
+---
+
+## Partitioning Algorithms: Basic Concepts
+
+- <u>Partitioning method</u>: Discovering the groupings in the data by
+  optimizing a specific ==objective function== and ==iteratively== improving the
+  quality of partitions
+- _K-partitioning_ method: Partitioning a dataset _**D**_ of _**n**_ objects
+  into a set of _**K**_ clusters so that an objective function is optimized
+  (e.g., the sum of squared distances is minimized within each cluster, where
+  $C_k$ is the centroid or medoid of cluster $C_k$)
+  - A typical objective function: **Sum of Squared Errors (SSE)** $$ SSE(C) =
+    \sum*{k=1}^{K}\sum*{x\_{i\in{C_k}}}||x_i - c_k||^2$$
+- **Problem definition**: Given _K_, find a partition of _K clusters_ that
+  optimizes the chosen partitioning criterion
+  - Global optimal: Needs to exhaustively enumerate all partitions
+  - Heuristic methods (i.e., greedy algorithms): _[[K-Means]], [[K-Medians]],
+    [[K-Medoids]], etc_
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Target encoding.md b/SI/Resource/Fundamentals of Data Mining/Content/Target encoding.md
new file mode 100644
index 0000000..0502447
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Target encoding.md
@@ -0,0 +1,8 @@
+---
+id: Target encoding
+aliases: []
+tags: []
+---
+
+![[CleanShot 2023-10-23 at 18.20.36@2x.png]] ![[CleanShot 2023-10-23 at
+18.22.00@2x.png]] ![[CleanShot 2023-10-23 at 18.22.12@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/Z-score.md b/SI/Resource/Fundamentals of Data Mining/Content/Z-score.md
new file mode 100644
index 0000000..34730a6
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/Z-score.md
@@ -0,0 +1,20 @@
+---
+id: Z-score
+aliases:
+  - Z-score - An example
+tags: []
+---
+
+### Z-score - An example
+
+- John gets a mark of 64 in a physics test, where the mean is 50 and the
+  standard deviation is 8.
+- Jane gets a mark of 74 in a chemistry test, where the mean is 58 and the
+  standard deviation is 10
+
+Who has a better class performance?
+
+- John's z = (64 - 50) / 8 = 1.75
+- Jane's z = (74 - 58) / 10 = 1.6
+- Although Jane's score is higher, John's score is further above the mean, and
+  it might be concluded that John has achieved greater success.
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/attributes.md b/SI/Resource/Fundamentals of Data Mining/Content/attributes.md
new file mode 100644
index 0000000..5b03e85
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/attributes.md
@@ -0,0 +1,14 @@
+---
+id: attributes
+aliases: []
+tags: []
+---
+
+- Attribute (or dimensions, features, variables)
+  - A data field, representing a characteristic of feature of a data object
+  - E.g., customer_ID, name, address
+- Types:
+  - [[Nominal]](e.g., red, blue)
+  - [[Binary]](e.g., {true, false})
+  - [[Ordinal]](e.g., {freshman, sophomore, junior, senior})
+  - [[Numeric]]; [[quantitative]] ([[discrete]] vs [[continuous]])
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/clustering algorithms.md b/SI/Resource/Fundamentals of Data Mining/Content/clustering algorithms.md
new file mode 100644
index 0000000..ad1b29b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/clustering algorithms.md
@@ -0,0 +1,21 @@
+---
+id: clustering algorithms
+aliases:
+  - Compare and Contrast
+tags:
+  - Clustering-Algorithms
+---
+
+## Compare and Contrast
+
+### _[[K-Means]]_
+
+### _[[K-means++]]_
+
+### _[[K-Medoids]]_
+
+### _[[K-Medians]]_
+
+### _[[K-Modes]]_
+
+### _[[Kernel K-Means]]_
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/clustering evaluation.md b/SI/Resource/Fundamentals of Data Mining/Content/clustering evaluation.md
new file mode 100644
index 0000000..2e175e9
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/clustering evaluation.md
@@ -0,0 +1,9 @@
+---
+id: clustering evaluation
+aliases: []
+tags: []
+---
+
+![[CleanShot 2023-10-25 at 16.48.09@2x.png]] ![[CleanShot 2023-10-25 at
+16.48.19@2x.png]] ![[CleanShot 2023-10-25 at 17.08.12@2x.png]] ![[CleanShot
+2023-10-25 at 17.08.29@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/external.md b/SI/Resource/Fundamentals of Data Mining/Content/external.md
new file mode 100644
index 0000000..91faabe
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/external.md
@@ -0,0 +1,42 @@
+---
+id: external
+aliases:
+  - Measuring Clustering Quality: ==External== Methods
+tags: []
+---
+
+## Measuring Clustering Quality: ==External== Methods
+
+- Given the **ground truth** _T, Q(C, T)_ is the **quality measure** for a
+  clustering C
+- _Q(C, T)_ is good if it satisfies the following **four** essential criteria
+  - **Cluster homogeneity**
+    - The purer, the better
+  - **Cluster completeness**
+    - Assign objects belonging to the same category in the ground truth to the
+      same cluster
+  - **Rag bag better than alien**
+    - Putting a heterogeneous object into a pure cluster should be penalized
+      **more** than putting it into a _rag bag_ (i.e., "miscellaneous" or
+      "other" category)
+  - **Small cluster preservation**
+    - Splitting a small category into pieces is more harmful than splitting a
+      large category into pieces
+
+## Commonly Used External Measures
+
+- **Matching-based measure**
+  - Purity, maximum matching, [[F-measure]]
+- **Entropy-Based Measures**
+  - Conditional entropy
+  - <u>Normalized mutual information (NMI)</u>
+  - Variation of information
+- **Pairwise measures**
+  - Four possibilities: True positive (TP), FN, FP, TN
+  - Jaccard coefficient, Rand statistic, Fowlkes-Mallow measure
+- **Correlation measures**
+  - Discretized Huber static, normalized discretized Huber static
+- Purity vs Maximum Matching ![[CleanShot 2023-10-25 at 15.57.30@2x.png]]
+- [[F-measure]] ![[CleanShot 2023-10-25 at 15.57.51@2x.png]] ![[CleanShot
+2023-10-25 at 15.58.04@2x.png]] ![[CleanShot 2023-10-25 at 15.58.19@2x.png]]
+  ![[CleanShot 2023-10-25 at 15.58.40@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/internal.md b/SI/Resource/Fundamentals of Data Mining/Content/internal.md
new file mode 100644
index 0000000..22834ec
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/internal.md
@@ -0,0 +1,8 @@
+---
+id: internal
+aliases: []
+tags: []
+---
+
+![[CleanShot 2023-10-25 at 15.59.49@2x.png]] ![[CleanShot 2023-10-25 at
+16.00.03@2x.png]]
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/mixted types of attributes.md b/SI/Resource/Fundamentals of Data Mining/Content/mixted types of attributes.md
new file mode 100644
index 0000000..8b8cb77
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/mixted types of attributes.md
@@ -0,0 +1,21 @@
+---
+id: mixted types of attributes
+aliases:
+  - Attributes of Mixed Types
+tags: []
+---
+
+### Attributes of Mixed Types
+
+- A dataset may contain all different types
+  - [[Nominal]], symmetric [[binary]], asymmetric [[binary]], [[Distance
+functions|numeric]], and [[ordinal]]
+- One may use a weighted formula to combine their effects: $$d(i, j) =
+  \dfrac{\Sigma_{f=1}^{p}w_{ij}^{(f)}d_{ij}^{(f)}}{\Sigma_{f=1}^{p}w_{ij}^{(f)}}$$
+  - if _f_ is numeric: use the **normalized distance (e.g., min-max distance
+    [0-1])**
+- If _f_ is binary or nominal: $d_{ij}^{(f)}=0$ if $x_{if} = x_{jf}$; or
+  $d_{ij}^{(f)} = 1$ otherwise (there are other options)
+- If _f_ is ordinal
+  - Compute ranks $z_{if}$ where $z_{if} = \dfrac{r_{if} - 1}{M_f - 1}$
+  - Treat $z_{if}$ as numeric
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md b/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md
new file mode 100644
index 0000000..f88fe80
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md
@@ -0,0 +1,27 @@
+---
+id: pattern discovery
+aliases:
+  - What is Pattern Discovery?
+tags: []
+---
+
+## What is Pattern Discovery?
+
+- ==What are patterns?==
+  - ==Patterns==: A set of items, subsequences, or substructures that occur
+    frequently together (or strongly correlated) in a data set
+  - Patterns represent ==intrinsic== and ==important properties== of datasets
+- ==Pattern discovery==: Uncovering patterns from massive data sets
+- Motivation examples:
+  - What products were often purchased together?
+  - What are the subsequent purchases after buying an iPad?
+  - What code segments likely contain copy-and-paste bugs?
+  - What word sequences likely form phrases in this corpus? ![[CleanShot
+2023-10-26 at 01.53.56@2x.png]] ![[CleanShot 2023-10-26 at 01.54.32@2x.png]]
+    ![[CleanShot 2023-10-26 at 01.54.44@2x.png]] ![[CleanShot 2023-10-26 at
+01.55.00@2x.png]]
+
+## Efficient Pattern Mining Methods
+
+- The [[Apriori]] Algorithm
+- [[FP-Growth]]: A Frequent Pattern-Growth Approach
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/variants.md b/SI/Resource/Fundamentals of Data Mining/Content/variants.md
new file mode 100644
index 0000000..767e696
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/variants.md
@@ -0,0 +1,16 @@
+---
+id: variants
+aliases:
+  - Variations of _K-Means_
+tags: []
+---
+
+# Variations of _K-Means_
+
+- There are many variants of the _K-Means_ method, varying in different aspects
+  - Choosing better initial centroid estimates
+    - _[[K-means++]]_, _Intelligent K-Means_, _Genetic K-Means_
+  - Choosing different representative prototypes for the clusters
+    - _[[K-Medoids]]_, _[[K-Medians]]_, _[[K-Modes]]_
+  - Applying feature transformation techniques
+    - _Weighted K-Means_, _[[Kernel K-Means]]_
diff --git a/SI/Resource/Fundamentals of Data Mining/Midterm - CS663.md b/SI/Resource/Fundamentals of Data Mining/Midterm - CS663.md
new file mode 100644
index 0000000..358ed89
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Midterm - CS663.md
@@ -0,0 +1,69 @@
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+id: Midterm - CS663
+aliases:
+  - Review
+tags: []
+------------------------------------------------------------------------------------------------------------
+# Review
+## Types of Questions
+- True or false
+- Multi-choice
+- Explain (e.g., [[K-Means]], [[NMI]])
+- [[Compare and Contrast]] (e.g., [[clustering algorithms]])
+- Computational questions (e.g., [[DBSCAN]] and [[OPTICS]] (similar to your
+  assignment questions), [[FP-growth|fp-tree]] and [[pattern discovery]]
+  (examples from the lecture), [[clustering evaluation]] )
+
+## Subjects
+- [[Distance measures (mixted types of attributes)]]
+  - How to handle [[nominal]] attributes …
+    - [[nominal|Match or no-match]](as a whole or individually)
+    - [[nominal|One-hot encoding]]
+    - [[Target encoding]]
+- Normalization ([[z-score]], [[mixted types of attributes|min-max]], …)
+- Clustering techniques:
+  - [[K-Means]] and its [[variants]]
+  - [[Hierarchical Clustering]] ([[Hierarchical Clustering|Agglomerative]])
+  - [[Density-based Clustering]]([[DBSCAN]], [[OPTICS]])
+  - [[Complexity]], [[distance functions]]
+- How to measure clustering quality ([[internal]] and [[external]] measures,
+  [[F-measure]] and its averaging/combining options when applied to multiple
+  classes/clusters)
+- Frequent pattern mining ([[Apriori]] Algorithm, [[FP-growth]])
+
+---
+
+############################################################################ [[Data Matrix and Dissimilarity Matrix]]
+
+- Data matrix
+  - A data matrix of n data points with / dimensions ![[CleanShot 2023-10-23 at
+17.37.59@2x.png]]
+- Dissimilarity (distance) matrix (n by n)
+  - n data points, but registers only the distance _d(i,j)_(typically
+    metric)![[CleanShot 2023-10-23 at 17.41.47@2x.png]]
+  - Usually symmetric, thus a trinagular matrix
+  - **[[Distance functions]]** are usually different for real, boolean,
+    categorical, ordinal, ratio, and vector variables
+  - Weights can be associated with different variables based on applications and
+    data semantics
+
+### Standardizing Numeric Data
+
+- [[Z-score]]: $z = \dfrac{x - \mu}{\sigma}$
+  - X: raw score to be standardized, $\mu$: mean of the population, $\sigma$:
+    standard deviation
+  - the distance between the raw score and the population mean in units of the
+    standard deviation
+  - negative when the raw score is below the mean, "+" when above
+- An alternative way: Calculate the mean absolute deviation $S_{f} =
+  \dfrac{1}{n}(|x_{1f} - m_f| + |x_{2f} - m_f| + ... + |x_{nf} - m_f|)$ where
+  $m_f = \dfrac{1}{n}(x_{1f} + x_{2f} + ... + x_{nf})$
+  - standardized measure (z-score): $z_{if} = \dfrac{x_{if} - m_f}{S_f}$
+- **Using mean absolute devication is more robust than using standard
+  deviation**
+
+### Proximity Measure for [[Binary|Binary Attributes]]
+
+############################################################################ Proximity Measure for [[nominal|Categorical Attributes]]
+
+############################################################################ [[Ordinal]] Variables
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 17.37.59@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 17.37.59@2x.png
new file mode 100644
index 0000000..3696a4c
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 17.37.59@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 17.41.47@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 17.41.47@2x.png
new file mode 100644
index 0000000..c3b4196
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 17.41.47@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.09.20@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.09.20@2x.png
new file mode 100644
index 0000000..69ae2b8
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.09.20@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.16.28@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.16.28@2x.png
new file mode 100644
index 0000000..3dc651a
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.16.28@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.17.24@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.17.24@2x.png
new file mode 100644
index 0000000..c7af4ad
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.17.24@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.18.06@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.18.06@2x.png
new file mode 100644
index 0000000..06fdfcf
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.18.06@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.19.11@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.19.11@2x.png
new file mode 100644
index 0000000..15cbe36
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.19.11@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.19.54@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.19.54@2x.png
new file mode 100644
index 0000000..50b7e7b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.19.54@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.20.19@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.20.19@2x.png
new file mode 100644
index 0000000..18e9111
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.20.19@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.20.36@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.20.36@2x.png
new file mode 100644
index 0000000..34a9fca
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.20.36@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.22.00@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.22.00@2x.png
new file mode 100644
index 0000000..11c66f2
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.22.00@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.22.12@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.22.12@2x.png
new file mode 100644
index 0000000..0cac265
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.22.12@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.28.55@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.28.55@2x.png
new file mode 100644
index 0000000..16a9b08
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.28.55@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.43.35@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.43.35@2x.png
new file mode 100644
index 0000000..1fff0f3
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-23 at 18.43.35@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 15.34.07@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 15.34.07@2x.png
new file mode 100644
index 0000000..1ac50c5
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 15.34.07@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 21.42.19@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 21.42.19@2x.png
new file mode 100644
index 0000000..88cdb81
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 21.42.19@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.06.43@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.06.43@2x.png
new file mode 100644
index 0000000..afe6414
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.06.43@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.07.05@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.07.05@2x.png
new file mode 100644
index 0000000..165004d
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.07.05@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.11.21@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.11.21@2x.png
new file mode 100644
index 0000000..c039b8b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.11.21@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.14.30@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.14.30@2x.png
new file mode 100644
index 0000000..94d41cf
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.14.30@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.14.45@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.14.45@2x.png
new file mode 100644
index 0000000..2b934c4
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.14.45@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.15.07@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.15.07@2x.png
new file mode 100644
index 0000000..df28ef3
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.15.07@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.15.23@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.15.23@2x.png
new file mode 100644
index 0000000..396166d
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.15.23@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.02@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.02@2x.png
new file mode 100644
index 0000000..f7a8993
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.02@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.23@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.23@2x.png
new file mode 100644
index 0000000..33c153c
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.23@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.37@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.37@2x.png
new file mode 100644
index 0000000..65ccece
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.37@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.59@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.59@2x.png
new file mode 100644
index 0000000..f7b18fa
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.21.59@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.22.49@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.22.49@2x.png
new file mode 100644
index 0000000..9a8c1e5
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.22.49@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.23.05@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.23.05@2x.png
new file mode 100644
index 0000000..7b968ff
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-24 at 22.23.05@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.57.30@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.57.30@2x.png
new file mode 100644
index 0000000..688beff
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.57.30@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.57.51@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.57.51@2x.png
new file mode 100644
index 0000000..773775d
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.57.51@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.04@2x 1.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.04@2x 1.png
new file mode 100644
index 0000000..c1387ec
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.04@2x 1.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.04@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.04@2x.png
new file mode 100644
index 0000000..c1387ec
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.04@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.19@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.19@2x.png
new file mode 100644
index 0000000..9310051
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.19@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.40@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.40@2x.png
new file mode 100644
index 0000000..89ca276
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.58.40@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.59.49@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.59.49@2x.png
new file mode 100644
index 0000000..5c66d6d
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 15.59.49@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.00.03@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.00.03@2x.png
new file mode 100644
index 0000000..f86fd7e
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.00.03@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.08.52@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.08.52@2x.png
new file mode 100644
index 0000000..7a80a70
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.08.52@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.09.04@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.09.04@2x.png
new file mode 100644
index 0000000..ebce74a
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.09.04@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.09.16@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.09.16@2x.png
new file mode 100644
index 0000000..ee8f4e7
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.09.16@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.06@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.06@2x.png
new file mode 100644
index 0000000..3d22627
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.06@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.19@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.19@2x.png
new file mode 100644
index 0000000..f75a7dc
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.19@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.34@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.34@2x.png
new file mode 100644
index 0000000..17df89c
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.10.34@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.31@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.31@2x.png
new file mode 100644
index 0000000..4f5411b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.31@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.42@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.42@2x.png
new file mode 100644
index 0000000..18c4973
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.42@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.56@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.56@2x.png
new file mode 100644
index 0000000..bc850a5
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.11.56@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.12.11@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.12.11@2x.png
new file mode 100644
index 0000000..0b8801a
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.12.11@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.12.25@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.12.25@2x.png
new file mode 100644
index 0000000..98bd7e1
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.12.25@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.48.09@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.48.09@2x.png
new file mode 100644
index 0000000..56fe16c
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.48.09@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.48.19@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.48.19@2x.png
new file mode 100644
index 0000000..156f250
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 16.48.19@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.12@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.12@2x.png
new file mode 100644
index 0000000..043a4e2
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.12@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.29@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.29@2x.png
new file mode 100644
index 0000000..ebe9790
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.29@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.47@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.47@2x.png
new file mode 100644
index 0000000..6eeb3e8
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 17.08.47@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 22.44.57.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 22.44.57.png
new file mode 100644
index 0000000..23c94d4
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-25 at 22.44.57.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.53.56@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.53.56@2x.png
new file mode 100644
index 0000000..ffe51dc
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.53.56@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.54.32@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.54.32@2x.png
new file mode 100644
index 0000000..c123219
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.54.32@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.54.44@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.54.44@2x.png
new file mode 100644
index 0000000..222267b
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.54.44@2x.png
diff --git a/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.55.00@2x.png b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.55.00@2x.png
new file mode 100644
index 0000000..b694ba0
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Screenshots/CleanShot 2023-10-26 at 01.55.00@2x.png
author	TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com>	2024-04-29 22:06:12 -0400
committer	TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com>	2024-04-29 22:06:12 -0400
commit	4d53fa14ee0cd615444aca6f6ba176e0ccc1b5be (patch)
tree	4d9f0527d9e6db4f92736ead0aa9bb3f840a0f89 /SI/Resource/Fundamentals of Data Mining