---
id: "2023-12-17"
aliases:
  - December 17, 2023
  - Machine Learning
tags:
  - Machine-Learning
  - Data-Science
---

# Machine Learning

Machine Learning (ML) is a subset of artificial intelligence (AI) that focuses on the development of systems and algorithms that can learn from and make decisions or predictions based on data. The core idea is to enable a machine to make intelligent decisions or predictions without being explicitly programmed to perform the task.

## Machine

The machine is a model, that is, a function derived from data supplied by humans.

## Learning

Learning means finding the model that best represents the data, which amounts to optimizing the model's parameters.

- Parameter optimization: done with a statistical method or with [[Gradient descent | gradient descent]]
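
As a rough illustration of parameter optimization by gradient descent, the sketch below fits a line $\hat{y} = w x + b$ to a few points. The function name `fit_line_gradient_descent`, the learning rate, the step count, and the toy data are all assumptions made for this example, not part of the note.

```python
def fit_line_gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w * x + b by minimizing mean squared error with gradient descent.

    Hypothetical helper for illustration; lr and steps are arbitrary choices.
    """
    w, b = 0.0, 0.0  # initial guesses for the learnable parameters
    n = len(xs)
    for _ in range(steps):
        # Error of the current predictions against the actual values
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the loss
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for the example)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gradient_descent(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

With this data, the learned `w` and `b` should approach the slope and intercept used to generate the points. A learning rate that is too large would make the updates diverge instead.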

## Goal

[[Optimization]]: Find optimal parameters

- Optimal: the best representation of the data
  - The model with the smallest difference between predictions $\hat{y}$ and actual values $y$
  - The model parameters that produce the smallest loss
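
One common way to measure the difference between $\hat{y}$ and $y$ is the mean squared error. The sketch below assumes that loss, and it assumes a small hand-picked set of candidate parameters for a line $\hat{y} = w x + b$. It then picks the candidate with the smallest loss. The names `mean_squared_error`, `predict`, and `candidates` are illustrative only.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual values and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def predict(w, b, xs):
    """Predictions of the line y_hat = w * x + b for each x."""
    return [w * x + b for x in xs]


# Toy data (assumed for the example): generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Hypothetical candidate (w, b) pairs to compare
candidates = [(1.0, 0.0), (2.0, 1.0), (3.0, -1.0)]

# The optimal candidate is the one whose predictions give the smallest loss
best = min(candidates, key=lambda p: mean_squared_error(ys, predict(p[0], p[1], xs)))
print(best)
```

In practice the parameters are searched continuously rather than chosen from a short list, but the criterion is the same: smaller loss means a better representation of the data.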

## Supervised Learning

1. [[Regression]]
   - [[Linear Regression]] and [[Nonlinear Regression]]
   - [[Gradient descent]]
   - [[Bias and Variance]] Trade-off
2. [[Classification]]
   - [[Sigmoid]]
   - [[Logistic Regression]] and [[Softmax Regression]]
   - [[Support Vector Machine]] ([[Support Vector Machine |SVM]])
   - [[Decision Tree]]
   - [[Linear Discriminant Analysis]] ([[Linear Discriminant Analysis |LDA]])
3. [[Ensemble]]
   - [[Bagging]]
   - [[Boosting]]

## Unsupervised Learning

1. [[Preprocessing]]
   - [[Principal Component Analysis]] ([[Principal Component Analysis |PCA]])
   - [[Singular Value Decomposition]] ([[Singular Value Decomposition |SVD]])
2. [[Clustering]]
   - [[K-Means]]
   - [[Mean Shift]]
   - [[Gaussian Mixture Model]]
   - [[DBSCAN]]

## Notations

- Data Properties
  - Features (= attributes, independent variables): $X$
    - Characteristics of the data or items
    - $N$: number of data samples
    - $D$: number of features
  - Label (= dependent variable): $y$
    - If the data has labels, learning is supervised; otherwise, it is unsupervised
  - Parameter (= weight): a value the model learns from the data, not one given as input
  - [[Hyperparameter]]: a setting that a human has to choose rather than one learned from the data
- Input vs. Output
  - Input ($X$): the feature values that are fed into a model
  - Output ($\hat{y}$): the predicted values produced by the model
- Linear vs. Nonlinear
  - Linear regression: a model that can be written as a function that is linear in its parameters
    - Simple Linear Regression: involves two variables, one independent and one dependent. The relationship between them is modeled as a straight line.
    - Multiple Linear Regression: uses more than one independent variable to predict a dependent variable. The model is still linear because each term is a weight multiplied by an input.
    - ex) $y = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_D x_D$, $y = w_0 + w_1 x + w_2 x^2$
  - Nonlinear regression: a model that cannot be written as a linear function
    - ex) $\log(y) = w_0 + w_1 \log(x)$ (nonlinear in the original variables, since $y = e^{w_0} x^{w_1}$, although it is linear after the log transform), $y = \max(x, 0)$
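
To make the notation concrete, the sketch below computes $\hat{y}$ for a small feature matrix $X$ with $N = 3$ samples and $D = 2$ features. It then reuses the same function for the quadratic example by treating $x$ and $x^2$ as two features, which is why that model still counts as linear. The helper name `predict_linear` and all numeric values are assumptions for illustration.

```python
def predict_linear(X, weights, bias):
    """Return y_hat for each row of X: bias + sum_j weights[j] * row[j]."""
    return [bias + sum(w * x_j for w, x_j in zip(weights, row)) for row in X]


# Feature matrix X: each row is one sample, each column is one feature
X = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
N, D = len(X), len(X[0])  # N = number of samples, D = number of features

weights = [0.5, -1.0]  # w_1, w_2 (made-up parameter values)
bias = 2.0             # w_0
y_hat = predict_linear(X, weights, bias)
print(N, D, y_hat)

# y = w_0 + w_1 * x + w_2 * x^2 is still linear in the weights:
# build the features [x, x^2] and reuse the same linear predictor.
xs = [0.0, 1.0, 2.0]
X_poly = [[x, x ** 2] for x in xs]
y_poly = predict_linear(X_poly, [1.0, 3.0], 2.0)
print(y_poly)
```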

## Basic Math for ML

- Function
  - A rule that relates two sets by assigning an output to each input
  - $y = f(x)$, where $x$ is the input and $y$ is the output
- Linear function
  - $y = ax + b$, $(a \ne 0)$
  - $a$: coefficient (slope), $b$: intercept
- Instantaneous Rate of Change
  - is **the rate of change at a particular instant**, and it is the same as the value of the derivative at that specific point.
  - Geometrically, it is the slope of the tangent line to the graph at $x = a$
- Differentiation
  - finds the instantaneous rate of change
  - $f'(x)$ or $\dfrac{d}{dx}f(x)$
- Minimum of a function
  - **The instantaneous rate of change at a minimum of a differentiable function is always _0_.**
  - Using this property, the optimal parameter value can be found by setting the derivative of the loss to 0 and solving for the parameter
- Exponential
  - $e = \lim_{n \to \infty} \left(1 + \dfrac{1}{n}\right)^n$
  - The exponential function $e^x$ grows at a rate proportional to its current value
  - $\dfrac{d}{dx} e^x = e^x$
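
The properties above can be checked numerically. The sketch below approximates the derivative with a central difference, which is an assumed technique not taken from the note. It evaluates the derivative around the minimum of a made-up loss $(w - 3)^2 + 1$ and then compares the derivative of $e^x$ with $e^x$ itself. The names `derivative` and `loss` and the step size `h` are illustrative.

```python
import math


def derivative(f, x, h=1e-5):
    """Approximate the instantaneous rate of change of f at x (central difference)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def loss(w):
    """Hypothetical loss with its minimum at w = 3."""
    return (w - 3.0) ** 2 + 1.0


slope_left = derivative(loss, 2.0)    # negative: loss is decreasing here
slope_at_min = derivative(loss, 3.0)  # approximately 0 at the minimum
slope_right = derivative(loss, 4.0)   # positive: loss is increasing here
print(slope_left, slope_at_min, slope_right)

# d/dx e^x = e^x: the slope of exp at x = 1 is approximately e
exp_slope = derivative(math.exp, 1.0)

# e as the limit of (1 + 1/n)^n, approximated with a large n
n = 1_000_000
e_approx = (1 + 1 / n) ** n
print(exp_slope, e_approx, math.e)
```

The sign change of the slope from negative to positive around $w = 3$ is what gradient-based optimization relies on to move toward the minimum.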