---
id: 2023-12-18
aliases: December 18, 2023
tags:
- link-note
- Data-Science
- Machine-Learning
- Regularization
---
# Regularization
## Regularization Loss Function
- The number of model parameters **$\uparrow$** == the complexity of the model **$\uparrow$**
- As the complexity of the model **$\uparrow$**, the risk of overfitting **$\uparrow$**
- Idea: define a model with high complexity, learn only the important parameters, and drive the unnecessary parameter values to **0**
## Regularization Types
### Ridge Regression (L2 Regularization)
- $L = \bbox[orange,3px] {\sum_{i=1}^{n} (y_i - (\beta_0 + \sum_{j=1}^{D} \beta_j x_{ij}))^{2}} + \bbox[blue,3px] {\lambda \sum_{j=1}^{D} \beta_j^2}$
- $\bbox[orange,3px]{\text{MSE}}$
- $\bbox[blue,3px]{\text{Ridge}}$
- Even when the MSE term is small, large coefficients make the penalty term large, so minimizing the total loss also forces the coefficients to shrink
- $\lambda$ is a hyperparameter that controls the strength of the regularization
- The regularization term is the sum of the squared coefficients (the squared L2 norm)
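The Ridge loss above has a closed-form minimizer, $\beta = (X^\top X + \lambda I)^{-1} X^\top y$. A minimal sketch on a small synthetic dataset (the data, seed, and $\lambda$ values are illustrative assumptions, not from the note):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
true_beta = np.array([2.0, -1.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=n)

def ridge_fit(X, y, lam):
    """Solve min_beta ||y - X beta||^2 + lam * ||beta||^2.
    Closed form: beta = (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

beta_small = ridge_fit(X, y, lam=0.01)   # weak regularization
beta_large = ridge_fit(X, y, lam=100.0)  # strong regularization

# A larger lambda shrinks the coefficient magnitudes toward 0
# (though Ridge never makes them exactly 0).
print(np.linalg.norm(beta_small), np.linalg.norm(beta_large))
```

Note that $\lambda I$ also makes $X^\top X + \lambda I$ invertible even when $X^\top X$ is singular, which is another practical benefit of Ridge.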
### Lasso Regression (L1 Regularization)
- $L = \sum\limits_{i=1}^{n}(y_{i}- (\beta_{0}+ \sum\limits_{j=1}^{D} \beta_{j}x_{ij}))^{2}+ \lambda \sum\limits_{j=1}^{D} |\beta_j|$
- Even when the MSE term is small, large coefficients make the penalty term large, so minimizing the total loss also forces the coefficients to shrink
- $\lambda$ is a hyperparameter that controls the strength of the regularization
- The regularization term is the sum of the absolute values of the coefficients (the L1 norm)
![[Pasted image 20231218032332.png]]
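Unlike Ridge, the Lasso loss has no closed form because $|\beta_j|$ is not differentiable at 0; one standard solver is proximal gradient descent (ISTA). A minimal sketch on synthetic data (the dataset, $\lambda$, and iteration count are illustrative assumptions):

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1: shrinks toward 0, exactly 0 inside [-t, t]
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """ISTA for min_beta ||y - X beta||^2 + lam * ||beta||_1."""
    beta = np.zeros(X.shape[1])
    # Step size 1/L, where L = 2 * sigma_max(X)^2 is the Lipschitz
    # constant of the gradient of the MSE term
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = -2.0 * X.T @ (y - X @ beta)  # gradient of the MSE term
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_beta = np.array([3.0, 0.0, 0.0, -2.0, 0.0])  # only 2 informative features
y = X @ true_beta + rng.normal(scale=0.1, size=100)

beta = lasso_ista(X, y, lam=5.0)
print(beta)  # for large enough lam, the irrelevant coefficients are driven to 0
```

The soft-threshold step is what produces exact zeros, which is why Lasso (unlike Ridge) performs feature selection.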
## Question
- $\lambda \uparrow$ == bias error $\uparrow$ and variance error $\downarrow$
- Sparsity: Ridge regression $<$ Lasso regression
- How to drive more parameter values to exactly 0?
	1. $\lambda \uparrow$
	2. Exponent of the penalty $\downarrow$ (e.g., an $L_p$ penalty with $p < 1$)
- Good or bad? Unclear in general; it depends on the problem
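One way to see the sparsity difference above: for an orthonormal design ($X^\top X = I$), both estimators have closed forms in terms of the OLS coefficients, so the contrast is a one-line computation (the coefficient values below are illustrative assumptions):

```python
import numpy as np

# Closed forms under an orthonormal design, loss = ||y - X beta||^2 + penalty:
#   Ridge (lam * sum beta_j^2):  beta_j = beta_ols_j / (1 + lam)            -> shrinks, never 0
#   Lasso (lam * sum |beta_j|):  beta_j = soft_threshold(beta_ols_j, lam/2) -> exact 0s
beta_ols = np.array([3.0, 0.4, -0.2, 1.5])
lam = 1.0
ridge = beta_ols / (1.0 + lam)
lasso = np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam / 2.0, 0.0)
print(ridge)  # every coefficient shrunk, but all nonzero
print(lasso)  # coefficients with |beta_ols| <= lam/2 become exactly 0
```

Raising $\lambda$ widens the zeroing interval $[-\lambda/2, \lambda/2]$, which is why increasing $\lambda$ produces more zero-valued parameters.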