---
id: 2023-12-18
aliases:
  - December 18, 2023
tags:
  - link-note
  - Data-Science
  - Machine-Learning
  - Regularization
---
# Regularization

## Regularization Loss Function

- The number of model parameters **$\uparrow$** == the complexity of the model **$\uparrow$**
- The complexity of the model **$\uparrow$** == the risk of overfitting **$\uparrow$**
- Idea: define a model with high complexity, learn only the important parameters, and drive unnecessary parameter values to **0**

## Regularization Types

### Ridge Regression (L2 Regularization)

- $L = \bbox[orange,3px]{\sum_{i=1}^{n} \left(y_i - \left(\beta_0 + \sum_{j=1}^{D} \beta_j x_{ij}\right)\right)^{2}} + \bbox[blue,3px]{\lambda \sum_{j=1}^{D} \beta_j^2}$
	- $\bbox[orange,3px]{\text{MSE}}$
	- $\bbox[blue,3px]{\text{Ridge penalty}}$
- Large coefficients increase the penalty term, so they raise the total loss unless they also reduce the MSE
- $\lambda$ is a hyperparameter that controls the strength of regularization
- The regularization term is expressed as a sum of squares

### Lasso Regression (L1 Regularization)

- $L = \sum\limits_{i=1}^{n}\left(y_{i}- \left(\beta_{0}+ \sum\limits_{j=1}^{D} \beta_{j}x_{ij}\right)\right)^{2}+ \lambda \sum\limits_{j=1}^{D} |\beta_j|$
- Large coefficients increase the penalty term, so they raise the total loss unless they also reduce the MSE
- $\lambda$ is a hyperparameter that controls the strength of regularization
- The regularization term is expressed as a sum of absolute values

![[Pasted image 20231218032332.png]]

## Question

- $\lambda \uparrow$ == bias error $\uparrow$ and variance error $\downarrow$
- Sparsity: Ridge regression $<$ Lasso regression
- How to make more parameters take the value 0?
	1. Increase $\lambda$
	2. Decrease the exponent $q$ of the penalty $\lambda \sum_{j=1}^{D} |\beta_j|^{q}$ (e.g. from L2 toward L1)
- Is this good or bad? It depends; not known a priori
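The two losses above, and the sparsity difference in the Question section, can be sketched in code. This is a minimal illustration using NumPy and scikit-learn; the synthetic data, the number of features, and the `alpha` values are illustrative assumptions (scikit-learn's `alpha` plays the role of $\lambda$, though its internal scaling of the MSE term differs slightly from the formulas above).

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Synthetic data (illustrative): only the first 3 of 20 features matter.
rng = np.random.default_rng(0)
n, d = 200, 20
X = rng.normal(size=(n, d))
beta_true = np.zeros(d)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.normal(scale=0.5, size=n)

def ridge_loss(beta0, beta, lam):
    """L = sum of squared residuals + lambda * sum of squared coefficients (L2)."""
    residuals = y - (beta0 + X @ beta)
    return np.sum(residuals ** 2) + lam * np.sum(beta ** 2)

def lasso_loss(beta0, beta, lam):
    """L = sum of squared residuals + lambda * sum of absolute coefficients (L1)."""
    residuals = y - (beta0 + X @ beta)
    return np.sum(residuals ** 2) + lam * np.sum(np.abs(beta))

# Fit both regularized models (alpha values chosen for illustration).
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# The L1 penalty drives irrelevant coefficients exactly to 0;
# the L2 penalty only shrinks them toward 0.
print("Ridge zero coefficients:", int(np.sum(ridge.coef_ == 0)))
print("Lasso zero coefficients:", int(np.sum(lasso.coef_ == 0)))
```

Running this shows the sparsity ordering from the Question section: Lasso zeroes out most of the noise features, while Ridge keeps all coefficients nonzero but small.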