Diffstat (limited to 'SI/Resource/Data Science')
| -rw-r--r-- | SI/Resource/Data Science/Machine Learning/Contents/Bias and Variance.md | 6 |
| -rw-r--r-- | SI/Resource/Data Science/Machine Learning/Machine Learning.md | 24 |
2 files changed, 15 insertions, 15 deletions
diff --git a/SI/Resource/Data Science/Machine Learning/Contents/Bias and Variance.md b/SI/Resource/Data Science/Machine Learning/Contents/Bias and Variance.md
index 294f138..a3e6398 100644
--- a/SI/Resource/Data Science/Machine Learning/Contents/Bias and Variance.md
+++ b/SI/Resource/Data Science/Machine Learning/Contents/Bias and Variance.md
@@ -34,9 +34,9 @@ tags:

 - Solution
 - Use validation data set
- - $\bbox[teal,5px,border:2px solid red]{\text{Train data (80\%)+ Valid data (10\%) + Test data (10\%)}}$
- - Cannot directly participate in model training
- - Continuously evaluates in the learning base, and stores the best existing performance
+ - $\bbox[teal,5px,border:2px solid red]{\text{Train data (80\%)+ Valid data (10\%) + Test data (10\%)}}$
+ - Cannot directly participate in model training
+ - Continuously evaluates in the learning base, and stores the best existing performance
 - K-fold cross validation
 - **Leave-One-Out Cross-Validation (LOOCV)**
 - a special case of k-fold cross-validation where **K** is equal to the number of data points in the dataset.
diff --git a/SI/Resource/Data Science/Machine Learning/Machine Learning.md b/SI/Resource/Data Science/Machine Learning/Machine Learning.md
index 6dbb5e8..3ee6924 100644
--- a/SI/Resource/Data Science/Machine Learning/Machine Learning.md
+++ b/SI/Resource/Data Science/Machine Learning/Machine Learning.md
@@ -32,7 +32,7 @@ Learning is to find the best model represented data, meaning optimization of par
 - A model with the smallest difference between predictions $\hat{y}$ and actual values $y$
 - A model parameter makes the smallest loss

-## Types of learning
+## Types of Learning

 ### Supervised Learning

@@ -46,9 +46,9 @@ Learning is to find the best model represented data, meaning optimization of par
 - [[Support Vector Machine]] ([[Support Vector Machine |SVM]])
 - [[Decision Tree]]
 - [[Linear Discriminant Analysis]] ([[Linear Discriminant Analysis |LDA]])
- 1. [[Ensemble]]
- [[Bagging]]
- [[Boosting]]
+ 1. [[Ensemble]]
+ - [[Bagging]]
+ - [[Boosting]]

 ### Unsupervised Learning

@@ -65,11 +65,11 @@ Learning is to find the best model represented data, meaning optimization of par

 - Data Properties
 - Features (= attributes, independent variables): X
- - characteristics of data or items
- - N: # of data sample
- - D: # of features
+ - characteristics of data or items
+ - N: # of data sample
+ - D: # of features
 - Label (dependent variables): y
- - if there is a label, it is supervised. Otherwise, it is unsupervised
+ - if there is a label, it is supervised. Otherwise, it is unsupervised
 - Parameter (=weight): learnable parameters that a model have, not given data
 - [[Hyperparameter]]: parameters that human has to decide
 - Input vs. Output
@@ -77,11 +77,11 @@ Learning is to find the best model represented data, meaning optimization of par
 - Output ($\hat{y}$): values of prediction derived from model
 - Linear vs. Nonlinear
 - Linear regression: a model can be implemented by a linear function
- - Simple Linear Regression: Involves two variables — one independent variable and one dependent variable. The relationship between these variables is modeled as a straight line.
- - Multiple Linear Regression: Uses more than one independent variable to predict a dependent variable. The relationship is still linear in nature, meaning it assumes a straight-line relationship between each independent variable and the dependent variable.
- - ex) $y = w_0 + w_1*x_1 + w_2*x_2 + \dots + w_D*x_D, y = w_0 + w_1*x_1 + w_2*x^2$
+ - Simple Linear Regression: Involves two variables — one independent variable and one dependent variable. The relationship between these variables is modeled as a straight line.
+ - Multiple Linear Regression: Uses more than one independent variable to predict a dependent variable. The relationship is still linear in nature, meaning it assumes a straight-line relationship between each independent variable and the dependent variable.
+ - ex) $y = w_0 + w_1*x_1 + w_2*x_2 + \dots + w_D*x_D, y = w_0 + w_1*x_1 + w_2*x^2$
 - Non-linear regression: a model can't be implemented by a linear function
- - ex) $log(y) = w_0 + w_1*log(x), y = max(x, 0)$
+ - ex) $log(y) = w_0 + w_1*log(x), y = max(x, 0)$

 ## Basic Math for ML
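
Note on the last hunk: the second linear-regression example in the notes, $y = w_0 + w_1*x_1 + w_2*x^2$, counts as linear because it is linear in the weights, even though it is quadratic in x. A minimal sketch of that point, assuming NumPy and noise-free synthetic data; the function name `fit_linear_in_weights` and the weights 1, 2, 3 are illustrative and do not come from the notes:

```python
import numpy as np


def fit_linear_in_weights(x, y):
    """Least-squares fit of y = w0 + w1*x + w2*x**2 (hypothetical helper)."""
    # Design matrix with columns 1, x, x^2. The model is quadratic in x
    # but linear in the weights, so ordinary least squares applies.
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


# Synthetic data generated from known weights (assumed values).
x = np.linspace(-2.0, 2.0, 21)
y = 1.0 + 2.0 * x + 3.0 * x ** 2

w = fit_linear_in_weights(x, y)
print(w)
```

By contrast, `y = max(x, 0)` from the non-linear examples cannot be written as a weighted sum of fixed feature columns with the weights as the only unknowns in this way.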
