From 92e7e3391473696d18eef7324548112adf01c080 Mon Sep 17 00:00:00 2001
From: lee
Date: Thu, 15 Apr 2021 17:52:16 +0100
Subject: [PATCH 1/2] week2 - day5: testing and feedback

---
 one_md_per_day_format/piscine/Week2/day1.md | 349 ++++++++++----------
 1 file changed, 169 insertions(+), 180 deletions(-)

diff --git a/one_md_per_day_format/piscine/Week2/day1.md b/one_md_per_day_format/piscine/Week2/day1.md
index 62f7fb2..3d8aec1 100644
--- a/one_md_per_day_format/piscine/Week2/day1.md
+++ b/one_md_per_day_format/piscine/Week2/day1.md
@@ -1,13 +1,12 @@
-# W2D01 Piscine AI - Data Science
+# W2D01 Piscine AI - Data Science

-The goal of this day is to understand practical Linear regression and supervised learning.
+The goal of this day is to understand practical Linear regression and supervised learning.

+Author:

-Author:
-
-# Table of Contents:
+# Table of Contents

-Historical part:
+Historical part:

 # Introduction

 studied the size of individuals within a progeny. He was trying to understand why
 large individuals in a population appeared to have smaller children, closer to
 the average population size; hence the introduction of the term "regression".

-Today we will learn a basic algorithm used in **supervised learning** : **The Linear Regression**. We will be using **Scikit-learn** which is a machine learning library. It is designed to to interoperate with the Python libraries NumPy and Pandas.
-We will also learn progressively the Machine Learning methodology for supervised learning - today we will focus on evalutatig a machine learning model by splitting the data set in a train set and a test set.
+Today we will learn a basic algorithm used in **supervised learning**: **the Linear Regression**. We will be using **Scikit-learn**, which is a machine learning library. It is designed to interoperate with the Python libraries NumPy and Pandas.
+
+We will also learn progressively the Machine Learning methodology for supervised learning - today we will focus on evaluating a machine learning model by splitting the data set in a train set and a test set.

 The version of Scikit-learn used in this day is `'0.22.1'`.

 ## Rules

-## Ressources
-### To start with Scikit-learn:
+## Resources

+### To start with Scikit-learn

 - https://scikit-learn.org/stable/tutorial/basic/tutorial.html

 - https://jakevdp.github.io/PythonDataScienceHandbook/05.02-introducing-scikit-learn.html

-https://scikit-learn.org/stable/modules/linear_model.html
+- https://scikit-learn.org/stable/modules/linear_model.html

-### Machine learning methodology and algorithms:
+### Machine learning methodology and algorithms

-- This course provides a broad introduction to machine learning, datamining, and statistical pattern recognition. Andrew Ng is a star in the Machine Learning community. I recommend to spend some time during the projects to focus on some algorithms. However, Python is not the langage used for the course. https://www.coursera.org/learn/machine-learning
+- This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Andrew Ng is a star in the Machine Learning community. I recommend spending some time during the projects to focus on some algorithms. However, Python is not the language used for the course.
https://www.coursera.org/learn/machine-learning

- https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet

-https://scikit-learn.org/stable/tutorial/index.html
+- https://scikit-learn.org/stable/tutorial/index.html

-### Linear Regression
+### Linear Regression

- https://towardsdatascience.com/laymans-introduction-to-linear-regression-8b334a3dab09

@@ -48,78 +50,76 @@

 ### Train test split

 - https://machinelearningmastery.com/train-test-split-for-evaluating-machine-learning-algorithms/

-- https://developers.google.com/machine-learning/crash-course/training-and-test-sets/video-lecture?hl=en
+- https://developers.google.com/machine-learning/crash-course/training-and-test-sets/video-lecture?hl=en

 # Exercise 1 Scikit-learn estimator

-The goal of this exercise is to learn to fit a Scikit-learn estimator and use it to predict.
-
-```
+The goal of this exercise is to learn to fit a Scikit-learn estimator and use it to predict.

+```python
 X, y = [[1],[2.1],[3]], [[1],[2],[3]]
+```

-```
-1. Fit a LinearRegression from Scikit-learn with X the features and y the target.
+1. Fit a LinearRegression from Scikit-learn with X the features and y the target.

 2. Predict for `x_pred = [[4]]`

-3. Print the coefficients (`coefs_`) and the intercept (`intercept_`), the score (`score`)of the regression of X and y.
-
-
-## Correction
-
+3. Print the coefficients (`coef_`), the intercept (`intercept_`) and the score (`score`) of the regression of X and y.

+## Correction

-1. This question is validated if the ouput of the fitted model is:
+1. This question is validated if the output of the fitted model is:

-   ```
+   ```console
   LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
   ```

-2. This question is validated if the ouput is:
+2. This question is validated if the output is:

-   ```
+   ```console
   array([[3.96013289]])
   ```

-3. This question is validated if the ouptut is:
-   ```
+3. This question is validated if the output is:
+
+   ```console
   Coefficients: [[0.99667774]]
   Intercept: [-0.02657807]
   Score: 0.9966777408637874
   ```

-# Exercise 2 Linear regression in 1D
+# Exercise 2 Linear regression in 1D

 The goal of this exercise is to understand how the linear regression works in one dimension. To do so, we will generate data in one dimension. Using `make_regression` from Scikit-learn, generate a data set with 100 observations:

-   ```
-   X, y, coef = make_regression(n_samples=100,
-                                n_features=1,
-                                n_informative=1,
-                                noise=10,
-                                coef=True,
-                                random_state=0,
-                                bias=100.0)
-   ```
+```python
+X, y, coef = make_regression(n_samples=100,
+                             n_features=1,
+                             n_informative=1,
+                             noise=10,
+                             coef=True,
+                             random_state=0,
+                             bias=100.0)
+```

-1. Plot the data using matplotlib. The plot should look like this:
+1. Plot the data using matplotlib. The plot should look like this:

 ![alt text][q1]

 [q1]: images/day1/ex2/w2_day1_ex2_q1.png "Scatter plot"

-2. Fit a LinearRegression from Scikit-learn on the generated data and give the equation of the fitted line. The expected output is: `y = coef * x + intercept`
-3. Add the fitted line to the plot. the plot should look like this:
+2. Fit a LinearRegression from Scikit-learn on the generated data and give the equation of the fitted line. The expected output is: `y = coef * x + intercept`
+
+3. Add the fitted line to the plot. The plot should look like this:

 ![alt text][q3]

 [q3]: images/day1/ex2/w2_day1_ex2_q3.png "Scatter plot + fitted line"

 4. Predict on X

 5. Create a function that computes the Mean Squared Error (MSE) and compute the MSE on the data set (a sketch of one possible flow is given after question 6). *The MSE is frequently used, along with other regression metrics that will be studied later this week.*

 ```python
 def compute_mse(y_true, y_pred):
     # TODO: return the Mean Squared Error of y_pred against y_true
     return mse
 ```

@@ -129,23 +129,21 @@ The goal of this exercise is to understand how the linear regression works in on

 Change the `noise` parameter of `make_regression` to 50

-6. Repeat question 2, 4 and compute the MSE on the new data.
-
+6. Repeat questions 2 and 4 and compute the MSE on the new data.

 https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html
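+
+*A possible sketch of questions 2, 4 and 5 (names such as `model` and the plain-NumPy `compute_mse` are illustrative, not imposed):*
+
+```python
+import numpy as np
+from sklearn.datasets import make_regression
+from sklearn.linear_model import LinearRegression
+
+X, y, coef = make_regression(n_samples=100, n_features=1, n_informative=1,
+                             noise=10, coef=True, random_state=0, bias=100.0)
+
+model = LinearRegression().fit(X, y)
+# Equation of the fitted line: y = coef * x + intercept
+print(f"y = {model.coef_[0]} * x + {model.intercept_}")
+
+y_pred = model.predict(X)  # question 4: predict on X
+
+def compute_mse(y_true, y_pred):
+    # Mean Squared Error: mean of the squared residuals
+    return np.mean((y_true - y_pred) ** 2)
+
+print(compute_mse(y, y_pred))  # question 5: MSE on the data set
+```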
 ## Correction

-1. This question is validated if the plot looks like:
+1. This question is validated if the plot looks like:

 ![alt text][q1]

 [q1]: images/day1/ex2/w2_day1_ex2_q1.png "Scatter plot"

-2. This question is validated if the equation of the fitted line is: `y = 42.619430291366946 * x + 99.18581817296929
-`
+2. This question is validated if the equation of the fitted line is: `y = 42.619430291366946 * x + 99.18581817296929`

-3. This question is validated if the plot looks like:
+3. This question is validated if the plot looks like:

 ![alt text][q3]

 [q3]: images/day1/ex2/w2_day1_ex2_q3.png "Scatter plot + fitted line"

@@ -153,38 +151,39 @@ https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_e

 4. This question is validated if the predictions for the first 10 values are:

 ```console
 array([ 83.86186727, 140.80961751, 116.3333897 ,  64.52998689,
         61.34889539, 118.10301628,  57.5347917 , 117.44107847,
        108.06237908,  85.90762675])
 ```

 5. This question is validated if the MSE returned is `114.17148616819485`

 6. This question is validated if the MSE returned is `2854.2871542048706`

 # Exercise 3: Train test split

 The goal of this exercise is to learn to split a data set. It is important to understand why we split the data in two sets. To put it in a nutshell: the Machine Learning algorithms learns on the training data and is evaluates on the data that hasn't seen before: the testing data.

 This video gives a basic and nice explanation: https://www.youtube.com/watch?v=_vdMKioCXqQ

 This article explains the conditions to split the data and how to split it: https://machinelearningmastery.com/train-test-split-for-evaluating-machine-learning-algorithms/

-   ```
-   X = np.arange(1,21).reshape(10,-1)
-   y = np.arange(1,11)
-   ```
+```python
+X = np.arange(1,21).reshape(10,-1)
+y = np.arange(1,11)
+```

-1. Split the data using `train_test_split` with `shuffle=False`. The test set represents 20% of the total size of the data set. Print X_train, y_train, X_test, y_test.
+1. Split the data using `train_test_split` with `shuffle=False`. The test set represents 20% of the total size of the data set. Print X_train, y_train, X_test, y_test.

-https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
+https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
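+
+*A minimal sketch of the expected call; with `shuffle=False` the test set is simply the last 20% of the rows:*
+
+```python
+import numpy as np
+from sklearn.model_selection import train_test_split
+
+X = np.arange(1,21).reshape(10,-1)
+y = np.arange(1,11)
+
+# shuffle=False keeps the original order, so the last 2 rows become the test set
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
+print(X_train, y_train, X_test, y_test, sep="\n\n")
+```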
 ## Correction

 1. This question is validated if X_train, y_train, X_test, y_test match this output:

 ```console
 X_train:
 [[ 1  2]
  [ 3  4]
  [ 5  6]
  [ 7  8]
  [ 9 10]
  [11 12]
  [13 14]
  [15 16]]

@@ -195,52 +194,50 @@ X_train:

-y_train:
+y_train:
 [1 2 3 4 5 6 7 8]

-X_test:
+X_test:
 [[17 18]
  [19 20]]

-y_test:
+y_test:
 [ 9 10]
 ```

 # Exercise 4 Forecast diabetes progression

 The goal of this exercise is to use Linear Regression to forecast the progression of diabetes. It will not always be stated explicitly: you should **ALWAYS** start with an exploratory data analysis in order to have a good understanding of the data you model. As a reminder, here is an introduction to EDA:

-https://towardsdatascience.com/exploratory-data-analysis-eda-a-practical-guide-and-template-for-structured-data-abfbf3ee3bd9
+
+- https://towardsdatascience.com/exploratory-data-analysis-eda-a-practical-guide-and-template-for-structured-data-abfbf3ee3bd9

 The data set used is described in https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.

-```
+```python
 from sklearn.datasets import load_diabetes
 diabetes = load_diabetes()
 X, y = diabetes.data, diabetes.target
 ```

-1. Using `train_test_split`, split the data set in a train set and test set (20%). Use `random_state=43` for results reproducibility.
-
-2. Fit the Linear Regression on all the variables. Give the coefficients and the intercept of the Linear Regression. What is then the equation ?
-
-3. Predict on the test set. Predicting on the test set is like having new patients for who, as a physician, need to forecast the disease progression in one year given the 10 baseline variables.

-4. Compute the MSE on the train set and test set. Later this week we will learn about the R2 which will help us to evaluate the performance of this fitted Linear Regression. The MSE returns an arbitrary value depending on the range of error.
+1. Using `train_test_split`, split the data set into a train set and a test set (20%). Use `random_state=43` for results reproducibility.
+2. Fit the Linear Regression on all the variables. Give the coefficients and the intercept of the Linear Regression. What is then the equation?
+3. Predict on the test set. Predicting on the test set is like having new patients for whom, as a physician, you need to forecast the disease progression in one year given the 10 baseline variables.
+4. Compute the MSE on the train set and test set. Later this week we will learn about the R2, which will help us to evaluate the performance of this fitted Linear Regression. The MSE returns an arbitrary value depending on the range of error. A compact sketch of the whole flow is given below.

-WARNING: This will be explained later this week. But here, we are doing something "dangerous". As you may have read in the data documentation the data is scaled using the whole dataset whereas we should first scale the data on the training set and then use this scaling on the test set. This is a toy example, so let's ingore this detail for now.
-
+**WARNING**: This will be explained later this week. But here, we are doing something "dangerous". As you may have read in the data documentation, the data is scaled using the whole dataset, whereas we should first scale the data on the training set and then use this scaling on the test set. This is a toy example, so let's ignore this detail for now.

 https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset
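+
+*A possible sketch of questions 1, 2 and 4 (variable names are illustrative; the correction below prints `y_train.values`, which assumes the targets were loaded into a pandas object, but this NumPy flow is equivalent):*
+
+```python
+from sklearn.datasets import load_diabetes
+from sklearn.linear_model import LinearRegression
+from sklearn.metrics import mean_squared_error
+from sklearn.model_selection import train_test_split
+
+diabetes = load_diabetes()
+X, y = diabetes.data, diabetes.target
+
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=43)
+
+model = LinearRegression().fit(X_train, y_train)
+# Pair each baseline variable with its coefficient, then append the intercept
+coefficients = list(zip(diabetes.feature_names, model.coef_)) + [("intercept", model.intercept_)]
+
+mse_train = mean_squared_error(y_train, model.predict(X_train))
+mse_test = mean_squared_error(y_test, model.predict(X_test))
+```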
-## Correction
+## Correction

+1. This question is validated if the output of `y_train.values[:10]` and `y_test.values[:10]` are:

-   ```
+   ```console
   y_train.values[:10]:
   [[202.]
    [ 55.]

@@ -264,11 +261,11 @@ https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset

    [ 78.]
    [ 66.]
    [192.]]
   ```

 2. This question is validated if the coefficients and the intercept are:

-   ```
+   ```console
   [('age', -60.40163046086952),
    ('sex', -226.08740652083418),
    ('bmi', 529.383623302316),

@@ -282,9 +279,9 @@ https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset

    ('intercept', 152.05314895029233)]
   ```

-3. This question is validated if the output of `predictions_on_test[:10]` is:
+3. This question is validated if the output of `predictions_on_test[:10]` is:

-   ```
+   ```console
   array([[111.74351759],
          [ 98.41335251],
          [168.36373195],
          [126.28961941],
          [117.73121787],
          [224.83346984]])
   ```

@@ -295,141 +292,135 @@ https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset

-4. This question is validated if the mse on the **train set** is `2888.326888` and the mse on the **test set** is `2858.255153`.
+4. This question is validated if the mse on the **train set** is `2888.326888` and the mse on the **test set** is `2858.255153`.

 ## Exercise 5 Gradient Descent

-The goal of this exercise is to understand how the Linear Regression algorithm finds the optimal coefficients.
+The goal of this exercise is to understand how the Linear Regression algorithm finds the optimal coefficients.

-The goal is to fit a Linear Regression on a one dimensional features data **without using Scikit-learn**. Let's use the data set we generated for the exercise 1:
+The goal is to fit a Linear Regression on one-dimensional feature data **without using Scikit-learn**. Let's use the data set we generated for exercise 2:

+```python
+X, y, coef = make_regression(n_samples=100,
+                             n_features=1,
+                             n_informative=1,
+                             noise=10,
+                             coef=True,
+                             random_state=0,
+                             bias=100.0)
+```

-   ```
-   X, y, coef = make_regression(n_samples=100,
-                                n_features=1,
-                                n_informative=1,
-                                noise=10,
-                                coef=True,
-                                random_state=0,
-                                bias=100.0)
-   ```

 *Warning: The shape of X is not the same as the shape of y. You may need (for some questions) to reshape X using: `X.reshape(1,-1)[0]`.*

-1. Plot the data using matplotlib:
+1. Plot the data using matplotlib:

 ![alt text][ex5q1]

 [ex5q1]: images/day1/ex5/w2_day1_ex5_q1.png "Scatter plot "

-As a reminder, fitting a Linear Regression on this data means finding (a,b) that fits well the data points.
+As a reminder, fitting a Linear Regression on this data means finding (a, b) that fit the data points well.

- - y_pred = a*x +b
+- `y_pred = a*x + b`

 Mathematically, it means finding (a,b) that minimizes the MSE, which is the loss used in Linear Regression. If we consider 3 data points:

- - Loss(a,b) = MSE(a,b) =
-  1/3 *((y_pred1 - y_true1)**2 + (y_pred2 - y_true2)**2) + (y_pred3 - y_true3)**2)
+- `Loss(a,b) = MSE(a,b) = 1/3 * ((y_pred1 - y_true1)**2 + (y_pred2 - y_true2)**2 + (y_pred3 - y_true3)**2)`

- and we know:
- y_pred1 = a*x1 + b
- y_pred2 = a*x2 + b
- y_pred3 = a*x3 + b
+and we know:
+
+y_pred1 = a*x1 + b\
+y_pred2 = a*x2 + b\
+y_pred3 = a*x3 + b

 ### Greedy approach

 2. Create a function `compute_mse`. Compute mse for `a = 1` and `b = 2`. **Warning**: `X.shape` is `(100, 1)` and `y.shape` is `(100, )`. Make sure that `y_preds` and `y` have the same shape before computing `y_preds - y`.
- ``` - def compute_mse(coefs, X, y): - ''' - coefs is a list that contains a and b: [a,b] - X is the features set - y is the target +```python +def compute_mse(coefs, X, y): + ''' + coefs is a list that contains a and b: [a,b] + X is the features set + y is the target - Returns a float which is the MSE - ''' - - #TODO - - y_preds = - mse = - - return mse - ``` + Returns a float which is the MSE + ''' + #TODO -3. Create a grid of **640000** points that combines a and b with. Check that the grid contains 640000 points. + y_preds = + mse = - - a between -200 and 200, step= 0.5 - - b between -200 and 200, step= 0.5 + return mse +``` - This is how to compute the grid with the combination of a and b: +3. Create a grid of **640000** points that combines a and b with. Check that the grid contains 640000 points. - ``` - aa, bb = np.mgrid[-200:200:0.5, -200:200:0.5] - grid = np.c_[aa.ravel(), bb.ravel()] +- a between -200 and 200, step= 0.5 +- b between -200 and 200, step= 0.5 - ``` +This is how to compute the grid with the combination of a and b: -4. Compute the MSE for all points in the grid. If possible, parallelize the computations. It may be needed to use `functools.partial` to parallelize a function with many parameters on a list. Put the result in a variable named `losses`. +```python +aa, bb = np.mgrid[-200:200:0.5, -200:200:0.5] +grid = np.c_[aa.ravel(), bb.ravel()] +``` +4. Compute the MSE for all points in the grid. If possible, parallelize the computations. It may be needed to use `functools.partial` to parallelize a function with many parameters on a list. Put the result in a variable named `losses`. 5. Use this chunk of code to plot the MSE in 2D: - ``` - aa, bb = np.mgrid[-200:200:.5, -200:200:.5] - grid = np.c_[aa.ravel(), bb.ravel()] - losses_reshaped = np.array(losses).reshape(aa.shape) - - f, ax = plt.subplots(figsize=(8, 6)) - contour = ax.contourf(aa, - bb, - losses_reshaped, - 100, - cmap="RdBu", - vmin=0, - vmax=160000) - ax_c = f.colorbar(contour) - ax_c.set_label("MSE") - - ax.set(aspect="equal", - xlim=(-200, 200), - ylim=(-200, 200), - xlabel="$a$", - ylabel="$b$") - ``` - The expected output is: - - ![alt text][ex5q5] - -[ex5q5]: images/day1/ex5/w2_day1_ex5_q5.png "MSE " +```python +aa, bb = np.mgrid[-200:200:.5, -200:200:.5] +grid = np.c_[aa.ravel(), bb.ravel()] +losses_reshaped = np.array(losses).reshape(aa.shape) + +f, ax = plt.subplots(figsize=(8, 6)) +contour = ax.contourf(aa, + bb, + losses_reshaped, + 100, + cmap="RdBu", + vmin=0, + vmax=160000) +ax_c = f.colorbar(contour) +ax_c.set_label("MSE") + +ax.set(aspect="equal", + xlim=(-200, 200), + ylim=(-200, 200), + xlabel="$a$", + ylabel="$b$") +``` +The expected output is: -6. From the `losses` list, find the optimal value of a and b and plot the line in the scatter point of question 1. +![alt text][ex5q5] +[ex5q5]: images/day1/ex5/w2_day1_ex5_q5.png "MSE " +6. From the `losses` list, find the optimal value of a and b and plot the line in the scatter point of question 1. -In this example we computed 160 000 times the MSE. It is frequent to deal with 50 features, which requires 51 parameters to fit the Linear Regression. If we try this approach with 50 features we would need to compute **5.07e+132** MSE. Even if we reduce the scope and try only 5 values per coefficients we would have to compute the MSE **4.4409e+35** times. This approach is not scalable and that is why is not used to find optimal coefficients for Linear Regression. +In this example we computed 160 000 times the MSE. 
 It is frequent to deal with 50 features, which requires 51 parameters to fit the Linear Regression. If we tried this approach with 50 features and the same 800 candidate values per parameter, we would need to compute the MSE **800^51 ≈ 1.1e+148** times. Even if we reduced the scope and tried only 5 values per coefficient, we would still have to compute the MSE **5^51 ≈ 4.4409e+35** times. This approach is not scalable, and that is why it is not used to find the optimal coefficients of a Linear Regression.
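+
+*A minimal sketch of the update loop for questions 7 and 8, taking the MSE derivatives as given (they play the role of `d_theta_1` and `d_theta_0` in the article; the starting point (0, 0) is illustrative):*
+
+```python
+import numpy as np
+from sklearn.datasets import make_regression
+
+X, y, coef = make_regression(n_samples=100, n_features=1, n_informative=1,
+                             noise=10, coef=True, random_state=0, bias=100.0)
+x = X.reshape(1,-1)[0]
+
+a, b = 0.0, 0.0
+learning_rate, nbr_iterations = 0.1, 100
+history = np.zeros((nbr_iterations, 2))  # (a, b) at every iteration, for question 8
+
+for i in range(nbr_iterations):
+    error = a * x + b - y
+    d_a = 2 * (error * x).mean()  # derivative of the MSE with respect to a
+    d_b = 2 * error.mean()        # derivative of the MSE with respect to b
+    a -= learning_rate * d_a
+    b -= learning_rate * d_b
+    history[i] = a, b
+
+print(a, b)  # should approach the values given in the correction below
+```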
 This question is validated if the first 10 values of `losses` are:

-   ```
-   array([158315.41493175, 158001.96852692, 157689.02212209, 157376.57571726,
-        157064.62931244, 156753.18290761, 156442.23650278, 156131.79009795,
-        155821.84369312, 155512.39728829])
-   ```
+```console
+array([158315.41493175, 158001.96852692, 157689.02212209, 157376.57571726,
+       157064.62931244, 156753.18290761, 156442.23650278, 156131.79009795,
+       155821.84369312, 155512.39728829])
+```

 5. This question is validated if the outputted plot looks like:

 ![alt text][ex5q5]

 [ex5q5]: images/day1/ex5/w2_day1_ex5_q5.png "MSE"

@@ -456,14 +447,14 @@ In a nutshel, Gradient descent is an optimization algorithm used to minimize som

 6. This question is validated if the point returned is
 `array([42.5, 99. ])`. It means that `a = 42.5` and `b = 99`.

 7. This question is validated if the coefficients returned are:

-   ```
-   Coefficients (a): 42.61943031121358
-   Intercept (b): 99.18581814447936
-   ```
+```console
+Coefficients (a): 42.61943031121358
+Intercept (b): 99.18581814447936
+```

 8. This question is validated if the outputted plot is:

 ![alt text][ex5q8]

 [ex5q8]: images/day1/ex5/w2_day1_ex5_q8.png "MSE + Gradient descent"

@@ -471,11 +462,9 @@ In a nutshel, Gradient descent is an optimization algorithm used to minimize som

 9. This question is validated if the coefficients and intercept returned are:

-   ```
-   Coefficients: [42.61943029]
-   Intercept: 99.18581817296929
-
-   ```
\ No newline at end of file
+```console
+Coefficients: [42.61943029]
+Intercept: 99.18581817296929
+```

From ef12c8da32cc2db243b01cb1b89c66b8616141bd Mon Sep 17 00:00:00 2001
From: "b.ghazlane"
Date: Thu, 15 Apr 2021 22:38:06 +0200
Subject: [PATCH 2/2] fix: correct ex 3 and set ex 5 to optional

---
 one_md_per_day_format/piscine/Week2/day1.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/one_md_per_day_format/piscine/Week2/day1.md b/one_md_per_day_format/piscine/Week2/day1.md
index 3d8aec1..bac15a4 100644
--- a/one_md_per_day_format/piscine/Week2/day1.md
+++ b/one_md_per_day_format/piscine/Week2/day1.md
@@ -163,7 +163,7 @@

 # Exercise 3: Train test split

-The goal of this exercise is to learn to split a data set. It is important to understand why we split the data in two sets. To put it in a nutshell: the Machine Learning algorithms learns on the training data and is evaluates on the data that hasn't seen before: the testing data.
+The goal of this exercise is to learn to split a data set. It is important to understand why we split the data in two sets. To put it in a nutshell: the Machine Learning model learns on the training data and is evaluated on data the model hasn't seen before: the testing data.

 This video gives a basic and nice explanation: https://www.youtube.com/watch?v=_vdMKioCXqQ

@@ -296,7 +296,7 @@ https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset

-## Exercise 5 Gradient Descent
+## Exercise 5 Gradient Descent - Optional

 The goal of this exercise is to understand how the Linear Regression algorithm finds the optimal coefficients.