diff --git a/one_exercise_per_file/week02/day01/ex01/audit/readme.md b/one_exercise_per_file/week02/day01/ex01/audit/readme.md index de97558..bcf5109 100644 --- a/one_exercise_per_file/week02/day01/ex01/audit/readme.md +++ b/one_exercise_per_file/week02/day01/ex01/audit/readme.md @@ -1,17 +1,19 @@ -1. This question is validated if the output of the fitted model is: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the output of the fitted model is: ```python LinearRegression(copy_X=True, fit_intercept=[[1], [2.1], [3]], n_jobs=None, normalize=[[1], [2], [3]]) ``` -2. This question is validated if the output is: +##### The question 2 is validated if the output is: ```python array([[3.96013289]]) ``` -3. This question is validated if the output is: +##### The question 3 is validated if the output is: ```output Coefficients: [[0.99667774]] diff --git a/one_exercise_per_file/week02/day01/ex02/audit/readme.md b/one_exercise_per_file/week02/day01/ex02/audit/readme.md index c894775..b0df1f1 100644 --- a/one_exercise_per_file/week02/day01/ex02/audit/readme.md +++ b/one_exercise_per_file/week02/day01/ex02/audit/readme.md @@ -1,18 +1,20 @@ -1. This question is validated if the plot looks like: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the plot looks like: ![alt text][q1] [q1]: ../w2_day1_ex2_q1.png "Scatter plot" -2. This question is validated if the equation of the fitted line is: `y = 42.619430291366946 * x + 99.18581817296929` +##### The question 2 is validated if the equation of the fitted line is: `y = 42.619430291366946 * x + 99.18581817296929` -3. This question is validated if the plot looks like: +##### The question 3 is validated if the plot looks like: ![alt text][q3] [q3]: ../w2_day1_ex2_q3.png "Scatter plot + fitted line" -4. 
This question is validated if the outputted prediction for the first 10 values are: +##### The question 4 is validated if the outputted prediction for the first 10 values are: ```python array([ 83.86186727, 140.80961751, 116.3333897 , 64.52998689, @@ -20,6 +22,6 @@ array([ 83.86186727, 140.80961751, 116.3333897 , 64.52998689, 108.06237908, 85.90762675]) ``` -5. This question is validated if the MSE returned is `114.17148616819485` +##### The question 5 is validated if the MSE returned is `114.17148616819485` -6. This question is validated if the MSE returned is `2854.2871542048706` +##### The question 6 is validated if the MSE returned is `2854.2871542048706` diff --git a/one_exercise_per_file/week02/day01/ex03/audit/readme.md b/one_exercise_per_file/week02/day01/ex03/audit/readme.md index 62f6436..445f4e0 100644 --- a/one_exercise_per_file/week02/day01/ex03/audit/readme.md +++ b/one_exercise_per_file/week02/day01/ex03/audit/readme.md @@ -1,4 +1,4 @@ -1. This question is validated if X_train, y_train, X_test, y_test match this output: +##### The question 1 is validated if X_train, y_train, X_test, y_test match this output: ```console X_train: diff --git a/one_exercise_per_file/week02/day01/ex04/audit/readme.md b/one_exercise_per_file/week02/day01/ex04/audit/readme.md index bade0c9..799be0d 100644 --- a/one_exercise_per_file/week02/day01/ex04/audit/readme.md +++ b/one_exercise_per_file/week02/day01/ex04/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the output of `y_train.values[:10]` and `y_test.values[:10]`are: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the output of `y_train.values[:10]` and `y_test.values[:10]`are: ```console y_train.values[:10]: @@ -26,7 +28,7 @@ [192.]] ``` -2. 
This question is validated if the coefficients and the intercept are: +##### The question 2 is validated if the coefficients and the intercept are: ```console [('age', -60.40163046086952), @@ -42,7 +44,7 @@ ('intercept', 152.05314895029233)] ``` -3. This question is validated if the output of `predictions_on_test[:10]` is: +##### The question 3 is validated if the output of `predictions_on_test[:10]` is: ```console array([[111.74351759], @@ -57,4 +59,4 @@ [224.83346984]]) ``` -4. This question is validated if the mse on the **train set** is `2888.326888` and the mse on the **test set** is `2858.255153`. +##### The question 4 is validated if the mse on the **train set** is `2888.326888` and the mse on the **test set** is `2858.255153`. diff --git a/one_exercise_per_file/week02/day01/ex05/audit/readme.md b/one_exercise_per_file/week02/day01/ex05/audit/readme.md index 372d75f..86d4831 100644 --- a/one_exercise_per_file/week02/day01/ex05/audit/readme.md +++ b/one_exercise_per_file/week02/day01/ex05/audit/readme.md @@ -1,14 +1,16 @@ -1. This question is validated if the outputted plot looks like: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the outputted plot looks like: ![alt text][ex5q1] [ex5q1]: ../w2_day1_ex5_q1.png "Scatter plot " -2. This question is validated if the output is: `11808.867339751561` +##### The question 2 is validated if the output is: `11808.867339751561` -3. This question is validated if `grid.shape` is `(640000,2)`. +##### The question 3 is validated if `grid.shape` is `(640000,2)`. -4. This question is validated if the 10 first values of losses are: +##### The question 4 is validated if the 10 first values of losses are: ```console array([158315.41493175, 158001.96852692, 157689.02212209, 157376.57571726, @@ -16,29 +18,29 @@ array([158315.41493175, 158001.96852692, 157689.02212209, 157376.57571726, 155821.84369312, 155512.39728829]) ``` -5. 
This question is validated if the outputted plot looks like +##### The question 5 is validated if the outputted plot looks like ![alt text][ex5q5] [ex5q5]: ../w2_day1_ex5_q5.png "MSE" -6. This question is validated if the point returned is +##### The question 6 is validated if the point returned is `array([42.5, 99. ])`. It means that `a= 42.5` and `b=99`. -7. This question is validated if the coefficients returned are +##### The question 7 is validated if the coefficients returned are ```console Coefficients (a): 42.61943031121358 Intercept (b): 99.18581814447936 ``` -8. This question is validated if the outputted plot is +##### The question 8 is validated if the outputted plot is ![alt text][ex5q8] [ex5q8]: ../w2_day1_ex5_q8.png "MSE + Gradient descent" -9. This question is validated if the coefficients and intercept returned are: +##### The question 9 is validated if the coefficients and intercept returned are: ```console Coefficients: [42.61943029] diff --git a/one_exercise_per_file/week02/day02/ex01/audit/readme.md b/one_exercise_per_file/week02/day02/ex01/audit/readme.md index 74bd1c4..1ab32ad 100644 --- a/one_exercise_per_file/week02/day02/ex01/audit/readme.md +++ b/one_exercise_per_file/week02/day02/ex01/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the fitted logistic regression returns: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the fitted logistic regression returns: ```python LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True, @@ -8,11 +10,11 @@ LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True, warm_start=False) ``` -2. This question is validated if the predicted class is `0`. +##### The question 2 is validated if the predicted class is `0`. -3. 
This question is validated if the predicted probabilities are `[0.61450526 0.38549474]` +##### The question 3 is validated if the predicted probabilities are `[0.61450526 0.38549474]` -4. This question is validated if the output is: +##### The question 4 is validated if the output is: ```console Coefficient: diff --git a/one_exercise_per_file/week02/day02/ex02/audit/readme.md b/one_exercise_per_file/week02/day02/ex02/audit/readme.md index ea04485..98a87a9 100644 --- a/one_exercise_per_file/week02/day02/ex02/audit/readme.md +++ b/one_exercise_per_file/week02/day02/ex02/audit/readme.md @@ -1,4 +1,4 @@ -1. This question is validated if the plot looks like this: +##### The question 1 is validated if the plot looks like this: ![alt text][ex2q1] diff --git a/one_exercise_per_file/week02/day02/ex03/audit/readme.md b/one_exercise_per_file/week02/day02/ex03/audit/readme.md index 5716821..e5ed622 100644 --- a/one_exercise_per_file/week02/day02/ex03/audit/readme.md +++ b/one_exercise_per_file/week02/day02/ex03/audit/readme.md @@ -1,35 +1,35 @@ -1. This question is validated if the outputted plot looks like this: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the outputted plot looks like this: ![alt text][ex3q1] [ex3q1]: ../w2_day2_ex3_q1.png "Scatter plot" -2. This question is validated if the coefficient and the intercept of the Logistic Regression are: +##### The question 2 is validated if the coefficient and the intercept of the Logistic Regression are: ```console Intercept: [-0.98385574] Coefficient: [[1.18866075]] ``` -3. This question is validated if the plot looks like this: +##### The question 3 is validated if the plot looks like this: ![alt text][ex3q2] [ex3q2]: ../w2_day2_ex3_q3.png "Scatter plot" -4. This question is validated if `predict_probability` outputs the same probabilities as `predict_proba`. Note that the values have to match one of the class probabilities, not both. 
To do so, compare your output with: `clf.predict_proba(X)[:,1]`. The shape of the arrays is not important. +##### The question 4 is validated if `predict_probability` outputs the same probabilities as `predict_proba`. Note that the values have to match one of the class probabilities, not both. To do so, compare your output with: `clf.predict_proba(X)[:,1]`. The shape of the arrays is not important. -5. This question is validated if `predict_class` outputs the same classes as `cfl.predict(X)`. The shape of the arrays is not important. +##### The question 5 is validated if `predict_class` outputs the same classes as `cfl.predict(X)`. The shape of the arrays is not important. -6. This question is validated if the plot looks like this: +##### The question 6 is validated if the plot looks like the plot below. As mentioned, it is not required to shift the class prediction to make the plot easier to understand. ![alt text][ex3q6] [ex3q6]: ../w2_day2_ex3_q5.png "Scatter plot + Logistic regression + predictions" -As mentioned, it is not required to shift the class prediction to make the plot easier to understand. - -7. This question is validated if the plot looks like this: +##### The question 7 is validated if the plot looks like this: ![alt text][ex3q7] diff --git a/one_exercise_per_file/week02/day02/ex04/audit/readme.md b/one_exercise_per_file/week02/day02/ex04/audit/readme.md index 47fdaf3..e595219 100644 --- a/one_exercise_per_file/week02/day02/ex04/audit/readme.md +++ b/one_exercise_per_file/week02/day02/ex04/audit/readme.md @@ -1,5 +1,6 @@ +##### The exercice is validated is all questions of the exercice are validated -1. This question is validated if X_train, y_train, X_test, y_test match this output: +##### The question 1 is validated if X_train, y_train, X_test, y_test match the output below. The proportion of class `1` is **0.125** in the train set and **1.** in the test set. ```console X_train: @@ -26,6 +27,6 @@ y_test: [1. 1.] 
``` -The proportion of class `1` is **0.125** in the train set and **1.** in the test set. -2. This question is validated if the proportion of class `1` is **0.3** for both sets. + +##### The question 2 is validated if the proportion of class `1` is **0.3** for both sets. diff --git a/one_exercise_per_file/week02/day02/ex05/audit/readme.md b/one_exercise_per_file/week02/day02/ex05/audit/readme.md index 21d3847..39e30d4 100644 --- a/one_exercise_per_file/week02/day02/ex05/audit/readme.md +++ b/one_exercise_per_file/week02/day02/ex05/audit/readme.md @@ -1,11 +1,13 @@ -1. This question is validated if the proportion of class `Benign` is 0.6552217453505007. It means that if you always predict `Benign` your accuracy would be 66%. +##### The exercice is validated is all questions of the exercice are validated -2. This question is validated if the proportion of one of the classes is the approximately the same on the train and test set: ~0.65. In my case: +##### The question 1 is validated if the proportion of class `Benign` is 0.6552217453505007. It means that if you always predict `Benign` your accuracy would be 66%. + +##### The question 2 is validated if the proportion of one of the classes is the approximately the same on the train and test set: ~0.65. In my case: - test: 0.6571428571428571 - train: 0.6547406082289803 -3. This question is validated if the output is: +##### The question 3 is validated if the output is: ```console # Train @@ -34,12 +36,11 @@ Score on test set: ``` Only the 10 first predictions are outputted. The score is computed on all the data in the folds. - For some reasons, you may have a different data splitting as mine. The requirement for this question is to have a score on the test set bigger than 92%. If the score is 1, congratulation you've leaked your first target. Drop the target from the X_train or X_test ;) ! -4. 
This question is validated if the confusion matrix on the train set is similar to: +##### The question 4 is validated if the confusion matrix on the train set is similar to: ```console array([[357, 9], diff --git a/one_exercise_per_file/week02/day02/ex06/audit/readme.md b/one_exercise_per_file/week02/day02/ex06/audit/readme.md index b38aa44..69ad7f2 100644 --- a/one_exercise_per_file/week02/day02/ex06/audit/readme.md +++ b/one_exercise_per_file/week02/day02/ex06/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if each classifier has as input a binary data as below: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if each classifier has as input a binary data as below: ```python def train(X_train, y_train): @@ -13,7 +15,7 @@ def train(X_train, y_train): return clf, clf1, clf2 ``` -2. This question is validated if the predicted classes on the test set are: +##### The question 2 is validated if the predicted classes on the test set are: ```console array([0, 0, 2, 1, 2, 0, 2, 1, 1, 1, 0, 1, 2, 0, 1, 1, 0, 0, 2, 2, 0, 0, diff --git a/one_exercise_per_file/week02/day03/ex01/audit/readme.md b/one_exercise_per_file/week02/day03/ex01/audit/readme.md index 8ecb4b3..34531e5 100644 --- a/one_exercise_per_file/week02/day03/ex01/audit/readme.md +++ b/one_exercise_per_file/week02/day03/ex01/audit/readme.md @@ -1,10 +1,12 @@ -1. This question is validated if the `imp_mean.statistics_` returns: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the `imp_mean.statistics_` returns: ```console array([ 4., 13., 6.]) ``` -2. This question is validated if the filled train set is: +##### The question 2 is validated if the filled train set is: ```console array([[ 7., 6., 5.], @@ -12,7 +14,7 @@ [ 1., 20., 8.]]) ``` -3. 
This question is validated if the filled test set is: +##### The question 3 is validated if the filled test set is: ```console array([[ 4., 1., 2.], diff --git a/one_exercise_per_file/week02/day03/ex02/audit/readme.md b/one_exercise_per_file/week02/day03/ex02/audit/readme.md index ec274ba..56aaf85 100644 --- a/one_exercise_per_file/week02/day03/ex02/audit/readme.md +++ b/one_exercise_per_file/week02/day03/ex02/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the scaled train set is: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the scaled train set is as below. And by definition, the mean on the axis 0 should be `array([0., 0., 0.])` and the standard deviation on the axis 0 should be `array([1., 1., 1.])`. ```console array([[ 0. , -1.22474487, 1.33630621], @@ -6,12 +8,7 @@ array([[ 0. , -1.22474487, 1.33630621], [-1.22474487, 1.22474487, -1.06904497]]) ``` -- The mean on axis 0 should return: - - array([0., 0., 0.]) -- The std on axis 0 should return: - - array([1., 1., 1.]) - -2. This question is validated if the scaled test set is: +##### The question 2 is validated if the scaled test set is: ```console array([[ 1.22474487, -1.22474487, 0.53452248], diff --git a/one_exercise_per_file/week02/day03/ex03/audit/readme.md b/one_exercise_per_file/week02/day03/ex03/audit/readme.md index b9b7b2b..bf9e737 100644 --- a/one_exercise_per_file/week02/day03/ex03/audit/readme.md +++ b/one_exercise_per_file/week02/day03/ex03/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the output is +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the output is | | ('C++',) | ('Java',) | ('Python',) | |---:|-----------:|------------:|--------------:| @@ -7,7 +9,7 @@ | 2 | 0 | 1 | 0 | | 3 | 1 | 0 | 0 | -2. 
This question is validated if the output is: +##### The question 2 is validated if the output is: | | ('C++',) | ('Java',) | ('Python',) | |---:|-----------:|------------:|--------------:| diff --git a/one_exercise_per_file/week02/day03/ex04/audit/readme.md b/one_exercise_per_file/week02/day03/ex04/audit/readme.md index b590567..d520817 100644 --- a/one_exercise_per_file/week02/day03/ex04/audit/readme.md +++ b/one_exercise_per_file/week02/day03/ex04/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the output of the Ordinal Encoder on the train set is: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the output of the Ordinal Encoder on the train set is: ```console array([[2.], @@ -8,7 +10,7 @@ array([[2.], Check that `enc.categories_` returns`[array(['bad', 'neutral', 'good'], dtype=object)]`. -2. This question is validated if the output of the Ordinal Encoder on the test set is: +##### The question 2 is validated if the output of the Ordinal Encoder on the test set is: ```console array([[2.], diff --git a/one_exercise_per_file/week02/day03/ex05/audit/readme.md b/one_exercise_per_file/week02/day03/ex05/audit/readme.md index bbd1d94..a69e57b 100644 --- a/one_exercise_per_file/week02/day03/ex05/audit/readme.md +++ b/one_exercise_per_file/week02/day03/ex05/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the number of unique values per feature outputted are: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the number of unique values per feature outputted are: ```console age 3 @@ -13,7 +15,7 @@ irradiat 2 dtype: int64 ``` -2. 
This question is validated if the transformed test set by the `OneHotEncoder` fitted on the train set is: +##### The question 2 is validated if the transformed test set by the `OneHotEncoder` fitted on the train set is: ```console First 10 rows: @@ -30,7 +32,7 @@ dtype: int64 [1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0., 0., 1.]]) ``` -3. This question is validated if the transformed test set by the `OrdinalEncoder` fitted on the train set is: +##### The question 3 is validated if the transformed test set by the `OrdinalEncoder` fitted on the train set is: ```console First 10 rows: @@ -47,7 +49,7 @@ dtype: int64 [1., 3., 0., 0.]]) ``` -4. This question is validated if the column transformer transformed that is fitted on the X_train, transformed the X_test as: +##### The question 3 is validated if the column transformer transformed that is fitted on the X_train, transformed the X_test as: ```console # First 2 rows: diff --git a/one_exercise_per_file/week02/day03/ex06/audit/readme.md b/one_exercise_per_file/week02/day03/ex06/audit/readme.md index c365502..3c0d0b9 100644 --- a/one_exercise_per_file/week02/day03/ex06/audit/readme.md +++ b/one_exercise_per_file/week02/day03/ex06/audit/readme.md @@ -1,4 +1,4 @@ -1. This question is validated if the prediction on the test set are: +##### The question 1 is validated if the prediction on the test set are: ```console array([0, 0, 2, 1, 2, 0, 2, 1, 1, 1, 0, 1, 2, 0, 1, 1, 0, 0, 2, 2, 0, 0, diff --git a/one_exercise_per_file/week02/day04/ex01/audit/readme.md b/one_exercise_per_file/week02/day04/ex01/audit/readme.md index 4b56476..0ab47a3 100644 --- a/one_exercise_per_file/week02/day04/ex01/audit/readme.md +++ b/one_exercise_per_file/week02/day04/ex01/audit/readme.md @@ -1 +1 @@ -1. This question is validated if the MSE outputted is **2.25**. \ No newline at end of file +##### The question 1 is validated if the MSE outputted is **2.25**. 
\ No newline at end of file diff --git a/one_exercise_per_file/week02/day04/ex02/audit/readme.md b/one_exercise_per_file/week02/day04/ex02/audit/readme.md index 3ce30e6..465a9f1 100644 --- a/one_exercise_per_file/week02/day04/ex02/audit/readme.md +++ b/one_exercise_per_file/week02/day04/ex02/audit/readme.md @@ -1 +1 @@ -1. This question is validated if the accuracy outputted is **0.5714285714285714**. +##### The question 1 is validated if the accuracy outputted is **0.5714285714285714**. diff --git a/one_exercise_per_file/week02/day04/ex03/audit/readme.md b/one_exercise_per_file/week02/day04/ex03/audit/readme.md index 10bfac2..5d318d8 100644 --- a/one_exercise_per_file/week02/day04/ex03/audit/readme.md +++ b/one_exercise_per_file/week02/day04/ex03/audit/readme.md @@ -1,28 +1,32 @@ -1. This question is validated if the predictions on the train set and test set are: +##### The exercice is validated is all questions of the exercice are validated - ```console - # 10 first values Train - array([1.54505951, 2.21338527, 2.2636205 , 3.3258957 , 1.51710076, - 1.63209319, 2.9265211 , 0.78080924, 1.21968217, 0.72656239]) - ``` +##### The question 1 is validated if the predictions on the train set and test set are: - ```console - #10 first values Test +```console + #10 first values Train +array([1.54505951, 2.21338527, 2.2636205 , 3.3258957 , 1.51710076, + 1.63209319, 2.9265211 , 0.78080924, 1.21968217, 0.72656239]) + +``` - array([ 1.82212706, 1.98357668, 0.80547979, -0.19259114, 1.76072418, - 3.27855815, 2.12056804, 1.96099917, 2.38239663, 1.21005304]) - ``` +```console +#10 first values Test -2. 
This question is validated if the results match this output: +array([ 1.82212706, 1.98357668, 0.80547979, -0.19259114, 1.76072418, + 3.27855815, 2.12056804, 1.96099917, 2.38239663, 1.21005304]) - ```console - r2 on the train set: 0.3552292936915783 - MAE on the train set: 0.5300159371615256 - MSE on the train set: 0.5210784446797679 +``` - r2 on the test set: 0.30265471284464673 - MAE on the test set: 0.5454023699809112 - MSE on the test set: 0.5537420654727396 - ``` +##### The question 2 is validated if the results match this output: + +```console +r2 on the train set: 0.3552292936915783 +MAE on the train set: 0.5300159371615256 +MSE on the train set: 0.5210784446797679 + +r2 on the test set: 0.30265471284464673 +MAE on the test set: 0.5454023699809112 +MSE on the test set: 0.5537420654727396 +``` This result shows that the model has slightly better results on the train set than the test set. That's frequent since it is easier to get a better grade on an exam we studied than an exam that is different from what was prepared. However, the results are not good: r2 ~ 0.3. Fitting non linear models as the Random Forest on this data may improve the results. That's the goal of the exercise 5. \ No newline at end of file diff --git a/one_exercise_per_file/week02/day04/ex04/audit/readme.md b/one_exercise_per_file/week02/day04/ex04/audit/readme.md index 1e6a20a..b240de8 100644 --- a/one_exercise_per_file/week02/day04/ex04/audit/readme.md +++ b/one_exercise_per_file/week02/day04/ex04/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the predictions on the train set and test set are: +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the predictions on the train set and test set are: ```console # 10 first values Train @@ -8,31 +10,31 @@ array([1, 1, 0, 0, 0, 1, 1, 1, 0, 0]) ``` -2. 
This question is validated if the results match this output: +##### The question 2 is validated if the results match this output: - ```console - F1 on the train set: 0.9911504424778761 - Accuracy on the train set: 0.989010989010989 - Recall on the train set: 0.9929078014184397 - Precision on the train set: 0.9893992932862191 - ROC_AUC on the train set: 0.9990161111794368 - - - F1 on the test set: 0.9801324503311258 - Accuracy on the test set: 0.9736842105263158 - Recall on the test set: 0.9866666666666667 - Precision on the test set: 0.9736842105263158 - ROC_AUC on the test set: 0.9863247863247864 - ``` +```console +F1 on the train set: 0.9911504424778761 +Accuracy on the train set: 0.989010989010989 +Recall on the train set: 0.9929078014184397 +Precision on the train set: 0.9893992932862191 +ROC_AUC on the train set: 0.9990161111794368 - The confusion matrix on the test set should be: - ```console - array([[37, 2], - [ 1, 74]]) - ``` +F1 on the test set: 0.9801324503311258 +Accuracy on the test set: 0.9736842105263158 +Recall on the test set: 0.9866666666666667 +Precision on the test set: 0.9736842105263158 +ROC_AUC on the test set: 0.9863247863247864 +``` + +##### The question 2 is validated if the results match the confusion matrix on the test set should be: + +```console +array([[37, 2], + [ 1, 74]]) +``` -3. The ROC AUC plot should look like: +##### The question 3 is validated if the ROC AUC plot looks like the plot below: ![alt text][logo_ex4] diff --git a/one_exercise_per_file/week02/day04/ex05/audit/readme.md b/one_exercise_per_file/week02/day04/ex05/audit/readme.md index e3bfa1c..3ab5891 100644 --- a/one_exercise_per_file/week02/day04/ex05/audit/readme.md +++ b/one_exercise_per_file/week02/day04/ex05/audit/readme.md @@ -1,72 +1,72 @@ -1. Some of the algorithms use random steps (random sampling used by the `RandomForest`). I used `random_state = 43` for the Random Forest, the Decision Tree and the Gradient Boosting. 
This question is validated of the scores you got are close to: +##### The question is validated of the scores you output are close to the scores below. Some of the algorithms use random steps (random sampling used by the `RandomForest`). I used `random_state = 43` for the Random Forest, the Decision Tree and the Gradient Boosting. - ```console - # Linear regression +```console +# Linear regression - TRAIN - r2 on the train set: 0.34823544284172625 - MAE on the train set: 0.533092001261455 - MSE on the train set: 0.5273648371379568 +TRAIN +r2 on the train set: 0.34823544284172625 +MAE on the train set: 0.533092001261455 +MSE on the train set: 0.5273648371379568 - TEST - r2 on the test set: 0.3551785428138914 - MAE on the test set: 0.5196420310323713 - MSE on the test set: 0.49761195027083804 +TEST +r2 on the test set: 0.3551785428138914 +MAE on the test set: 0.5196420310323713 +MSE on the test set: 0.49761195027083804 - # SVM +# SVM - TRAIN - r2 on the train set: 0.6462366150965996 - MAE on the train set: 0.38356451633259875 - MSE on the train set: 0.33464478671339165 +TRAIN +r2 on the train set: 0.6462366150965996 +MAE on the train set: 0.38356451633259875 +MSE on the train set: 0.33464478671339165 - TEST - r2 on the test set: 0.6162644671183826 - MAE on the test set: 0.3897680598426786 - MSE on the test set: 0.3477101776543003 +TEST +r2 on the test set: 0.6162644671183826 +MAE on the test set: 0.3897680598426786 +MSE on the test set: 0.3477101776543003 - # Decision Tree +# Decision Tree - TRAIN - r2 on the train set: 0.9999999999999488 - MAE on the train set: 1.3685733933909677e-08 - MSE on the train set: 6.842866883530944e-14 +TRAIN +r2 on the train set: 0.9999999999999488 +MAE on the train set: 1.3685733933909677e-08 +MSE on the train set: 6.842866883530944e-14 - TEST - r2 on the test set: 0.6263651902480918 - MAE on the test set: 0.4383758696244002 - MSE on the test set: 0.4727017198871596 +TEST +r2 on the test set: 0.6263651902480918 +MAE on the test set: 
0.4383758696244002 +MSE on the test set: 0.4727017198871596 - # Random Forest +# Random Forest - TRAIN - r2 on the train set: 0.9705418471542886 - MAE on the train set: 0.11983836612191189 - MSE on the train set: 0.034538356420577995 +TRAIN +r2 on the train set: 0.9705418471542886 +MAE on the train set: 0.11983836612191189 +MSE on the train set: 0.034538356420577995 - TEST - r2 on the test set: 0.7504673649554309 - MAE on the test set: 0.31889891600404635 - MSE on the test set: 0.24096164834441108 +TEST +r2 on the test set: 0.7504673649554309 +MAE on the test set: 0.31889891600404635 +MSE on the test set: 0.24096164834441108 - # Gradient Boosting +# Gradient Boosting - TRAIN - r2 on the train set: 0.7395782392433273 - MAE on the train set: 0.35656543036682264 - MSE on the train set: 0.26167490389525294 +TRAIN +r2 on the train set: 0.7395782392433273 +MAE on the train set: 0.35656543036682264 +MSE on the train set: 0.26167490389525294 - TEST - r2 on the test set: 0.7157456298013534 - MAE on the test set: 0.36455447680396397 - MSE on the test set: 0.27058170064218096 +TEST +r2 on the test set: 0.7157456298013534 +MAE on the test set: 0.36455447680396397 +MSE on the test set: 0.27058170064218096 - ``` +``` -It is important to notice that the Decision Tree over fits very easily. It learns easily the training data but is not able to extrapolate on the test set. This algorithm is not used a lot. +It is important to notice that the Decision Tree overfits very easily. It learns easily the training data but is not able to extrapolate on the test set. This algorithm is not used a lot its overfitting ability. However, Random Forest and Gradient Boosting propose a solid approach to correct the over fitting (in that case the parameters `max_depth` is set to None that is why the Random Forest over fits the data). These two algorithms are used intensively in Machine Learning Projects. 
diff --git a/one_exercise_per_file/week02/day04/ex06/audit/readme.md b/one_exercise_per_file/week02/day04/ex06/audit/readme.md index 2244157..343031e 100644 --- a/one_exercise_per_file/week02/day04/ex06/audit/readme.md +++ b/one_exercise_per_file/week02/day04/ex06/audit/readme.md @@ -1,4 +1,6 @@ -1. This question is validated if the code that runs the `gridsearch` is (the parameters may change): +##### The exercice is validated is all questions of the exercice are validated + +##### The question 1 is validated if the code that runs the `gridsearch` is (the parameters may change): ```python parameters = {'n_estimators':[10, 50, 75], @@ -13,7 +15,7 @@ gridsearch = GridSearchCV(rf, gridsearch.fit(X, y) ``` -2. This question is validated if the function is: +##### The question 2 is validated if the function is: ```python def select_model_verbose(gs): @@ -21,9 +23,9 @@ def select_model_verbose(gs): return gs.best_estimator_, gs.best_params_, gs.best_score_ ``` -In my case, the `gridsearch` parameters are not interesting. Even if I reduced the over fitting of the Random Forest, the score on the test is lower than the score on the test returned by the Gradient Boosting in the previous exercise without optimal parameters search. +In my case, the `gridsearch` parameters are not interesting. Even if I reduced the over-fitting of the Random Forest, the score on the test is lower than the score on the test returned by the Gradient Boosting in the previous exercise without optimal parameters search. -3. This question is validated if the code used is: +##### The question 3 is validated if the code used is: ```python model, best_params, best_score = select_model_verbose(gridsearch) diff --git a/one_exercise_per_file/week02/day05/ex01/audit/readme.md b/one_exercise_per_file/week02/day05/ex01/audit/readme.md index 3b945de..598c5d7 100644 --- a/one_exercise_per_file/week02/day05/ex01/audit/readme.md +++ b/one_exercise_per_file/week02/day05/ex01/audit/readme.md @@ -1,4 +1,4 @@ -1. 
This question is validated if the output of the 5-fold cross validation is:
+##### The question 1 is validated if the output of the 5-fold cross validation is:

 ```console
 Fold: 1
diff --git a/one_exercise_per_file/week02/day05/ex02/audit/readme.md b/one_exercise_per_file/week02/day05/ex02/audit/readme.md
index d1ca44d..f05ed66 100644
--- a/one_exercise_per_file/week02/day05/ex02/audit/readme.md
+++ b/one_exercise_per_file/week02/day05/ex02/audit/readme.md
@@ -1,4 +1,4 @@
-1. This question is validated if the output is:
+##### The question 1 is validated if the output is:

 ```console
 Scores on validation sets:
@@ -13,4 +13,4 @@ Standard deviation of scores on validation sets:

 ```

-The model is consistent across folds: it is stable. That's a first sign that the model is not over fitted. The average R2 is 60% that's a good start ! To be improved.
+The model is consistent across folds: it is stable. That's a first sign that the model is not over-fitted. The average R2 is 60%, which is a good start! To be improved...
diff --git a/one_exercise_per_file/week02/day05/ex03/audit/readme.md b/one_exercise_per_file/week02/day05/ex03/audit/readme.md
index ff20dc2..573ff44 100644
--- a/one_exercise_per_file/week02/day05/ex03/audit/readme.md
+++ b/one_exercise_per_file/week02/day05/ex03/audit/readme.md
@@ -1,4 +1,6 @@
-1. This question is validated if the code that runs the grid search is similar to:
+##### The exercise is validated if all questions of the exercise are validated
+
+##### The question 1 is validated if the code that runs the grid search is similar to:

 ```python
 parameters = {'n_estimators':[10, 50, 75],
@@ -16,7 +18,7 @@ gridsearch.fit(X_train, y_train)

 Answers that use another list of parameters are accepted too!

-2. 
This question is validated if you called this attributes:
+##### The question 2 is validated if you called these attributes:

 ```python
 print(gridsearch.best_score_)
@@ -30,4 +32,4 @@ The best models params are `{'max_depth': 10, 'n_estimators': 75}`.

 As you may have a different parameter list than this one, you may have different results.

-3. This question is validated if you used the fitted estimator to compute the score on the test set: `gridsearch.score(X_test, y_test)`. The MSE score is ~0.27. The score I got on the test set is close to the score I got on the validation sets. It means the models is not over fitted.
\ No newline at end of file
+##### The question 3 is validated if you used the fitted estimator to compute the score on the test set: `gridsearch.score(X_test, y_test)`. The MSE score is ~0.27. The score I got on the test set is close to the score I got on the validation sets. It means the model is not overfitted.
\ No newline at end of file
diff --git a/one_exercise_per_file/week02/day05/ex04/audit/readme.md b/one_exercise_per_file/week02/day05/ex04/audit/readme.md
index b7dd013..0faa777 100644
--- a/one_exercise_per_file/week02/day05/ex04/audit/readme.md
+++ b/one_exercise_per_file/week02/day05/ex04/audit/readme.md
@@ -1,4 +1,6 @@
-1. This question is validated if the outputted plot looks like:
+##### The exercise is validated if all questions of the exercise are validated
+
+##### The question 1 is validated if the outputted plot looks like:

 ![alt text][logo_ex5q1]

@@ -20,7 +22,7 @@ train_scores, test_scores = validation_curve(clf,
 n_jobs=-1)
 ```

-2. 
This question is validated if the output is
+##### The question 2 is validated if the output plots look like:

 ![alt text][logo_ex5q2]

diff --git a/one_exercise_per_file/week03/day01/ex01/audit/readme.md b/one_exercise_per_file/week03/day01/ex01/audit/readme.md
index 1af0ac9..a6220ad 100644
--- a/one_exercise_per_file/week03/day01/ex01/audit/readme.md
+++ b/one_exercise_per_file/week03/day01/ex01/audit/readme.md
@@ -1,4 +1,4 @@
-1. This question is validated if this code:
+##### The question 1 is validated if this code:

 ```
 neuron = Neuron(0,1,4)
diff --git a/one_exercise_per_file/week03/day01/ex02/audit/readme.md b/one_exercise_per_file/week03/day01/ex02/audit/readme.md
index 317675b..377c15a 100644
--- a/one_exercise_per_file/week03/day01/ex02/audit/readme.md
+++ b/one_exercise_per_file/week03/day01/ex02/audit/readme.md
@@ -1 +1 @@
-1. This question is validated the output is: **0.9524917424084265**
\ No newline at end of file
+##### The question 1 is validated if the output is: **0.9524917424084265**
\ No newline at end of file
diff --git a/one_exercise_per_file/week03/day01/ex03/audit/readme.md b/one_exercise_per_file/week03/day01/ex03/audit/readme.md
index e91e245..0c5c16d 100644
--- a/one_exercise_per_file/week03/day01/ex03/audit/readme.md
+++ b/one_exercise_per_file/week03/day01/ex03/audit/readme.md
@@ -1,2 +1,2 @@
-1. This question is validated if the output is: **0.5472899351247816**.
+##### The question 1 is validated if the output is: **0.5472899351247816**.

diff --git a/one_exercise_per_file/week03/day01/ex04/audit/readme.md b/one_exercise_per_file/week03/day01/ex04/audit/readme.md
index a59a518..7d3df8c 100644
--- a/one_exercise_per_file/week03/day01/ex04/audit/readme.md
+++ b/one_exercise_per_file/week03/day01/ex04/audit/readme.md
@@ -1,8 +1,10 @@
-1. This question is validated if the output is:
- ```
- Bob: 0.7855253278357536
- Eli: 0.7771516558846259
- Tom: 0.8067873659804015
- Ryan: 0.7892343955586032
- ```
-2. 
This question is validated if the logloss for the 4 students is **0.5485133607757963**.
+##### The exercise is validated if all questions of the exercise are validated
+
+##### The question 1 is validated if the output is:
+```
+Bob: 0.7855253278357536
+Eli: 0.7771516558846259
+Tom: 0.8067873659804015
+Ryan: 0.7892343955586032
+```
+##### The question 2 is validated if the logloss for the 4 students is **0.5485133607757963**.
diff --git a/one_exercise_per_file/week03/day01/ex05/audit/readme.md b/one_exercise_per_file/week03/day01/ex05/audit/readme.md
index 6c06294..5623301 100644
--- a/one_exercise_per_file/week03/day01/ex05/audit/readme.md
+++ b/one_exercise_per_file/week03/day01/ex05/audit/readme.md
@@ -1,12 +1,14 @@
-1. This question is validated if the output is **7**.
+##### The exercise is validated if all questions of the exercise are validated

-2. This question is validated if the outputs are:
+##### The question 1 is validated if the output is **7**.

- ```
- Bob: 14.918863163724454
- Eli: 14.83137890625537
- Tom: 15.086662606964074
- Ryan: 14.939270885974128
- ```
+##### The question 2 is validated if the outputs are:

-3. This question is validated if the MSE is **10.237608699909138**
\ No newline at end of file
+```
+Bob: 14.918863163724454
+Eli: 14.83137890625537
+Tom: 15.086662606964074
+Ryan: 14.939270885974128
+```
+
+##### The question 3 is validated if the MSE is **10.237608699909138**
\ No newline at end of file
diff --git a/one_exercise_per_file/week03/day02/ex01/audit/readme.md b/one_exercise_per_file/week03/day02/ex01/audit/readme.md
index 427dc0a..61ab1ec 100644
--- a/one_exercise_per_file/week03/day02/ex01/audit/readme.md
+++ b/one_exercise_per_file/week03/day02/ex01/audit/readme.md
@@ -1 +1 @@
-1. 
This question is validated if the output is: ` sentence 2 : 0.7073220863266589
- sentence_1 <=> sentence 3: 0.42663743263528325
- sentence_2 <=> sentence 3: 0.3336274235605957
+```
+sentence_1 <=> sentence 2 : 0.7073220863266589
+sentence_1 <=> sentence 3: 0.42663743263528325
+sentence_2 <=> sentence 3: 0.3336274235605957
- ```
+```
diff --git a/one_exercise_per_file/week03/day05/ex05/audit/readme.md b/one_exercise_per_file/week03/day05/ex05/audit/readme.md
index d91d833..7d65866 100644
--- a/one_exercise_per_file/week03/day05/ex05/audit/readme.md
+++ b/one_exercise_per_file/week03/day05/ex05/audit/readme.md
@@ -1,4 +1,6 @@
-1. This question is validated if the ouptut of the NER is
+##### The exercise is validated if all questions of the exercise are validated
+
+##### The question 1 is validated if the output of the NER is:

 ```
 Apple Inc. ORG
@@ -25,7 +27,7 @@ Apple ORG
 Apple II ORG
 ```

-2. This question is validated if the output shows that the first occurence of apple is not a named entity. In my case here is what the NER returns:
+##### The question 2 is validated if the output shows that the first occurrence of apple is not a named entity. In my case, here is what the NER returns:

 ```
 Paul 1 5 PERSON
diff --git a/one_exercise_per_file/week03/day05/ex06/audit/readme.md b/one_exercise_per_file/week03/day05/ex06/audit/readme.md
index 0543abe..6b5746b 100644
--- a/one_exercise_per_file/week03/day05/ex06/audit/readme.md
+++ b/one_exercise_per_file/week03/day05/ex06/audit/readme.md
@@ -1,18 +1,18 @@
-1. This question is validated if the sentences outputed are:
+##### The question 1 is validated if the sentences output are:
- ```
- INFO: Bezos PROPN NNP
- Sentence: Amazon (AMZN) enters 2021 with plenty of big opportunities, but is losing its lauded Chief Executive Jeff Bezos, who announced his plan to step aside in the third quarter.
+```
+INFO: Bezos PROPN NNP
+Sentence: Amazon (AMZN) enters 2021 with plenty of big opportunities, but is losing its lauded Chief Executive Jeff Bezos, who announced his plan to step aside in the third quarter.
- INFO: Bezos PROPN NNP
- Sentence: Bezos will hand off his role as chief executive to Andy Jassy, the CEO of its cloud computing unit.
+INFO: Bezos PROPN NNP
+Sentence: Bezos will hand off his role as chief executive to Andy Jassy, the CEO of its cloud computing unit.
- INFO: Bezos PROPN NNP
- Sentence: He's not leaving, as Bezos will transition to the role of Executive Chairman and remain active.
+INFO: Bezos PROPN NNP
+Sentence: He's not leaving, as Bezos will transition to the role of Executive Chairman and remain active.
- INFO: Bezos PROPN NNP
- Sentence: "When you look at our financial results, what you're actually seeing are the long-run cumulative results of invention," Bezos said in written remarks with the Amazon earnings release.
- ```
\ No newline at end of file
+INFO: Bezos PROPN NNP
+Sentence: "When you look at our financial results, what you're actually seeing are the long-run cumulative results of invention," Bezos said in written remarks with the Amazon earnings release.
+```
\ No newline at end of file