
fix: formatting of week 2 and 3

pull/42/head
Badr Ghazlane 2 years ago
parent
commit
ed08455209
  1. 8
      one_exercise_per_file/week02/day01/ex01/audit/readme.md
  2. 14
      one_exercise_per_file/week02/day01/ex02/audit/readme.md
  3. 2
      one_exercise_per_file/week02/day01/ex03/audit/readme.md
  4. 10
      one_exercise_per_file/week02/day01/ex04/audit/readme.md
  5. 20
      one_exercise_per_file/week02/day01/ex05/audit/readme.md
  6. 10
      one_exercise_per_file/week02/day02/ex01/audit/readme.md
  7. 2
      one_exercise_per_file/week02/day02/ex02/audit/readme.md
  8. 18
      one_exercise_per_file/week02/day02/ex03/audit/readme.md
  9. 7
      one_exercise_per_file/week02/day02/ex04/audit/readme.md
  10. 11
      one_exercise_per_file/week02/day02/ex05/audit/readme.md
  11. 6
      one_exercise_per_file/week02/day02/ex06/audit/readme.md
  12. 8
      one_exercise_per_file/week02/day03/ex01/audit/readme.md
  13. 11
      one_exercise_per_file/week02/day03/ex02/audit/readme.md
  14. 6
      one_exercise_per_file/week02/day03/ex03/audit/readme.md
  15. 6
      one_exercise_per_file/week02/day03/ex04/audit/readme.md
  16. 10
      one_exercise_per_file/week02/day03/ex05/audit/readme.md
  17. 2
      one_exercise_per_file/week02/day03/ex06/audit/readme.md
  18. 2
      one_exercise_per_file/week02/day04/ex01/audit/readme.md
  19. 2
      one_exercise_per_file/week02/day04/ex02/audit/readme.md
  20. 44
      one_exercise_per_file/week02/day04/ex03/audit/readme.md
  21. 46
      one_exercise_per_file/week02/day04/ex04/audit/readme.md
  22. 98
      one_exercise_per_file/week02/day04/ex05/audit/readme.md
  23. 10
      one_exercise_per_file/week02/day04/ex06/audit/readme.md
  24. 2
      one_exercise_per_file/week02/day05/ex01/audit/readme.md
  25. 4
      one_exercise_per_file/week02/day05/ex02/audit/readme.md
  26. 8
      one_exercise_per_file/week02/day05/ex03/audit/readme.md
  27. 6
      one_exercise_per_file/week02/day05/ex04/audit/readme.md
  28. 2
      one_exercise_per_file/week03/day01/ex01/audit/readme.md
  29. 2
      one_exercise_per_file/week03/day01/ex02/audit/readme.md
  30. 2
      one_exercise_per_file/week03/day01/ex03/audit/readme.md
  31. 18
      one_exercise_per_file/week03/day01/ex04/audit/readme.md
  32. 20
      one_exercise_per_file/week03/day01/ex05/audit/readme.md
  33. 2
      one_exercise_per_file/week03/day02/ex01/audit/readme.md
  34. 106
      one_exercise_per_file/week03/day02/ex02/audit/readme.md
  35. 14
      one_exercise_per_file/week03/day02/ex03/audit/readme.md
  36. 4
      one_exercise_per_file/week03/day02/ex04/audit/readme.md
  37. 2
      one_exercise_per_file/week03/day03/ex01/audit/readme.md
  38. 6
      one_exercise_per_file/week03/day03/ex02/audit/readme.md
  39. 15
      one_exercise_per_file/week03/day03/ex03/audit/readme.md
  40. 2
      one_exercise_per_file/week03/day03/ex04/audit/readme.md
  41. 6
      one_exercise_per_file/week03/day03/ex05/audit/readme.md
  42. 3
      one_exercise_per_file/week03/day05/ex01/audit/readme.md
  43. 2
      one_exercise_per_file/week03/day05/ex02/audit/readme.md
  44. 18
      one_exercise_per_file/week03/day05/ex03/audit/readme.md
  45. 12
      one_exercise_per_file/week03/day05/ex04/audit/readme.md
  46. 6
      one_exercise_per_file/week03/day05/ex05/audit/readme.md
  47. 22
      one_exercise_per_file/week03/day05/ex06/audit/readme.md

8
one_exercise_per_file/week02/day01/ex01/audit/readme.md

@ -1,17 +1,19 @@
1. This question is validated if the output of the fitted model is:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the output of the fitted model is:
```python
LinearRegression(copy_X=True, fit_intercept=[[1], [2.1], [3]], n_jobs=None,
normalize=[[1], [2], [3]])
```
2. This question is validated if the output is:
##### The question 2 is validated if the output is:
```python
array([[3.96013289]])
```
3. This question is validated if the output is:
##### The question 3 is validated if the output is:
```output
Coefficients: [[0.99667774]]

14
one_exercise_per_file/week02/day01/ex02/audit/readme.md

@ -1,18 +1,20 @@
1. This question is validated if the plot looks like:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the plot looks like:
![alt text][q1]
[q1]: ../w2_day1_ex2_q1.png "Scatter plot"
2. This question is validated if the equation of the fitted line is: `y = 42.619430291366946 * x + 99.18581817296929`
##### The question 2 is validated if the equation of the fitted line is: `y = 42.619430291366946 * x + 99.18581817296929`
3. This question is validated if the plot looks like:
##### The question 3 is validated if the plot looks like:
![alt text][q3]
[q3]: ../w2_day1_ex2_q3.png "Scatter plot + fitted line"
4. This question is validated if the outputted predictions for the first 10 values are:
##### The question 4 is validated if the outputted predictions for the first 10 values are:
```python
array([ 83.86186727, 140.80961751, 116.3333897 , 64.52998689,
@ -20,6 +22,6 @@ array([ 83.86186727, 140.80961751, 116.3333897 , 64.52998689,
108.06237908, 85.90762675])
```
5. This question is validated if the MSE returned is `114.17148616819485`
##### The question 5 is validated if the MSE returned is `114.17148616819485`
6. This question is validated if the MSE returned is `2854.2871542048706`
##### The question 6 is validated if the MSE returned is `2854.2871542048706`
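
As a rough illustration of the MSE checks in questions 5 and 6, the sketch below applies the fitted line from question 2 to a few points and computes the mean squared error by hand. The `x` and `y` values and the `mse` helper are hypothetical and chosen only for illustration; they are not the exercise data, so the printed value will not match the values above.

```python
# Slope and intercept of the fitted line reported in question 2.
a = 42.619430291366946
b = 99.18581817296929

# Hypothetical sample points (NOT the exercise data).
x = [-1.0, 0.0, 0.5, 2.0]
y = [60.0, 95.0, 125.0, 180.0]


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Predictions of the fitted line y = a * x + b.
predictions = [a * xi + b for xi in x]
print(mse(y, predictions))
```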

2
one_exercise_per_file/week02/day01/ex03/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if X_train, y_train, X_test, y_test match this output:
##### The question 1 is validated if X_train, y_train, X_test, y_test match this output:
```console
X_train:

10
one_exercise_per_file/week02/day01/ex04/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the output of `y_train.values[:10]` and `y_test.values[:10]` are:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the output of `y_train.values[:10]` and `y_test.values[:10]` are:
```console
y_train.values[:10]:
@ -26,7 +28,7 @@
[192.]]
```
2. This question is validated if the coefficients and the intercept are:
##### The question 2 is validated if the coefficients and the intercept are:
```console
[('age', -60.40163046086952),
@ -42,7 +44,7 @@
('intercept', 152.05314895029233)]
```
3. This question is validated if the output of `predictions_on_test[:10]` is:
##### The question 3 is validated if the output of `predictions_on_test[:10]` is:
```console
array([[111.74351759],
@ -57,4 +59,4 @@
[224.83346984]])
```
4. This question is validated if the mse on the **train set** is `2888.326888` and the mse on the **test set** is `2858.255153`.
##### The question 4 is validated if the mse on the **train set** is `2888.326888` and the mse on the **test set** is `2858.255153`.

20
one_exercise_per_file/week02/day01/ex05/audit/readme.md

@ -1,14 +1,16 @@
1. This question is validated if the outputted plot looks like:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the outputted plot looks like:
![alt text][ex5q1]
[ex5q1]: ../w2_day1_ex5_q1.png "Scatter plot "
2. This question is validated if the output is: `11808.867339751561`
##### The question 2 is validated if the output is: `11808.867339751561`
3. This question is validated if `grid.shape` is `(640000,2)`.
##### The question 3 is validated if `grid.shape` is `(640000,2)`.
4. This question is validated if the 10 first values of losses are:
##### The question 4 is validated if the 10 first values of losses are:
```console
array([158315.41493175, 158001.96852692, 157689.02212209, 157376.57571726,
@ -16,29 +18,29 @@ array([158315.41493175, 158001.96852692, 157689.02212209, 157376.57571726,
155821.84369312, 155512.39728829])
```
5. This question is validated if the outputted plot looks like
##### The question 5 is validated if the outputted plot looks like
![alt text][ex5q5]
[ex5q5]: ../w2_day1_ex5_q5.png "MSE"
6. This question is validated if the point returned is
##### The question 6 is validated if the point returned is
`array([42.5, 99. ])`. It means that `a= 42.5` and `b=99`.
7. This question is validated if the coefficients returned are
##### The question 7 is validated if the coefficients returned are
```console
Coefficients (a): 42.61943031121358
Intercept (b): 99.18581814447936
```
8. This question is validated if the outputted plot is
##### The question 8 is validated if the outputted plot is
![alt text][ex5q8]
[ex5q8]: ../w2_day1_ex5_q8.png "MSE + Gradient descent"
9. This question is validated if the coefficients and intercept returned are:
##### The question 9 is validated if the coefficients and intercept returned are:
```console
Coefficients: [42.61943029]

10
one_exercise_per_file/week02/day02/ex01/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the fitted logistic regression returns:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the fitted logistic regression returns:
```python
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
@ -8,11 +10,11 @@ LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
warm_start=False)
```
2. This question is validated if the predicted class is `0`.
##### The question 2 is validated if the predicted class is `0`.
3. This question is validated if the predicted probabilities are `[0.61450526 0.38549474]`
##### The question 3 is validated if the predicted probabilities are `[0.61450526 0.38549474]`
4. This question is validated if the output is:
##### The question 4 is validated if the output is:
```console
Coefficient:

2
one_exercise_per_file/week02/day02/ex02/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if the plot looks like this:
##### The question 1 is validated if the plot looks like this:
![alt text][ex2q1]

18
one_exercise_per_file/week02/day02/ex03/audit/readme.md

@ -1,35 +1,35 @@
1. This question is validated if the outputted plot looks like this:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the outputted plot looks like this:
![alt text][ex3q1]
[ex3q1]: ../w2_day2_ex3_q1.png "Scatter plot"
2. This question is validated if the coefficient and the intercept of the Logistic Regression are:
##### The question 2 is validated if the coefficient and the intercept of the Logistic Regression are:
```console
Intercept: [-0.98385574]
Coefficient: [[1.18866075]]
```
3. This question is validated if the plot looks like this:
##### The question 3 is validated if the plot looks like this:
![alt text][ex3q2]
[ex3q2]: ../w2_day2_ex3_q3.png "Scatter plot"
4. This question is validated if `predict_probability` outputs the same probabilities as `predict_proba`. Note that the values have to match one of the class probabilities, not both. To do so, compare your output with: `clf.predict_proba(X)[:,1]`. The shape of the arrays is not important.
##### The question 4 is validated if `predict_probability` outputs the same probabilities as `predict_proba`. Note that the values have to match one of the class probabilities, not both. To do so, compare your output with: `clf.predict_proba(X)[:,1]`. The shape of the arrays is not important.
5. This question is validated if `predict_class` outputs the same classes as `clf.predict(X)`. The shape of the arrays is not important.
##### The question 5 is validated if `predict_class` outputs the same classes as `clf.predict(X)`. The shape of the arrays is not important.
6. This question is validated if the plot looks like this:
##### The question 6 is validated if the plot looks like the plot below. As mentioned, it is not required to shift the class prediction to make the plot easier to understand.
![alt text][ex3q6]
[ex3q6]: ../w2_day2_ex3_q5.png "Scatter plot + Logistic regression + predictions"
As mentioned, it is not required to shift the class prediction to make the plot easier to understand.
7. This question is validated if the plot looks like this:
##### The question 7 is validated if the plot looks like this:
![alt text][ex3q7]
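
The checks in questions 4 and 5 can be sketched as below: a minimal example, assuming `predict_probability` and `predict_class` take the intercept and coefficient as a pair plus the data. The exact signatures expected by the exercise are not shown here, so these are assumptions, and the toy data is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical one-feature toy data (NOT the exercise data).
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 1))
y = (X[:, 0] + rng.normal(scale=0.5, size=100) > 0).astype(int)

clf = LogisticRegression().fit(X, y)


def predict_probability(coefs, X):
    """Probability of class 1: sigmoid of intercept + coefficient * x."""
    theta0, theta1 = coefs
    return 1.0 / (1.0 + np.exp(-(theta0 + theta1 * X[:, 0])))


def predict_class(coefs, X):
    """Class 1 when the probability is at least 0.5, otherwise class 0."""
    return (predict_probability(coefs, X) >= 0.5).astype(int)


coefs = (clf.intercept_[0], clf.coef_[0, 0])

# Compare with scikit-learn, as the audit suggests.
same_proba = np.allclose(predict_probability(coefs, X), clf.predict_proba(X)[:, 1])
same_class = np.array_equal(predict_class(coefs, X), clf.predict(X))
print(same_proba, same_class)
```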

7
one_exercise_per_file/week02/day02/ex04/audit/readme.md

@ -1,5 +1,6 @@
##### The exercise is validated if all questions of the exercise are validated
1. This question is validated if X_train, y_train, X_test, y_test match this output:
##### The question 1 is validated if X_train, y_train, X_test, y_test match the output below. The proportion of class `1` is **0.125** in the train set and **1.** in the test set.
```console
X_train:
@ -26,6 +27,6 @@ y_test:
[1. 1.]
```
The proportion of class `1` is **0.125** in the train set and **1.** in the test set.
2. This question is validated if the proportion of class `1` is **0.3** for both sets.
##### The question 2 is validated if the proportion of class `1` is **0.3** for both sets.
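
One common way to get the same class proportion in both sets is the `stratify` argument of `train_test_split`. The sketch below uses hypothetical toy data with 30% of class `1`; whether the exercise expects exactly this call is an assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples, 30% of them in class 1.
X = np.arange(20).reshape(-1, 1)
y = np.array([1] * 6 + [0] * 14)

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=43
)

print(y_train.mean(), y_test.mean())
```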

11
one_exercise_per_file/week02/day02/ex05/audit/readme.md

@ -1,11 +1,13 @@
1. This question is validated if the proportion of class `Benign` is 0.6552217453505007. It means that if you always predict `Benign` your accuracy would be 66%.
##### The exercise is validated if all questions of the exercise are validated
2. This question is validated if the proportion of one of the classes is approximately the same on the train and test set: ~0.65. In my case:
##### The question 1 is validated if the proportion of class `Benign` is 0.6552217453505007. It means that if you always predict `Benign` your accuracy would be 66%.
##### The question 2 is validated if the proportion of one of the classes is approximately the same on the train and test set: ~0.65. In my case:
- test: 0.6571428571428571
- train: 0.6547406082289803
3. This question is validated if the output is:
##### The question 3 is validated if the output is:
```console
# Train
@ -34,12 +36,11 @@ Score on test set:
```
Only the 10 first predictions are outputted. The score is computed on all the data in the folds.
For some reason, you may have a different data split than mine. The requirement for this question is to have a score on the test set greater than 92%.
If the score is 1, congratulations, you've leaked your first target. Drop the target from X_train or X_test ;) !
4. This question is validated if the confusion matrix on the train set is similar to:
##### The question 4 is validated if the confusion matrix on the train set is similar to:
```console
array([[357, 9],

6
one_exercise_per_file/week02/day02/ex06/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if each classifier takes binary data as input, as below:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if each classifier takes binary data as input, as below:
```python
def train(X_train, y_train):
@ -13,7 +15,7 @@ def train(X_train, y_train):
return clf, clf1, clf2
```
2. This question is validated if the predicted classes on the test set are:
##### The question 2 is validated if the predicted classes on the test set are:
```console
array([0, 0, 2, 1, 2, 0, 2, 1, 1, 1, 0, 1, 2, 0, 1, 1, 0, 0, 2, 2, 0, 0,

8
one_exercise_per_file/week02/day03/ex01/audit/readme.md

@ -1,10 +1,12 @@
1. This question is validated if the `imp_mean.statistics_` returns:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the `imp_mean.statistics_` returns:
```console
array([ 4., 13., 6.])
```
2. This question is validated if the filled train set is:
##### The question 2 is validated if the filled train set is:
```console
array([[ 7., 6., 5.],
@ -12,7 +14,7 @@
[ 1., 20., 8.]])
```
3. This question is validated if the filled test set is:
##### The question 3 is validated if the filled test set is:
```console
array([[ 4., 1., 2.],

11
one_exercise_per_file/week02/day03/ex02/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the scaled train set is:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the scaled train set is as below. And by definition, the mean on the axis 0 should be `array([0., 0., 0.])` and the standard deviation on the axis 0 should be `array([1., 1., 1.])`.
```console
array([[ 0. , -1.22474487, 1.33630621],
@ -6,12 +8,7 @@ array([[ 0. , -1.22474487, 1.33630621],
[-1.22474487, 1.22474487, -1.06904497]])
```
- The mean on axis 0 should return:
- array([0., 0., 0.])
- The std on axis 0 should return:
- array([1., 1., 1.])
2. This question is validated if the scaled test set is:
##### The question 2 is validated if the scaled test set is:
```console
array([[ 1.22474487, -1.22474487, 0.53452248],

6
one_exercise_per_file/week02/day03/ex03/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the output is
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the output is
| | ('C++',) | ('Java',) | ('Python',) |
|---:|-----------:|------------:|--------------:|
@ -7,7 +9,7 @@
| 2 | 0 | 1 | 0 |
| 3 | 1 | 0 | 0 |
2. This question is validated if the output is:
##### The question 2 is validated if the output is:
| | ('C++',) | ('Java',) | ('Python',) |
|---:|-----------:|------------:|--------------:|

6
one_exercise_per_file/week02/day03/ex04/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the output of the Ordinal Encoder on the train set is:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the output of the Ordinal Encoder on the train set is:
```console
array([[2.],
@ -8,7 +10,7 @@ array([[2.],
Check that `enc.categories_` returns `[array(['bad', 'neutral', 'good'], dtype=object)]`.
2. This question is validated if the output of the Ordinal Encoder on the test set is:
##### The question 2 is validated if the output of the Ordinal Encoder on the test set is:
```console
array([[2.],

10
one_exercise_per_file/week02/day03/ex05/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the number of unique values per feature outputted are:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the number of unique values per feature outputted are:
```console
age 3
@ -13,7 +15,7 @@ irradiat 2
dtype: int64
```
2. This question is validated if the transformed test set by the `OneHotEncoder` fitted on the train set is:
##### The question 2 is validated if the transformed test set by the `OneHotEncoder` fitted on the train set is:
```console
First 10 rows:
@ -30,7 +32,7 @@ dtype: int64
[1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0., 0., 1.]])
```
3. This question is validated if the transformed test set by the `OrdinalEncoder` fitted on the train set is:
##### The question 3 is validated if the transformed test set by the `OrdinalEncoder` fitted on the train set is:
```console
First 10 rows:
@ -47,7 +49,7 @@ dtype: int64
[1., 3., 0., 0.]])
```
4. This question is validated if the column transformer, fitted on X_train, transforms X_test as:
##### The question 4 is validated if the column transformer, fitted on X_train, transforms X_test as:
```console
# First 2 rows:

2
one_exercise_per_file/week02/day03/ex06/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if the predictions on the test set are:
##### The question 1 is validated if the predictions on the test set are:
```console
array([0, 0, 2, 1, 2, 0, 2, 1, 1, 1, 0, 1, 2, 0, 1, 1, 0, 0, 2, 2, 0, 0,

2
one_exercise_per_file/week02/day04/ex01/audit/readme.md

@ -1 +1 @@
1. This question is validated if the MSE outputted is **2.25**.
##### The question 1 is validated if the MSE outputted is **2.25**.

2
one_exercise_per_file/week02/day04/ex02/audit/readme.md

@ -1 +1 @@
1. This question is validated if the accuracy outputted is **0.5714285714285714**.
##### The question 1 is validated if the accuracy outputted is **0.5714285714285714**.

44
one_exercise_per_file/week02/day04/ex03/audit/readme.md

@ -1,28 +1,32 @@
1. This question is validated if the predictions on the train set and test set are:
##### The exercise is validated if all questions of the exercise are validated
```console
# 10 first values Train
array([1.54505951, 2.21338527, 2.2636205 , 3.3258957 , 1.51710076,
1.63209319, 2.9265211 , 0.78080924, 1.21968217, 0.72656239])
```
##### The question 1 is validated if the predictions on the train set and test set are:
```console
#10 first values Test
```console
#10 first values Train
array([1.54505951, 2.21338527, 2.2636205 , 3.3258957 , 1.51710076,
1.63209319, 2.9265211 , 0.78080924, 1.21968217, 0.72656239])
```
array([ 1.82212706, 1.98357668, 0.80547979, -0.19259114, 1.76072418,
3.27855815, 2.12056804, 1.96099917, 2.38239663, 1.21005304])
```
```console
#10 first values Test
2. This question is validated if the results match this output:
array([ 1.82212706, 1.98357668, 0.80547979, -0.19259114, 1.76072418,
3.27855815, 2.12056804, 1.96099917, 2.38239663, 1.21005304])
```console
r2 on the train set: 0.3552292936915783
MAE on the train set: 0.5300159371615256
MSE on the train set: 0.5210784446797679
```
r2 on the test set: 0.30265471284464673
MAE on the test set: 0.5454023699809112
MSE on the test set: 0.5537420654727396
```
##### The question 2 is validated if the results match this output:
```console
r2 on the train set: 0.3552292936915783
MAE on the train set: 0.5300159371615256
MSE on the train set: 0.5210784446797679
r2 on the test set: 0.30265471284464673
MAE on the test set: 0.5454023699809112
MSE on the test set: 0.5537420654727396
```
This result shows that the model has slightly better results on the train set than on the test set. That's frequent, since it is easier to get a better grade on an exam we studied for than on an exam that is different from what was prepared. However, the results are not good: r2 ~ 0.3. Fitting nonlinear models such as the Random Forest on this data may improve the results. That's the goal of exercise 5.
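
For reference, the three metrics compared above can be computed with `sklearn.metrics`. This is a minimal sketch on hypothetical targets and predictions, not the exercise data, so the numbers differ from the expected output.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical targets and predictions (NOT the exercise data).
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

r2 = r2_score(y_true, y_pred)              # 1 - SS_res / SS_tot
mae = mean_absolute_error(y_true, y_pred)  # mean of |error|
mse = mean_squared_error(y_true, y_pred)   # mean of error ** 2

print("r2:", r2)
print("MAE:", mae)
print("MSE:", mse)
```

In the exercise, the same three calls would be made once with the train targets and predictions and once with the test ones.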

46
one_exercise_per_file/week02/day04/ex04/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the predictions on the train set and test set are:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the predictions on the train set and test set are:
```console
# 10 first values Train
@ -8,31 +10,31 @@
array([1, 1, 0, 0, 0, 1, 1, 1, 0, 0])
```
2. This question is validated if the results match this output:
##### The question 2 is validated if the results match this output:
```console
F1 on the train set: 0.9911504424778761
Accuracy on the train set: 0.989010989010989
Recall on the train set: 0.9929078014184397
Precision on the train set: 0.9893992932862191
ROC_AUC on the train set: 0.9990161111794368
F1 on the test set: 0.9801324503311258
Accuracy on the test set: 0.9736842105263158
Recall on the test set: 0.9866666666666667
Precision on the test set: 0.9736842105263158
ROC_AUC on the test set: 0.9863247863247864
```
```console
F1 on the train set: 0.9911504424778761
Accuracy on the train set: 0.989010989010989
Recall on the train set: 0.9929078014184397
Precision on the train set: 0.9893992932862191
ROC_AUC on the train set: 0.9990161111794368
The confusion matrix on the test set should be:
```console
array([[37, 2],
[ 1, 74]])
```
F1 on the test set: 0.9801324503311258
Accuracy on the test set: 0.9736842105263158
Recall on the test set: 0.9866666666666667
Precision on the test set: 0.9736842105263158
ROC_AUC on the test set: 0.9863247863247864
```
##### The question 2 is validated if the confusion matrix on the test set is:
```console
array([[37, 2],
[ 1, 74]])
```
3. The ROC AUC plot should look like:
##### The question 3 is validated if the ROC AUC plot looks like the plot below:
![alt text][logo_ex4]

98
one_exercise_per_file/week02/day04/ex05/audit/readme.md

@ -1,72 +1,72 @@
1. Some of the algorithms use random steps (random sampling used by the `RandomForest`). I used `random_state = 43` for the Random Forest, the Decision Tree and the Gradient Boosting. This question is validated if the scores you got are close to:
##### The question is validated if the scores you output are close to the scores below. Some of the algorithms use random steps (random sampling used by the `RandomForest`). I used `random_state = 43` for the Random Forest, the Decision Tree and the Gradient Boosting.
```console
# Linear regression
```console
# Linear regression
TRAIN
r2 on the train set: 0.34823544284172625
MAE on the train set: 0.533092001261455
MSE on the train set: 0.5273648371379568
TRAIN
r2 on the train set: 0.34823544284172625
MAE on the train set: 0.533092001261455
MSE on the train set: 0.5273648371379568
TEST
r2 on the test set: 0.3551785428138914
MAE on the test set: 0.5196420310323713
MSE on the test set: 0.49761195027083804
TEST
r2 on the test set: 0.3551785428138914
MAE on the test set: 0.5196420310323713
MSE on the test set: 0.49761195027083804
# SVM
# SVM
TRAIN
r2 on the train set: 0.6462366150965996
MAE on the train set: 0.38356451633259875
MSE on the train set: 0.33464478671339165
TRAIN
r2 on the train set: 0.6462366150965996
MAE on the train set: 0.38356451633259875
MSE on the train set: 0.33464478671339165
TEST
r2 on the test set: 0.6162644671183826
MAE on the test set: 0.3897680598426786
MSE on the test set: 0.3477101776543003
TEST
r2 on the test set: 0.6162644671183826
MAE on the test set: 0.3897680598426786
MSE on the test set: 0.3477101776543003
# Decision Tree
# Decision Tree
TRAIN
r2 on the train set: 0.9999999999999488
MAE on the train set: 1.3685733933909677e-08
MSE on the train set: 6.842866883530944e-14
TRAIN
r2 on the train set: 0.9999999999999488
MAE on the train set: 1.3685733933909677e-08
MSE on the train set: 6.842866883530944e-14
TEST
r2 on the test set: 0.6263651902480918
MAE on the test set: 0.4383758696244002
MSE on the test set: 0.4727017198871596
TEST
r2 on the test set: 0.6263651902480918
MAE on the test set: 0.4383758696244002
MSE on the test set: 0.4727017198871596
# Random Forest
# Random Forest
TRAIN
r2 on the train set: 0.9705418471542886
MAE on the train set: 0.11983836612191189
MSE on the train set: 0.034538356420577995
TRAIN
r2 on the train set: 0.9705418471542886
MAE on the train set: 0.11983836612191189
MSE on the train set: 0.034538356420577995
TEST
r2 on the test set: 0.7504673649554309
MAE on the test set: 0.31889891600404635
MSE on the test set: 0.24096164834441108
TEST
r2 on the test set: 0.7504673649554309
MAE on the test set: 0.31889891600404635
MSE on the test set: 0.24096164834441108
# Gradient Boosting
# Gradient Boosting
TRAIN
r2 on the train set: 0.7395782392433273
MAE on the train set: 0.35656543036682264
MSE on the train set: 0.26167490389525294
TRAIN
r2 on the train set: 0.7395782392433273
MAE on the train set: 0.35656543036682264
MSE on the train set: 0.26167490389525294
TEST
r2 on the test set: 0.7157456298013534
MAE on the test set: 0.36455447680396397
MSE on the test set: 0.27058170064218096
TEST
r2 on the test set: 0.7157456298013534
MAE on the test set: 0.36455447680396397
MSE on the test set: 0.27058170064218096
```
```
It is important to notice that the Decision Tree over fits very easily. It learns easily the training data but is not able to extrapolate on the test set. This algorithm is not used a lot.
It is important to notice that the Decision Tree overfits very easily. It easily learns the training data but is not able to generalize to the test set. This algorithm is not used a lot because of its tendency to overfit.
However, Random Forest and Gradient Boosting offer a solid approach to correct the overfitting (in this case the parameter `max_depth` is set to None, which is why the Random Forest overfits the data). These two algorithms are used intensively in Machine Learning projects.
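
The effect of `max_depth` on overfitting can be seen on a small example. The sketch below is an illustration on hypothetical noisy data, not the exercise data: an unrestricted tree memorises the train set, while a shallow tree cannot.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Hypothetical noisy 1-D regression data (NOT the exercise data).
rng = np.random.RandomState(43)
X = np.linspace(0, 6, 200).reshape(-1, 1)
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=43)

# Compare an unrestricted tree (max_depth=None) with a shallow one.
scores = {}
for depth in (None, 3):
    tree = DecisionTreeRegressor(max_depth=depth, random_state=43)
    tree.fit(X_train, y_train)
    scores[depth] = (
        r2_score(y_train, tree.predict(X_train)),
        r2_score(y_test, tree.predict(X_test)),
    )
    print(f"max_depth={depth}: train r2={scores[depth][0]:.3f}, "
          f"test r2={scores[depth][1]:.3f}")
```

A large gap between the train and test r2 of the unrestricted tree is the same symptom as in the Decision Tree scores above.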

10
one_exercise_per_file/week02/day04/ex06/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the code that runs the `gridsearch` is (the parameters may change):
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the code that runs the `gridsearch` is (the parameters may change):
```python
parameters = {'n_estimators':[10, 50, 75],
@ -13,7 +15,7 @@ gridsearch = GridSearchCV(rf,
gridsearch.fit(X, y)
```
2. This question is validated if the function is:
##### The question 2 is validated if the function is:
```python
def select_model_verbose(gs):
@ -21,9 +23,9 @@ def select_model_verbose(gs):
return gs.best_estimator_, gs.best_params_, gs.best_score_
```
In my case, the `gridsearch` parameters are not interesting. Even if I reduced the over fitting of the Random Forest, the score on the test set is lower than the score on the test set returned by the Gradient Boosting in the previous exercise, without an optimal parameter search.
In my case, the `gridsearch` parameters are not interesting. Even if I reduced the over-fitting of the Random Forest, the score on the test set is lower than the score on the test set returned by the Gradient Boosting in the previous exercise, without an optimal parameter search.
3. This question is validated if the code used is:
##### The question 3 is validated if the code used is:
```python
model, best_params, best_score = select_model_verbose(gridsearch)

2
one_exercise_per_file/week02/day05/ex01/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if the output of the 5-fold cross validation is:
##### The question 1 is validated if the output of the 5-fold cross validation is:
```console
Fold: 1

4
one_exercise_per_file/week02/day05/ex02/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if the output is:
##### The question 1 is validated if the output is:
```console
Scores on validation sets:
@ -13,4 +13,4 @@ Standard deviation of scores on validation sets:
```
The model is consistent across folds: it is stable. That's a first sign that the model is not over fitted. The average R2 is 60%, which is a good start! To be improved.
The model is consistent across folds: it is stable. That's a first sign that the model is not over-fitted. The average R2 is 60%, which is a good start! To be improved...
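
A per-fold score list with its mean and standard deviation can be produced as sketched below. This uses `cross_val_score` with a `LinearRegression` on hypothetical synthetic data; the model, the data and the printed label for the mean are assumptions and may differ from the exercise.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical synthetic data (NOT the exercise data).
X, y = make_regression(n_samples=200, n_features=5, noise=20.0, random_state=43)

# Default scoring for a regressor is R2; cv=5 gives one score per fold.
scores = cross_val_score(LinearRegression(), X, y, cv=5)

print("Scores on validation sets:", scores)
print("Mean of scores on validation sets:", scores.mean())
print("Standard deviation of scores on validation sets:", scores.std())
```

A small standard deviation relative to the mean is what "consistent across folds" refers to.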

8
one_exercise_per_file/week02/day05/ex03/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the code that runs the grid search is similar to:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the code that runs the grid search is similar to:
```python
parameters = {'n_estimators':[10, 50, 75],
@ -16,7 +18,7 @@ gridsearch.fit(X_train, y_train)
Answers that use another list of parameters are accepted too!
2. This question is validated if you called these attributes:
##### The question 2 is validated if you called these attributes:
```python
print(gridsearch.best_score_)
@ -30,4 +32,4 @@ The best models params are `{'max_depth': 10, 'n_estimators': 75}`.
Since you may have a different parameter list than this one, you may get different results.
3. This question is validated if you used the fitted estimator to compute the score on the test set: `gridsearch.score(X_test, y_test)`. The MSE score is ~0.27. The score I got on the test set is close to the score I got on the validation sets. It means the model is not overfitted.
##### The question 3 is validated if you used the fitted estimator to compute the score on the test set: `gridsearch.score(X_test, y_test)`. The MSE score is ~0.27. The score I got on the test set is close to the score I got on the validation sets. It means the model is not overfitted.
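
The whole flow (grid search, best score on the validation sets, score on the test set) can be sketched as follows. The data is synthetic and the grid is deliberately small; the `scoring` and `cv` values are assumptions, not necessarily the ones used in the exercise.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical synthetic data (NOT the exercise data).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=43)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=43)

# Small illustrative grid; the exercise accepts other parameter lists.
parameters = {"n_estimators": [10, 20], "max_depth": [3, 10]}

gridsearch = GridSearchCV(
    RandomForestRegressor(random_state=43),
    parameters,
    cv=3,
    scoring="neg_mean_squared_error",
)
gridsearch.fit(X_train, y_train)

# best_score_ is the mean validation score of the best parameter set.
print(gridsearch.best_params_, gridsearch.best_score_)

# score() uses the refit best estimator and the same scoring as the search,
# so with this scoring it returns the negated MSE on the test set.
test_score = gridsearch.score(X_test, y_test)
print(test_score)
```

With `neg_mean_squared_error`, both values are negative; comparing their magnitudes is the "test score close to validation score" check described above.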

6
one_exercise_per_file/week02/day05/ex04/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the outputted plot looks like:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the outputted plot looks like:
![alt text][logo_ex5q1]
@ -20,7 +22,7 @@ train_scores, test_scores = validation_curve(clf,
n_jobs=-1)
```
2. This question is validated if the output is
##### The question 2 is validated if the output plots look like:
![alt text][logo_ex5q2]

2
one_exercise_per_file/week03/day01/ex01/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if this code:
##### The question 1 is validated if this code:
```
neuron = Neuron(0,1,4)

2
one_exercise_per_file/week03/day01/ex02/audit/readme.md

@ -1 +1 @@
1. This question is validated if the output is: **0.9524917424084265**
##### The question 1 is validated if the output is: **0.9524917424084265**

2
one_exercise_per_file/week03/day01/ex03/audit/readme.md

@ -1,2 +1,2 @@
1. This question is validated if the output is: **0.5472899351247816**.
##### The question 1 is validated if the output is: **0.5472899351247816**.

18
one_exercise_per_file/week03/day01/ex04/audit/readme.md

@ -1,8 +1,10 @@
1. This question is validated if the output is:
```
Bob: 0.7855253278357536
Eli: 0.7771516558846259
Tom: 0.8067873659804015
Ryan: 0.7892343955586032
```
2. This question is validated if the logloss for the 4 students is **0.5485133607757963**.
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the output is:
```
Bob: 0.7855253278357536
Eli: 0.7771516558846259
Tom: 0.8067873659804015
Ryan: 0.7892343955586032
```
##### The question 2 is validated if the logloss for the 4 students is **0.5485133607757963**.
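
To recompute a log loss by hand during the audit, here is a minimal sketch, assuming binary labels and predicted probabilities. The function name `log_loss` and the sample labels and probabilities are illustrative; they are not the exercise's data, so the result will differ from the value quoted above.

```python
import math


def log_loss(y_true, y_pred, eps=1e-15):
    """Average binary cross-entropy between labels (0 or 1) and probabilities."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip the probability so that log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [1, 0, 1, 1]
probabilities = [0.8, 0.3, 0.6, 0.9]
print(log_loss(labels, probabilities))
```

A perfectly confident and correct prediction contributes almost nothing to the loss, while a prediction of 0.5 contributes `ln(2)`.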

20
one_exercise_per_file/week03/day01/ex05/audit/readme.md

@ -1,12 +1,14 @@
1. This question is validated if the output is **7**.
##### The exercise is validated if all questions of the exercise are validated
2. This question is validated if the outputs are:
##### The question 1 is validated if the output is **7**.
```
Bob: 14.918863163724454
Eli: 14.83137890625537
Tom: 15.086662606964074
Ryan: 14.939270885974128
```
##### The question 2 is validated if the outputs are:
3. This question is validated if the MSE is **10.237608699909138**
```
Bob: 14.918863163724454
Eli: 14.83137890625537
Tom: 15.086662606964074
Ryan: 14.939270885974128
```
##### The question 3 is validated if the MSE is **10.237608699909138**
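
The MSE check can be reproduced with a few lines of plain Python. This is only a sketch: the function name `mean_squared_error` and the sample targets and predictions are hypothetical, and the exercise's own targets are not shown in this audit.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions, for illustration only.
targets = [10.0, 12.0, 14.0]
predictions = [11.0, 12.0, 16.0]
print(mean_squared_error(targets, predictions))
```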

2
one_exercise_per_file/week03/day02/ex01/audit/readme.md

@ -1 +1 @@
1. This question is validated if the output is: `<tensorflow.python.keras.engine.sequential.Sequential object at xxx`
##### The question 1 is validated if the output is: `<tensorflow.python.keras.engine.sequential.Sequential object at xxx`

106
one_exercise_per_file/week03/day02/ex02/audit/readme.md

@ -1,56 +1,58 @@
1. This question is validated if the fields`batch_input_shape`,`units` and `activation` match this output:
##### The exercise is validated if all questions of the exercise are validated
```
{'name': 'dense_7',
'trainable': True,
'batch_input_shape': (None, 5),
'dtype': 'float32',
'units': 8,
'activation': 'sigmoid',
'use_bias': True,
'kernel_initializer': {'class_name': 'GlorotUniform',
'config': {'seed': None}},
'bias_initializer': {'class_name': 'Zeros', 'config': {}},
'kernel_regularizer': None,
'bias_regularizer': None,
'activity_regularizer': None,
'kernel_constraint': None,
'bias_constraint': None}
```
##### The question 1 is validated if the fields `batch_input_shape`, `units` and `activation` match this output:
2. This question is validated if the fields`units` and `activation` match this output:
```
{'name': 'dense_7',
'trainable': True,
'batch_input_shape': (None, 5),
'dtype': 'float32',
'units': 8,
'activation': 'sigmoid',
'use_bias': True,
'kernel_initializer': {'class_name': 'GlorotUniform',
'config': {'seed': None}},
'bias_initializer': {'class_name': 'Zeros', 'config': {}},
'kernel_regularizer': None,
'bias_regularizer': None,
'activity_regularizer': None,
'kernel_constraint': None,
'bias_constraint': None}
```
```
{'name': 'dense_8',
'trainable': True,
'dtype': 'float32',
'units': 4,
'activation': 'sigmoid',
'use_bias': True,
'kernel_initializer': {'class_name': 'GlorotUniform',
'config': {'seed': None}},
'bias_initializer': {'class_name': 'Zeros', 'config': {}},
'kernel_regularizer': None,
'bias_regularizer': None,
'activity_regularizer': None,
'kernel_constraint': None,
'bias_constraint': None}
```
3. This question is validated if the fields`units` and `activation` match this output:
##### The question 2 is validated if the fields `units` and `activation` match this output:
```
{'name': 'dense_9',
'trainable': True,
'dtype': 'float32',
'units': 1,
'activation': 'sigmoid',
'use_bias': True,
'kernel_initializer': {'class_name': 'GlorotUniform',
'config': {'seed': None}},
'bias_initializer': {'class_name': 'Zeros', 'config': {}},
'kernel_regularizer': None,
'bias_regularizer': None,
'activity_regularizer': None,
'kernel_constraint': None,
'bias_constraint': None}
```
```
{'name': 'dense_8',
'trainable': True,
'dtype': 'float32',
'units': 4,
'activation': 'sigmoid',
'use_bias': True,
'kernel_initializer': {'class_name': 'GlorotUniform',
'config': {'seed': None}},
'bias_initializer': {'class_name': 'Zeros', 'config': {}},
'kernel_regularizer': None,
'bias_regularizer': None,
'activity_regularizer': None,
'kernel_constraint': None,
'bias_constraint': None}
```
##### The question 3 is validated if the fields `units` and `activation` match this output:
```
{'name': 'dense_9',
'trainable': True,
'dtype': 'float32',
'units': 1,
'activation': 'sigmoid',
'use_bias': True,
'kernel_initializer': {'class_name': 'GlorotUniform',
'config': {'seed': None}},
'bias_initializer': {'class_name': 'Zeros', 'config': {}},
'kernel_regularizer': None,
'bias_regularizer': None,
'activity_regularizer': None,
'kernel_constraint': None,
'bias_constraint': None}
```

14
one_exercise_per_file/week03/day02/ex03/audit/readme.md

@ -1,11 +1,11 @@
1. This question is validated if the code that creates the neural network is:
##### The question 1 is validated if the code that creates the neural network is:
```
model = keras.Sequential()
model.add(Dense(8, input_shape=(5,), activation= 'sigmoid'))
model.add(Dense(4, activation= 'sigmoid'))
model.add(Dense(1, activation= 'linear'))
```
model = keras.Sequential()
model.add(Dense(8, input_shape=(5,), activation= 'sigmoid'))
model.add(Dense(4, activation= 'sigmoid'))
model.add(Dense(1, activation= 'linear'))
```
```
The first two layers could use an activation function other than sigmoid (e.g. relu).

4
one_exercise_per_file/week03/day02/ex04/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if the output of `model.get_config()['layers']` matches the fields `batch_input_shape`, `units` and `activation`.
##### The question 1 is validated if the output of `model.get_config()['layers']` matches the fields `batch_input_shape`, `units` and `activation`.
```
[{'class_name': 'InputLayer',
@ -59,4 +59,4 @@ You should notice that the neural network is struggling to learn. By luck the in
`Epoch 50/50
2/2 [==============================] - 0s 1ms/step - loss: 0.6559 - accuracy: 0.6274`
2. This solution is validated if the accuracy at epoch 50 is higher than 95%.
##### The question 2 is validated if the accuracy at epoch 50 is higher than 95%.

2
one_exercise_per_file/week03/day03/ex01/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if the chunk of code is:
##### The question 1 is validated if the chunk of code is:
```
model.compile(

6
one_exercise_per_file/week03/day03/ex02/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the input DataFrames are:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the input DataFrames are:
X_train_scaled shape is (313, 5) and the first 5 rows are:
@ -41,7 +43,7 @@ The test target is:
| 318 | 29.8 |
| 319 | 31.3 |
2. This question is validated if the mean absolute error on the test set is smaller than 10. Here is an architecture that works:
##### The question 2 is validated if the mean absolute error on the test set is smaller than 10. Here is an architecture that works:
```
# create model

15
one_exercise_per_file/week03/day03/ex03/audit/readme.md

@ -1,9 +1,8 @@
1. This question is validated if the code that creates the neural network is:
##### The question 1 is validated if the code that creates the neural network is:
```
model = keras.Sequential()
model.add(Dense(16, input_shape=(5,), activation= 'sigmoid'))
model.add(Dense(8, activation= 'sigmoid'))
model.add(Dense(5, activation= 'softmax'))
```
```
model = keras.Sequential()
model.add(Dense(16, input_shape=(5,), activation= 'sigmoid'))
model.add(Dense(8, activation= 'sigmoid'))
model.add(Dense(5, activation= 'softmax'))
```

2
one_exercise_per_file/week03/day03/ex04/audit/readme.md

@ -1,4 +1,4 @@
1. This question is validated if the chunk of code is:
##### The question 1 is validated if the chunk of code is:
```
model.compile(loss='categorical_crossentropy',

6
one_exercise_per_file/week03/day03/ex05/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the output of the first ten values of the train labels are:
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the output of the first ten values of the train labels are:
```
array([[0, 1, 0],
@ -13,7 +15,7 @@ array([[0, 1, 0],
[0, 0, 1]])
```
2. This question is validated if the accuracy on the test set is bigger than 90%. To evaluate the accuracy on the test set you can use: `model.evaluate(X_test_sc, y_test_multi_class)`.
##### The question 2 is validated if the accuracy on the test set is bigger than 90%. To evaluate the accuracy on the test set you can use: `model.evaluate(X_test_sc, y_test_multi_class)`.
Here is an implementation that gives 96% accuracy on the test set.

3
one_exercise_per_file/week03/day05/ex01/audit/readme.md

@ -1,5 +1,4 @@
1. This question is validated if the embedding's shape is `(96,)`
and the first 20 values of the vector are:
##### The question 1 is validated if the embedding's shape is `(96,)` and the first 20 values of the vector are:
```
array([ 1.0522802e+00, 1.4806499e+00, 7.7402556e-01, 1.0373484e+00,

2
one_exercise_per_file/week03/day05/ex02/audit/readme.md

@ -1,4 +1,4 @@
1. The question is validated if the tokens printed are:
##### The question 1 is validated if the tokens printed are:
```
Tokenize

18
one_exercise_per_file/week03/day05/ex03/audit/readme.md

@ -1,14 +1,16 @@
1. This question is validated if the embedding of each word has a shape of `(300,)` and if the first 20 values of the embedding of `laptop` are:
##### The exercise is validated if all questions of the exercise are validated
```
array([-0.37639 , -0.075521, 0.4908 , 0.19863 , -0.11088 , -0.076145,
-0.30367 , -0.69663 , 0.87048 , 0.54388 , 0.42523 , 0.18045 ,
-0.4358 , -0.32606 , -0.70702 , -0.069127, -0.42674 , 2.4147 ,
0.26806 , 0.46584 ], dtype=float32)
##### The question 1 is validated if the embedding of each word has a shape of `(300,)` and if the first 20 values of the embedding of `laptop` are:
```
```
array([-0.37639 , -0.075521, 0.4908 , 0.19863 , -0.11088 , -0.076145,
-0.30367 , -0.69663 , 0.87048 , 0.54388 , 0.42523 , 0.18045 ,
-0.4358 , -0.32606 , -0.70702 , -0.069127, -0.42674 , 2.4147 ,
0.26806 , 0.46584 ], dtype=float32)
2. This question is validated if the output is
```
##### The question 2 is validated if the output is
![alt text][logo]

12
one_exercise_per_file/week03/day05/ex04/audit/readme.md

@ -1,8 +1,8 @@
1. This question is validated if the similarities between the sentences are:
##### The question 1 is validated if the similarities between the sentences are:
```
sentence_1 <=> sentence 2 : 0.7073220863266589
sentence_1 <=> sentence 3: 0.42663743263528325
sentence_2 <=> sentence 3: 0.3336274235605957
```
sentence_1 <=> sentence 2 : 0.7073220863266589
sentence_1 <=> sentence 3: 0.42663743263528325
sentence_2 <=> sentence 3: 0.3336274235605957
```
```
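
The audit does not state which similarity measure is expected. Assuming it is the cosine similarity between sentence vectors, the computation can be sketched as below. The function name `cosine_similarity` and the short vectors are illustrative; real sentence embeddings would have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sentence vectors, for illustration only.
sentence_1 = [0.2, 0.7, 0.1]
sentence_2 = [0.3, 0.6, 0.2]
print(cosine_similarity(sentence_1, sentence_2))
```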

6
one_exercise_per_file/week03/day05/ex05/audit/readme.md

@ -1,4 +1,6 @@
1. This question is validated if the output of the NER is
##### The exercise is validated if all questions of the exercise are validated
##### The question 1 is validated if the output of the NER is
```
Apple Inc. ORG
@ -25,7 +27,7 @@
Apple ORG
Apple II ORG
```
2. This question is validated if the output shows that the first occurrence of apple is not a named entity. In my case here is what the NER returns:
##### The question 2 is validated if the output shows that the first occurrence of apple is not a named entity. In my case here is what the NER returns:
```
Paul 1 5 PERSON

22
one_exercise_per_file/week03/day05/ex06/audit/readme.md

@ -1,18 +1,18 @@
1. This question is validated if the output sentences are:
##### The question 1 is validated if the output sentences are:
```
INFO: Bezos PROPN NNP
Sentence: Amazon (AMZN) enters 2021 with plenty of big opportunities, but is losing its lauded Chief Executive Jeff Bezos, who announced his plan to step aside in the third quarter.
```
INFO: Bezos PROPN NNP
Sentence: Amazon (AMZN) enters 2021 with plenty of big opportunities, but is losing its lauded Chief Executive Jeff Bezos, who announced his plan to step aside in the third quarter.
INFO: Bezos PROPN NNP
Sentence: Bezos will hand off his role as chief executive to Andy Jassy, the CEO of its cloud computing unit.
INFO: Bezos PROPN NNP
Sentence: Bezos will hand off his role as chief executive to Andy Jassy, the CEO of its cloud computing unit.
INFO: Bezos PROPN NNP
Sentence: He's not leaving, as Bezos will transition to the role of Executive Chairman and remain active.
INFO: Bezos PROPN NNP
Sentence: He's not leaving, as Bezos will transition to the role of Executive Chairman and remain active.
INFO: Bezos PROPN NNP
Sentence: "When you look at our financial results, what you're actually seeing are the long-run cumulative results of invention," Bezos said in written remarks with the Amazon earnings release.
```
INFO: Bezos PROPN NNP
Sentence: "When you look at our financial results, what you're actually seeing are the long-run cumulative results of invention," Bezos said in written remarks with the Amazon earnings release.
```