Question 1 is validated if the output plot looks like the plot below. There are two important points to check: the training score converges towards 1, and the cross-validation score reaches a plateau around 0.9 from max_depth = 10 onward.

alt text

The code that generated the data in the plot is:

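import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import validation_curve

# X and y are the features and labels prepared earlier in the exercise
clf = RandomForestClassifier()
# Odd tree depths 1, 3, ..., 29
param_range = np.arange(1,30,2)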
train_scores, test_scores = validation_curve(clf,
                                            X,
                                            y,
                                            param_name="max_depth",
                                            param_range=param_range,
                                            scoring="roc_auc",
                                            n_jobs=-1)
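
The two points to check for question 1 can also be approximated numerically from `train_scores` and `test_scores`. The sketch below is only an aid to the visual check, not part of the original audit: the helper names `summarize_validation_curve` and `check_question_1` are hypothetical, and the tolerance of 0.05 is an assumption chosen for illustration.

```python
import numpy as np


def summarize_validation_curve(param_range, train_scores, test_scores):
    """Average the per-fold scores for each tested parameter value."""
    # validation_curve returns arrays of shape (n_param_values, n_cv_folds),
    # so averaging over axis 1 gives one mean score per max_depth value.
    return {
        "param_range": np.asarray(param_range),
        "train_mean": np.mean(train_scores, axis=1),
        "test_mean": np.mean(test_scores, axis=1),
    }


def check_question_1(summary, plateau_start=10, plateau_target=0.9, tol=0.05):
    """Rough check of the two criteria. The thresholds are assumptions."""
    depths = summary["param_range"]
    # Criterion 1: the mean training score at the largest depth is close to 1.
    train_converges = bool(summary["train_mean"][-1] >= 1 - tol)
    # Criterion 2: from plateau_start onward, the mean cross-validation
    # score stays within tol of plateau_target.
    plateau = summary["test_mean"][depths >= plateau_start]
    cv_plateaus = bool(np.all(np.abs(plateau - plateau_target) <= tol))
    return train_converges, cv_plateaus
```

With the arrays from the code above, `check_question_1(summarize_validation_curve(param_range, train_scores, test_scores))` returns a pair of booleans, one per criterion. Because a random forest is not deterministic unless `random_state` is set, scores vary slightly between runs, so the plot remains the reference.
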
Question 2 is validated if the output plots look like:

alt text