Best way to evaluate your machine learning model.

Selecting your final machine learning model is a vital part of your project. Using the right metric and the right selection procedure can give very good results even if you use a very simple, or even a poorly suited, learning algorithm. Here I explain a very parsimonious and plain way to do it.

The metric you choose depends on your problem and your expectations. Some common alternatives are the F1 score (the harmonic mean of precision and recall), accuracy (the ratio of correctly classified instances to all instances), the ROC curve, and error rate (1 - accuracy).
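To make these definitions concrete, here is a minimal sketch in plain Python for the binary case. The function name `binary_metrics` and its `positive` argument are my own choices for illustration, not part of any library.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, error rate, precision, recall and F1 for binary labels.

    Illustrative helper (hypothetical name); `positive` marks the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Calling `binary_metrics(y_true, y_pred)` on two equal-length label lists returns all five numbers at once, so you can switch the metric you track without touching the rest of the pipeline.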

As an example I use error rate (see the figure below). First, divide the data into three parts: a train set, a held-out set, and a test set. We will use the held-out set as objective guidance for choosing the hyper-parameters of the algorithm; the test set stays untouched until the final model is chosen. You might also prefer K-fold cross-validation, but my choice is to keep a held-out set if I have enough instances.
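The three-way split can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the function name `three_way_split` are assumptions of mine for illustration; pick fractions that suit the size of your data.

```python
import random


def three_way_split(instances, held_out_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the instances and cut them into train, held-out and test parts.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    items = list(instances)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(items)

    n_test = int(len(items) * test_frac)
    n_held_out = int(len(items) * held_out_frac)

    test = items[:n_test]
    held_out = items[n_test:n_test + n_held_out]
    train = items[n_test + n_held_out:]
    return train, held_out, test
```

Every instance ends up in exactly one of the three parts, so nothing the model is tuned on leaks into the final test score.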

The following procedure can be used both for parameter selection and for selecting the final model. The idea is to plot the performance of the model as two curves: the score on the held-out set (the test fold) and the score on the train set (the train fold). The curves should reach a point where both are at reasonable levels and consistent with each other; shortly after that they start to stray apart, with the train score still improving while the held-out score starts to get worse. Poor scores on both curves early on indicate underfitting, whereas a growing gap after many learning iterations most likely indicates overfitting. Choose the best trade-off point on the plot as the final model.
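If you want to pick that point programmatically rather than by eye, one simple reading of "best trade-off" is the earliest step at which the held-out error is lowest. The sketch below assumes exactly that; the function name `select_best_step` is hypothetical, and other rules (for example, tolerating a small increase in error for a simpler model) are just as defensible.

```python
def select_best_step(held_out_errors):
    """Return the index of the earliest step with the lowest held-out error.

    Assumes lower is better (error rate). Illustrative helper, not a library call.
    """
    if not held_out_errors:
        raise ValueError("held_out_errors must not be empty")
    best_step = 0
    for step, error in enumerate(held_out_errors):
        # Strict comparison keeps the earliest step when several steps tie.
        if error < held_out_errors[best_step]:
            best_step = step
    return best_step


def train_held_out_gap(train_errors, held_out_errors):
    """Per-step gap between held-out and train error; a growing gap hints at overfitting."""
    return [h - t for t, h in zip(train_errors, held_out_errors)]
```

With an accuracy-like score, where higher is better, negate the values or flip the comparison. Looking at `train_held_out_gap` next to the selected step is a quick check that the curves really diverge after it.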


The example uses error rate, so do not be confused by the decreasing values: lower is better here. The marked point is the saturation point, after which the model starts to over-fit the data.

Another caveat: do not use too many folds for cross-validation. According to some papers (I cannot recall the names right now :( ), the asymptotic behaviour of cross-validation tends to favour over-fitting, so prefer a leave-multiple-out procedure to leave-one-out if you intend to use a large number of folds.
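For reference, a leave-multiple-out scheme in its most common form, K-fold with a moderate K, can be sketched like this. The default `k=5`, the seed, and the function name `k_fold_indices` are my assumptions for illustration.

```python
import random


def k_fold_indices(n_instances, k=5, seed=0):
    """Yield (train_indices, held_out_indices) pairs for K-fold cross-validation.

    Each fold leaves out roughly n_instances / k instances at once, as opposed
    to leave-one-out, which holds out a single instance per fold.
    """
    if not 2 <= k <= n_instances:
        raise ValueError("k must be between 2 and n_instances")
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)

    # Striding through the shuffled indices gives k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out
```

Setting `k` equal to the number of instances turns this into leave-one-out, which is the setting the caveat above warns about.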