Machine Learning MCQ - Leave One Out Cross validation does not permit stratification - WHY?
1. Which of the following cross validation strategies cannot be stratified?
a) k-fold cross validation
b) hold out cross validation
c) leave one out cross validation
d) shuffle split cross validation
Answer: (c) leave one out cross validation (LOOCV). LOOCV does not permit stratification.
What is stratification? Stratification is the process of rearranging the data so as to ensure that each fold is a good representative of all strata (groups of data sharing a characteristic) of the data. Generally this is done in a supervised way for classification, and it aims to ensure that each class is represented in (approximately) the same proportion in every test fold (the test folds are of course combined in a complementary way to form the training folds). For example, in a binary classification problem where each class comprises 50% of the data, it is best to arrange the data such that in every fold each class comprises around half the instances.
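The idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `stratified_folds` and the toy labels are assumptions made for the example, and no shuffling is done. It deals the indices of each class round-robin across the folds, so every fold receives roughly the same share of each class.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign each index to one of k folds so every class is spread evenly."""
    # Group the instance indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

labels = ["pos"] * 10 + ["neg"] * 10   # hypothetical balanced binary data
folds = stratified_folds(labels, k=5)
for fold in folds:
    print(Counter(labels[i] for i in fold))
```

With this toy data every fold holds two "pos" and two "neg" instances, matching the 50/50 split of the whole dataset.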
Why do we need stratification in cross validation? Classification problems can exhibit a large imbalance in the distribution of the target classes: for instance, there could be several times more negative samples than positive samples. In such cases it is recommended to use stratified sampling to ensure that the relative class frequencies are approximately preserved in each train and validation fold. Stratification also tends to reduce the variance of the estimate slightly, and it is generally regarded as a better scheme than unstratified cross validation in terms of both bias and variance.
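To see what can go wrong without stratification, here is a small sketch under stated assumptions: the data is hypothetical (10% positives, sorted by class), and `contiguous_folds` is an illustrative helper that simply cuts the data into consecutive blocks. Sorted data is the worst case; random shuffling makes such skew less likely but does not rule it out.

```python
def contiguous_folds(n, k):
    """Split indices 0..n-1 into k consecutive folds (assumes n is divisible by k)."""
    size = n // k
    return [list(range(start, start + size)) for start in range(0, n, size)]

labels = [0] * 90 + [1] * 10   # hypothetical data: 10% positives, sorted by class
overall_rate = sum(labels) / len(labels)

# Fraction of positives inside each unstratified fold.
fold_rates = []
for fold in contiguous_folds(len(labels), k=5):
    fold_rates.append(sum(labels[i] for i in fold) / len(fold))

print(overall_rate)   # 0.1
print(fold_rates)     # [0.0, 0.0, 0.0, 0.0, 0.5]
```

Four of the five folds contain no positive sample at all, and the last one is half positive, so none of them reflects the 10% rate of the full dataset.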
How does LOOCV work? Leave-one-out cross-validation is a special case of k-fold cross-validation in which the number of folds equals the number of instances in the data set. The learning algorithm is therefore applied once for each instance, using all other instances as the training set and the selected instance as a single-item test set.
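The splitting scheme described above can be sketched as a small generator. The name `leave_one_out` is an assumption for this example; it only produces index lists and leaves model training out.

```python
def leave_one_out(n):
    """Yield (train_indices, test_indices) for each of the n LOOCV rounds."""
    for held_out in range(n):
        # Every instance except the held-out one goes into the training set.
        train = [i for i in range(n) if i != held_out]
        yield train, [held_out]

splits = list(leave_one_out(4))
for train, test in splits:
    print(train, test)
# [1, 2, 3] [0]
# [0, 2, 3] [1]
# [0, 1, 3] [2]
# [0, 1, 2] [3]
```

For a dataset of 4 instances this gives 4 rounds, each with 3 training instances and 1 test instance.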
Why does leave one out CV not permit stratification? LOOCV uses 1 instance of an n-instance dataset as the test set and the remaining n-1 instances as the training set, and it repeats this process n times. A test fold that contains a single instance holds exactly one class, so it cannot be arranged to reflect the class proportions of the whole dataset. Hence, stratification cannot be done. All the other given strategies can be stratified.
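A tiny numeric illustration of this point, using hypothetical labels: because each LOOCV test fold is a single instance, the fraction of positives in it can only be 0 or 1, whatever the class balance of the full dataset is.

```python
labels = [0, 0, 0, 1]   # hypothetical data: 25% positives
overall_rate = sum(labels) / len(labels)

# Each test fold is one instance, so its positive rate is just that label (0 or 1).
test_fold_rates = [labels[held_out] / 1 for held_out in range(len(labels))]

print(overall_rate)      # 0.25
print(test_fold_rates)   # [0.0, 0.0, 0.0, 1.0]
```

No test fold can reach the overall rate of 0.25, which is why there is nothing for stratification to balance.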