Showing posts with label cross validation. Show all posts

Sunday, December 19, 2021

Machine Learning MCQ - Leave one out Cross validation does not permit stratification - WHY

Multiple choice questions in Machine Learning. Interview questions on machine learning, quiz questions for data scientists with answers explained, machine learning exam questions, question bank in machine learning. What is stratification in cross validation? Why do we stratify in cross validation? What is stratification used for?



 

1. Which of the following cross validation strategies cannot be stratified?

a) k-fold cross validation

b) hold out cross validation

c) leave one out cross validation

d) shuffle split cross validation

Answer: (c) leave one out cross validation (LOOCV)

Leave one out cross validation (LOOCV) does not permit stratification.

 

What is stratification?

Stratification is the process of rearranging the data so as to ensure that each fold is a good representative of all strata (groups of data that share a characteristic) of the data. Generally this is done in a supervised way for classification, and it aims to ensure that each class is (approximately) equally represented across each test fold (the folds are, of course, combined in a complementary way to form the training folds). For example, in a binary classification problem where each class comprises 50% of the data, it is best to arrange the data such that in every fold each class comprises around half the instances.
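The idea above can be sketched with scikit-learn's `StratifiedKFold` (a hypothetical 50/50 toy dataset is assumed here; the feature values are irrelevant to the split itself):

```python
from collections import Counter

from sklearn.model_selection import StratifiedKFold

# Toy binary labels: 50 instances of class 0 and 50 of class 1 (hypothetical data).
y = [0] * 50 + [1] * 50
X = [[i] for i in range(100)]  # features do not affect the stratified split

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    counts = Counter(y[i] for i in test_idx)
    # Each 20-item test fold keeps the 50/50 class ratio: 10 of each class.
    print(f"fold {fold}: {dict(counts)}")
```

Every test fold mirrors the overall class distribution, which is exactly what stratification guarantees.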

 

Why do we need stratification in cross validation?

Classification problems can exhibit a large imbalance in the distribution of the target classes: for instance, there could be several times more negative samples than positive samples. In such cases, it is recommended to use stratified sampling to ensure that the relative class frequencies are approximately preserved in each training and validation fold.

Stratification also reduces the variance slightly and thus tends to be uniformly better than plain cross validation with respect to both bias and variance.

 

How does LOOCV work?

Leave-one-out cross-validation is a special case of cross-validation where the number of folds equals the number of instances in the data set. Thus, the learning algorithm is applied once for each instance, using all other instances as a training set and using the selected instance as a single-item test set.
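This behavior can be seen directly with scikit-learn's `LeaveOneOut` splitter on a tiny hypothetical 5-instance dataset:

```python
from sklearn.model_selection import LeaveOneOut

X = [[i] for i in range(5)]  # a tiny 5-instance dataset (hypothetical)
loo = LeaveOneOut()
splits = list(loo.split(X))

# One split per instance: 5 instances -> 5 train/test pairs.
print(len(splits))  # 5
for train_idx, test_idx in splits:
    # Each test set holds a single instance; the other n-1 form the training set.
    assert len(test_idx) == 1 and len(train_idx) == 4
```

Because every test fold is a single instance, the number of model fits grows linearly with the dataset size, which is why LOOCV is usually reserved for small datasets.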

 

Why does leave-one-out CV not permit stratification?

LOOCV uses one instance of an n-instance dataset as the test set and the remaining n-1 instances as the training set, and it repeats this process n times. Because each test fold contains only a single instance, a fold can never represent the class proportions of the data. Hence, stratification cannot be done.

All the other given strategies can be stratified.

 



************************

Related links:

What is stratification cross validation?

How does leave one out cross validation work?

Can we use stratification on leave one out cross validation?


 

Saturday, December 11, 2021

Machine Learning MCQ - Cross validation in machine learning

Multiple choice questions in Machine Learning. Interview questions on machine learning, quiz questions for data scientists with answers explained, machine learning exam questions, question bank in machine learning. What is cross validation? What is k-fold cross validation? What is cross validation used for?



 

1. Suppose you have picked the parameter for a model using 10-fold cross validation (CV). Which of the following is the best way to pick a final model to use and estimate its error?

a) Pick any of the 10 models you built for your model; use its error estimate on the held-out data

b) Train a new model on the full data set, using the parameter you found; use the average CV error as its error estimate

c) Average all of the 10 models you got; use the average CV error as its error estimate

d) Average all of the 10 models you got; use the error the combined model gives on the full training set

Answer: (b) Train a new model on the full data set, using the parameter you found; use the average CV error as its error estimate

The best way to pick a final model is to train a new machine learning model on the full data set using the parameter you found, and to use the average cross-validation error as its error estimate.

 

k-fold cross validation

k-fold cross validation allows you to train and test your model k times on different subsets of the training data and thereby build up an estimate of the performance of a machine learning model on unseen data.

 

We can compare different models using cross-validation

Cross validation is mainly used for the comparison of different models. For each model, you get the average generalization error on the k validation sets. You can then choose the model with the lowest average generalization error as your optimal model.

 

Cross-validation is for model checking

The purpose of cross-validation is model checking, not model building, because it allows us to repeatedly train and test on a single set of data. Suppose we have a linear regression model and a neural network. To select the best one among these, we can do k-fold cross-validation and see which one proves better at predicting the test set points. But once we have used cross-validation to select the better-performing model, we train that model (whether it be the linear regression or the neural network) on all the data. We do not use the actual model instances we trained during cross-validation for our final predictive model.
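The selection-then-refit workflow above can be sketched as follows (a decision tree stands in for the neural network mentioned in the text; the candidate names are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Compare two candidate models by their average 5-fold CV accuracy.
candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in candidates.items()}
best_name = max(scores, key=scores.get)

# The fold-models are discarded; the winner is retrained on all the data.
final_model = candidates[best_name].fit(X, y)
print(best_name, scores)
```

Note that cross-validation here only ranks the candidates; the final model is always a fresh fit on the complete dataset.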

More information on training, validation, and test sets.

 



************************

Related links:

What is cross validation?

Why do we use k-fold cross validation for?

Can we compare the performance of different machine learning models using k-fold cross validation?
