Machine Learning - Advanced MCQs
Understanding the foundations of machine learning requires a strong grasp of how different models learn from data, make predictions, and generalize. This collection of MCQs covers essential concepts such as generative vs discriminative classification, k-NN behavior, MAP vs MLE estimation, boosting dynamics, kernel methods, and decision tree depth—topics frequently asked in exams, interviews, and university courses.
These questions are designed to strengthen conceptual clarity and test real-world intuition about model assumptions, probability distributions, density estimation, and decision boundaries.
Whether you are preparing for GATE, UGC NET, university assessments, data science interviews, or machine learning certifications, this curated set will help you quickly revise key principles and identify common pitfalls in ML theory.
How does a generative model perform classification?
A. Minimizing the empirical risk
B. Evaluating P(Y | X) using Bayes’ rule
C. Maximizing the margin between classes
D. Fitting a logistic function to the data
Explanation:
Generative models estimate P(X∣Y) and P(Y). Classification is performed using Bayes’ rule:
P(Y∣X) ∝ P(X∣Y)P(Y).
Generative models are a class of machine learning models that learn the underlying data distribution and can generate new data samples similar to those seen during training.
A generative model learns the joint probability distribution P(X, Y), or just P(X). This means the model tries to understand how the data is produced, not just how to classify it.
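To make this concrete, here is a minimal NumPy sketch of generative classification on hypothetical one-dimensional data: estimate P(Y) and a Gaussian P(X|Y) for each class, then label a new point with the class that maximizes P(X|Y)P(Y).

```python
import numpy as np

# Toy 1-D training data for two classes (hypothetical values).
X = np.array([1.0, 1.2, 0.8, 3.0, 3.3, 2.9])
y = np.array([0, 0, 0, 1, 1, 1])

def gaussian_pdf(x, mean, var):
    # Density of a univariate Gaussian N(mean, var) evaluated at x.
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Generative step: estimate P(Y) and the class-conditional P(X | Y).
priors, means, variances = {}, {}, {}
for c in np.unique(y):
    Xc = X[y == c]
    priors[c] = len(Xc) / len(X)       # P(Y = c)
    means[c] = Xc.mean()               # mean of P(X | Y = c)
    variances[c] = Xc.var() + 1e-6     # variance (small floor for stability)

def classify(x_new):
    # Bayes' rule: P(Y = c | x) is proportional to P(x | Y = c) * P(Y = c).
    scores = {c: gaussian_pdf(x_new, means[c], variances[c]) * priors[c]
              for c in priors}
    return max(scores, key=scores.get)

print(classify(1.1))  # expected: 0
print(classify(3.1))  # expected: 1
```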
Under what condition do Logistic Regression and Gaussian Naive Bayes produce identical decision boundaries?
A. When the covariances of all classes are identity and equal
B. When logistic regression is regularized
C. When data is linearly separable
D. When priors are uniform only
Explanation:
GNB with shared identity covariance produces a linear discriminant identical to logistic regression’s functional form.
Understanding the Question
This question asks under which specific condition Logistic Regression (LR) and Gaussian Naive Bayes (GNB) classifiers produce identical decision boundaries. The key is understanding the mathematical relationship between these two seemingly different algorithms.
Both models, Logistic Regression (LR) and Gaussian Naïve Bayes (GNB), normally produce different decision boundaries because:
- LR is discriminative → models P(Y∣X). That is, Logistic Regression directly models the conditional probability P(Y∣X) using the logistic function.
- GNB is generative → models P(X∣Y). That is, Gaussian Naive Bayes is a generative classifier that models the joint probability P(X, Y) by estimating P(Y) and P(X∣Y).
But under a special condition, they produce identical linear decision boundaries. That special condition is: When the covariances of all classes are identity and equal.
When GNB has identity covariance (no correlation between features, each feature with variance = 1, and the same covariance for every class), Gaussian Naïve Bayes's decision boundary has the same mathematical form as Logistic Regression.
Both models produce a boundary of the form w⊤x + b = 0: the same functional form, and therefore the same separating hyperplane.
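The following NumPy sketch illustrates the equivalence under that assumption (the class means and priors below are made up for illustration): with shared identity covariance, the GNB posterior collapses to a sigmoid of a linear score w⊤x + b, the same functional form Logistic Regression fits directly.

```python
import numpy as np

# Hypothetical class means and priors; shared covariance is the identity.
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
pi0, pi1 = 0.5, 0.5

# For shared identity covariance, the GNB log-odds are linear in x:
#   log P(Y=1|x)/P(Y=0|x) = w.x + b
w = mu1 - mu0
b = 0.5 * (mu0 @ mu0 - mu1 @ mu1) + np.log(pi1 / pi0)

def gnb_posterior(x):
    # Posterior via Bayes' rule with unit-variance Gaussians.
    p1 = pi1 * np.exp(-0.5 * np.sum((x - mu1) ** 2))
    p0 = pi0 * np.exp(-0.5 * np.sum((x - mu0) ** 2))
    return p1 / (p0 + p1)

def sigmoid_of_linear(x):
    # Logistic-regression-style form: sigma(w.x + b).
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

x = np.array([1.3, 0.4])
print(gnb_posterior(x))       # both prints agree, showing the
print(sigmoid_of_linear(x))   # shared linear functional form
```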
Why is the training error of a 1-NN classifier always zero?
A. Because 1-NN memorizes the class means
B. Because each training point is its own nearest neighbor
C. Because 1-NN uses leave-one-out validation
D. Because 1-NN normalizes distances
Explanation:
Each training sample is its own closest neighbor, so 1-NN always predicts correctly on training data.
What is 1-NN?
1-NN means 1-Nearest Neighbor, which is the simplest form of the k-Nearest Neighbors (k-NN) algorithm. A 1-NN classifier assigns the class of a new point based on the single closest training point in the dataset.
Why is the training error of 1-NN always zero?
In 1-Nearest Neighbor classification, when predicting the label of a data point, the algorithm finds the closest point in the training set. But if you test 1-NN on the same training data, then every training point's nearest neighbor is itself (distance = 0). So the classifier simply returns its own label, which is always correct.
Thus: Training Error = 0, because no point is misclassified when it is compared with itself.
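A quick scikit-learn sketch of this fact (the tiny training set is hypothetical and contains no duplicate points with conflicting labels): scoring a 1-NN classifier on its own training data yields accuracy 1.0, i.e. zero training error.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data; no duplicate points with conflicting labels.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

# Each training point is its own nearest neighbor (distance 0),
# so training accuracy is 1.0 and training error is 0.
print(knn.score(X_train, y_train))  # 1.0
```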
Which prior prevents the MAP estimate from converging to the MLE, even with infinite training data?
A. Gaussian prior
B. Beta prior
C. A prior that assigns probability 1 to a single parameter value
D. Uniform prior
Explanation:
A degenerate (a.k.a. point-mass or delta) prior forces the parameter to a fixed value regardless of data, so MAP ≠ MLE even with infinite samples.
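A small sketch, assuming we estimate a Gaussian mean from hypothetical data: with a point-mass prior at theta0, the MAP estimate never moves from theta0 no matter how many samples arrive, while the MLE (the sample mean) converges to the true value.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 3.0
theta0 = -1.0  # the single value the degenerate (point-mass) prior allows

for n in [10, 1_000, 100_000]:
    data = rng.normal(loc=true_mean, scale=1.0, size=n)

    mle = data.mean()  # MLE of a Gaussian mean: the sample average

    # A point-mass prior puts probability 1 on theta0 and 0 elsewhere,
    # so the posterior is also a point mass at theta0: MAP == theta0 always.
    map_estimate = theta0

    print(f"n={n:>7}  MLE={mle:.3f}  MAP={map_estimate:.3f}")
# The MLE converges to 3.0, but the MAP never moves from -1.0.
```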
Why is cross-validation needed to choose the number of boosting rounds?
A. Boosting has no natural stopping point
B. Boosting inherently underfits
C. Boosting does not use loss functions
D. Boosting requires validation to update weights
Explanation:
Boosting can overfit if allowed to run indefinitely. CV selects the optimal number of rounds.
Boosting keeps improving training accuracy indefinitely and can easily overfit, so cross-validation is needed to decide how many boosting steps to perform.
What is boosting?
Boosting is a family of ensemble learning techniques that turn a collection of weak learners (models that are only slightly better than random guessing) into a single strong learner with high predictive accuracy. The core idea is simple: train models sequentially, each one focusing on the mistakes made by the previous ones, and then combine their predictions (usually by a weighted vote or sum). By doing this, the ensemble corrects its own errors over time and ends up far more powerful than any individual component.
What is cross-validation?
Cross-validation is a fundamental resampling technique used to evaluate machine learning models' ability to generalize to unseen data while preventing overfitting. It works by systematically partitioning the dataset into multiple subsets (called folds), training models on some subsets, and testing on others, with this process repeated multiple times to obtain a reliable performance estimate.
Why does boosting need cross-validation?
Boosting algorithms (like AdaBoost, Gradient Boosting, XGBoost, etc.) build models sequentially, adding weak learners (usually decision stumps/trees) one at a time.
Unlike many other models:
- There is no built-in rule that tells you when to stop adding more learners.
- If you keep boosting longer, the model can overfit heavily.
This is why libraries like XGBoost include a parameter like early_stopping_rounds, which depends on a validation set.
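Here is one way this selection could look in practice, sketched with scikit-learn's GradientBoostingClassifier on a hypothetical noisy dataset: cross-validate a few candidate numbers of boosting rounds and keep the best.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical noisy dataset.
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.1,
                           random_state=0)

# Cross-validate several candidate numbers of boosting rounds.
for n_rounds in [10, 50, 100, 300]:
    model = GradientBoostingClassifier(n_estimators=n_rounds, random_state=0)
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{n_rounds:>3} rounds: mean CV accuracy = {scores.mean():.3f}")

# Pick the round count with the best cross-validated score instead of
# letting boosting run indefinitely (which can overfit the training set).
```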
What is the key difference between kernel density estimation (KDE) and kernel regression?
A. KDE estimates a probability density; kernel regression estimates a function value
B. KDE uses only Gaussian kernels
C. Kernel regression cannot use kernels
D. KDE requires class labels
Explanation:
KDE estimates P(X), while kernel regression estimates the functional relationship ŷ(x) via weighted averages.
Differences between KDE and Kernel regression
What each method estimates/answers:
- KDE answers "what is the probability density?" (it answers, 'how are the data distributed?')
- Kernel regression answers "what is the function value or conditional expectation?" (it answers, 'Given X, what is Y?')
- KDE uses kernels to smooth the estimated probability distribution.
- Kernel regression uses kernels to perform weighted local averaging to estimate a conditional relationship between variables.
- Use Kernel Density Estimation when you want to understand how the data is distributed, especially when you do NOT assume the distribution is normal. Example: Estimate the density of customer ages
- Use kernel regression when you want to predict Y from X in a non-parametric, smooth way.
- KDE is unsupervised (it needs only the inputs X).
- Kernel regression is supervised (it needs labeled (X, Y) pairs); see the sketch after this list.
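Below is a plain-NumPy sketch (Gaussian kernel, hand-picked bandwidths, hypothetical data): the KDE returns a density over x, while the Nadaraya-Watson kernel regression returns a predicted y for a given x.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)                  # unlabeled sample for KDE
x_pairs = np.linspace(0, 10, 200)         # inputs for regression
y_pairs = np.sin(x_pairs) + rng.normal(scale=0.3, size=200)  # noisy targets

def gaussian_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

def kde(query, data, bandwidth=0.4):
    # Kernel density estimation: average of kernels centered on the data.
    # Answers "how dense is the data near `query`?" (unsupervised, no labels).
    return gaussian_kernel((query - data) / bandwidth).mean() / bandwidth

def kernel_regression(query, xs, ys, bandwidth=0.5):
    # Nadaraya-Watson estimator: kernel-weighted average of the labels.
    # Answers "what is E[Y | X = query]?" (supervised, needs (x, y) pairs).
    weights = gaussian_kernel((query - xs) / bandwidth)
    return np.sum(weights * ys) / np.sum(weights)

print(kde(0.0, x))                                     # estimated density at 0
print(kernel_regression(np.pi / 2, x_pairs, y_pairs))  # roughly sin(pi/2) = 1
```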
How does boosting affect the complexity of the final decision boundary?
A. Identical to that of each weak learner
B. A weighted combination that can be more complex
C. Always linear, regardless of weak learner type
D. Equivalent to a decision tree of depth 1
Explanation:
Boosting aggregates many weak rules, often resulting in highly nonlinear decision boundaries.
How does boosting affect the complexity of the final decision boundary?
Boosting (e.g., AdaBoost, Gradient Boosting) works by combining many weak learners, typically simple classifiers like decision stumps (depth-1 trees). Each weak learner itself has a simple decision boundary.
But boosting does not just average them; it takes a weighted combination based on each learner’s accuracy. Adding many simple boundaries creates a final decision boundary that can be very complex, often highly nonlinear.
This happens because each new weak learner focuses on misclassified points from previous learners, gradually bending the overall decision surface.
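A short scikit-learn sketch of this effect (the two-moons dataset is just an illustration): a single depth-1 stump draws one axis-aligned boundary, while AdaBoost's weighted combination of 200 stumps fits the nonlinear class structure far better.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Nonlinearly separable toy data.
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

# A single depth-1 stump: one axis-aligned split, a very simple boundary.
stump = DecisionTreeClassifier(max_depth=1).fit(X, y)
print("single stump accuracy:", stump.score(X, y))

# AdaBoost: a weighted combination of many stumps.
boosted = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X, y)
print("boosted stumps accuracy:", boosted.score(X, y))
# The boosted ensemble's decision boundary is far more complex (nonlinear)
# than any individual stump's single threshold.
```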
When can a decision tree's depth exceed the number of training samples n?
A. When each feature is continuous
B. When many features repeat but labels differ
C. When the impurity measure is entropy
D. When pruning is disabled
Explanation:
If identical feature vectors map to conflicting labels, the tree keeps splitting and can exceed depth n.
Why can a decision tree have depth greater than the number of training samples?
Because depth counts the number of splits along a path, not the number of unique samples or unique feature values. Even if features repeat, the tree keeps splitting as long as it can reduce impurity—possibly creating long chains of binary splits, each separating a subset of samples, even if they have identical feature values.
Why does this happen with repeated features?
When features repeat across multiple samples:
- The tree must use the same features repeatedly to separate conflicting labels.
- Each split on a feature that has been previously split becomes less efficient at separating classes.
- The tree exhibits overfitting behavior, attempting to memorize individual samples rather than learn generalizable patterns.
- If samples are identical in their selected features but have different labels, the tree becomes unable to achieve purity through feature thresholds alone.
Decision trees try to make leaves pure. If purity is impossible, depth grows uncontrollably. This is why real systems use max_depth, min_samples_split, and min_samples_leaf: to avoid pathologically overfit trees.
Example:
When the feature values are repeated (e.g. many rows have x = 5) but the labels differ, the tree may keep trying thresholds that slice right at the repeated value. If the algorithm does not enforce a “strictly decreasing impurity” condition, it could accept a split that leaves the dataset unchanged on one side.
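As a rough scikit-learn sketch of why those limits matter (the noisy dataset is hypothetical): an unconstrained tree keeps splitting in pursuit of pure leaves and grows very deep, while max_depth and min_samples_leaf keep it shallow.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data with heavy label noise, so perfect purity is hard to reach.
X, y = make_classification(n_samples=300, n_features=5, n_informative=2,
                           n_redundant=0, flip_y=0.3, random_state=0)

# Unconstrained tree: keeps splitting in pursuit of pure leaves.
deep = DecisionTreeClassifier(random_state=0).fit(X, y)
print("unconstrained depth:", deep.get_depth())

# Regularized tree: depth and leaf-size limits stop pathological growth.
shallow = DecisionTreeClassifier(max_depth=5, min_samples_leaf=10,
                                 random_state=0).fit(X, y)
print("regularized depth:  ", shallow.get_depth())
```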
Why does increasing k in a k-NN classifier often improve test performance?
A. Larger k reduces sensitivity to noise by averaging over more neighbors
B. Larger k forces the classifier to become linear
C. Larger k always guarantees zero training error
D. Larger k makes the classifier equivalent to a decision tree
Explanation:
When k increases, the prediction is based on a majority vote over a larger set of neighbors, which reduces the influence of mislabeled or noisy points. This typically improves generalization by lowering variance, although extremely large k can lead to underfitting.
Larger k reduces sensitivity to noise by averaging over more neighbors.
- Averaging = majority vote – By looking at several nearby points instead of just one, the classifier “averages” their labels. If a few of those neighbours are mislabeled (or are outliers), they are unlikely to dominate the vote.
- Noise reduction – Random fluctuations in the training labels act like noise. Majority voting behaves like a low‑pass filter: it suppresses high‑frequency (noisy) variations while preserving the underlying signal.
- Result on test error – Lower variance ⇒ the learned decision surface is more stable on unseen data, so test error typically goes down (up to a point; if k becomes too large, bias dominates and performance can deteriorate).
Thus, averaging over more neighbours mitigates the effect of noisy or atypical training points, which is why test performance usually improves.
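A brief scikit-learn sketch (the dataset and the 15% of flipped training labels are hypothetical): k = 1 memorizes the noisy training set, while a larger k averages the vote over more neighbours and usually scores better on the clean test set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Inject 15% label noise into the training set only.
flip = rng.random(len(y_train)) < 0.15
y_train_noisy = np.where(flip, 1 - y_train, y_train)

for k in [1, 15]:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train_noisy)
    print(f"k={k:>2}  train acc={knn.score(X_train, y_train_noisy):.3f}  "
          f"test acc={knn.score(X_test, y_test):.3f}")
# Typically: k=1 has perfect training accuracy but lower test accuracy;
# k=15 averages over more neighbors and is less hurt by the noisy labels.
```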
Which statement correctly distinguishes discriminative and generative models?
A. Generative models always achieve lower error
B. Discriminative models directly model P(Y∣X)
C. Generative models require fewer assumptions
D. Discriminative models estimate P(X∣Y)
Explanation:
Discriminative models learn P(Y∣X) or direct decision boundaries. Generative models learn P(X,Y).
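A tiny scikit-learn comparison on hypothetical data: LogisticRegression fits P(Y∣X) directly, while GaussianNB estimates P(Y) and P(X∣Y) and recovers P(Y∣X) through Bayes' rule; both expose class probabilities, but they arrive at them differently.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=4, random_state=0)

# Discriminative: models P(Y | X) directly.
lr = LogisticRegression().fit(X, y)

# Generative: models P(Y) and P(X | Y), then applies Bayes' rule.
gnb = GaussianNB().fit(X, y)

print("GNB class priors P(Y):", gnb.class_prior_)   # learned explicitly
print("LR  P(Y|X) for x0:    ", lr.predict_proba(X[:1]))
print("GNB P(Y|X) for x0:    ", gnb.predict_proba(X[:1]))
```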
