Friday, May 8, 2020

Machine Learning Multiple Choice Questions and Answers 05

Top 5 machine learning quiz questions with explained answers: interview and exam-style questions on machine learning for data scientists.



Machine learning MCQ - Set 05



1. Which of the following is a clustering algorithm in machine learning?

a) Expectation Maximization
b) CART
c) Gaussian Naïve Bayes
d) Apriori

View Answer

Answer: (a) Expectation Maximization
Expectation Maximization (EM) is a clustering algorithm that relies on maximizing the likelihood to find the statistical parameters of the underlying sub-populations in the dataset. Expectation maximization provides an iterative solution to maximum likelihood estimation with latent variables.

CART is a decision tree algorithm.

Gaussian Naïve Bayes is a Bayesian classification algorithm.

Apriori is an association rule learning algorithm.
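
As a concrete illustration, here is a minimal scikit-learn sketch of EM-based clustering with a Gaussian mixture model (the data and all parameters are illustrative):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data drawn from two latent sub-populations.
X = np.vstack([rng.normal(-3, 1, (100, 2)), rng.normal(3, 1, (100, 2))])

# GaussianMixture is fit with EM: the E-step computes posterior
# responsibilities, the M-step re-estimates the parameters by maximum
# likelihood, and the two alternate until the likelihood converges.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

labels = gmm.predict(X)                  # hard cluster assignments
print("estimated means:\n", gmm.means_.round(2))
print("avg. log-likelihood:", gmm.score(X).round(3))
```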


2. The model obtained by applying linear regression on the identified subset of features may differ from the model obtained at the end of the process of identifying the subset during:
a) Best-subset selection
b) Forward stepwise selection
c) Forward stagewise selection
d) All of the above

View Answer

Answer: (c) Forward stagewise selection
Let us assume the data set has p features, of which each method selects k (0 < k < p). If we take the k features identified by forward stagewise selection and apply linear regression to them, the model we obtain may differ from the model produced at the end of the stagewise procedure itself.

This is due to the manner in which the coefficients are built in forward stagewise selection: at each step, the algorithm computes the simple linear regression coefficient of the current residual on the variable having the largest correlation with that residual, and adds it to the current coefficient for that variable. The coefficients are therefore accumulated incrementally and never jointly refit.

There is no such difference for the other two methods, because in both best-subset and forward stepwise selection, the coefficients of the final model come from fitting linear regression on the retained subset of features.
[source: Introduction to machine learning, IITM]
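
The following numpy sketch illustrates the point on synthetic data (the data, the number of stagewise steps, and all variable names are illustrative): the coefficients accumulated by forward stagewise generally differ from an ordinary least squares refit on the same subset of features.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X -= X.mean(axis=0)                      # center features so no intercept is needed
y = X @ np.array([2.0, -1.0, 0.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=n)
y -= y.mean()

# Forward stagewise: repeatedly regress the current residual on the single
# most-correlated predictor and accumulate that simple-regression coefficient.
coef = np.zeros(p)
r = y.copy()
for _ in range(10):                      # illustrative number of steps
    j = np.argmax(np.abs(X.T @ r))       # variable most correlated with residual
    delta = (X[:, j] @ r) / (X[:, j] @ X[:, j])
    coef[j] += delta
    r -= delta * X[:, j]

# Ordinary least squares refit on the subset the stagewise pass selected.
selected = np.flatnonzero(coef)
beta_refit, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)

print("stagewise coefficients: ", coef[selected].round(3))
print("OLS refit, same subset: ", beta_refit.round(3))
```

With correlated predictors the two coefficient vectors do not coincide, which is exactly the difference the question asks about.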

3. You trained a binary classifier model which gives very high accuracy on the training data, but much lower accuracy on validation data. Which of the following may be true?

a) This is an instance of overfitting.
b) This is an instance of underfitting.
c) The training was not well regularized.
d) The training and validation examples are sampled from different distributions.

View Answer

Answer: (a), (c) and (d)
Each of these three is a valid explanation for low validation accuracy: the model may have overfit (memorized) the training data, the training may have lacked the regularization that would curb such overfitting, or the training and validation examples may be drawn from different distributions. Underfitting (b) is ruled out because the training accuracy is very high. See the sketch below.
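
A minimal scikit-learn sketch of the situation, using synthetic noisy data and an unpruned decision tree (all numbers are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)  # noisy labels

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

# An unregularized (unpruned) tree memorizes the training set: near-perfect
# training accuracy, noticeably lower validation accuracy -- overfitting.
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("train accuracy:     ", accuracy_score(y_tr, tree.predict(X_tr)))
print("validation accuracy:", accuracy_score(y_va, tree.predict(X_va)))

# Regularizing the training (here, limiting tree depth) narrows the gap.
pruned = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
print("regularized val acc:", accuracy_score(y_va, pruned.predict(X_va)))
```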

4. What are support vectors?

a) The examples farthest from the decision boundary.
b) The only examples necessary to compute f(x) in an SVM.
c) The class centroids.
d) All the examples that have a non-zero weight αk in an SVM.

View Answer

Answer: (b) and (d)
Only the support vectors (the points lying on the margin, or "gutters") have nonzero weights αk, which makes the solution sparse: f(x) can be computed from the support vectors alone.

A support vector machine attempts to find the hyperplane that "best" separates two classes of points, where "best" means the one yielding the largest margin between the classes. The training points that lie on this margin, and thereby define the hyperplane, are the support vectors.
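
A minimal scikit-learn sketch on synthetic data (all parameters illustrative) makes this sparsity visible: only a handful of the training points end up with nonzero dual weights, and the decision function is computed from those points alone.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Only the support vectors carry nonzero dual weights alpha_k, so f(x)
# is evaluated using them alone, not the full training set.
print("training points:", len(X))
print("support vectors:", len(clf.support_vectors_))   # typically far fewer
print("dual coefficients (y_k * alpha_k):", clf.dual_coef_.round(3))
```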


5. Which of the following is the joint probability of H, U, P, and W described by the given Bayesian network, written as a product of conditional probabilities? [In the network, H and W have no parents, P has the single parent W, and U has parents H and P.]

a) P(H, U, P, W) = P(H) * P(W) * P(P) * P(U)
b) P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(W | H, P)
c) P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(U | H, P)
d) None of the above

View Answer

Answer: (c) P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(U | H, P)
In the given Bayesian network, H and W have no parents, so they contribute the prior probabilities P(H) and P(W).
Node P has an incoming edge from W, contributing the conditional probability P(P | W).
Node U has incoming edges from H and P, contributing the conditional probability P(U | H, P).
The joint probability of a Bayesian network is the product of these factors.
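
A short Python sketch of this factorization, with made-up conditional probability tables (every number below is hypothetical; only the structure of the product matters):

```python
from itertools import product

# Hypothetical CPTs for the network H -> U <- P <- W:
P_H = {True: 0.3, False: 0.7}                      # P(H): no parents
P_W = {True: 0.6, False: 0.4}                      # P(W): no parents
P_P_given_W = {True: 0.8, False: 0.1}              # P(P=True | W=w)
P_U_given_HP = {(True, True): 0.9, (True, False): 0.5,
                (False, True): 0.4, (False, False): 0.05}  # P(U=True | H=h, P=p)

def joint(h, u, p, w):
    """P(H=h, U=u, P=p, W=w) = P(h) * P(w) * P(p | w) * P(u | h, p)."""
    pp = P_P_given_W[w] if p else 1 - P_P_given_W[w]
    pu = P_U_given_HP[(h, p)] if u else 1 - P_U_given_HP[(h, p)]
    return P_H[h] * P_W[w] * pp * pu

# Sanity check: the joint must sum to 1 over all 16 assignments.
print(sum(joint(h, u, p, w) for h, u, p, w in product([True, False], repeat=4)))
```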


**********************

