Machine learning MCQ - Set 05
1. Which of the following is a clustering algorithm in machine learning?
a) Expectation Maximization
b) CART
c) Gaussian Naïve Bayes
d) Apriori
Answer: (a) Expectation Maximization
Expectation Maximization (EM) is a clustering algorithm that relies on maximizing the likelihood to find the statistical parameters of the underlying sub-populations in the dataset. EM provides an iterative solution to maximum likelihood estimation with latent variables.
CART is a decision tree algorithm. Gaussian Naïve Bayes is a Bayesian algorithm. Apriori is an association rule learning algorithm.
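For readers who want to see EM-based clustering in action, here is a minimal Python sketch using scikit-learn's GaussianMixture, which fits a mixture of Gaussians by Expectation Maximization; the two-blob dataset below is a made-up assumption for illustration.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two latent sub-populations with different means (illustrative data).
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(5.0, 1.0, size=(100, 2))])

# EM alternates an E-step (soft assignments) and an M-step (parameter updates).
gm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gm.means_)                # estimated sub-population means
print(gm.predict(X)[:5])        # hard cluster assignments
print(gm.predict_proba(X[:2]))  # soft E-step responsibilities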
2. The model obtained by applying linear regression on the identified subset of features may differ from the model obtained at the end of the process of identifying the subset during
a) Best-subset selection
b) Forward stepwise selection
c) Forward stagewise selection
d) All of the above
Answer: (c) Forward stagewise selection
Assume the dataset has p features, among which each method is used to select k features, 0 < k < p. If we take the k features identified by forward stagewise selection and apply linear regression to them, the model we obtain may differ from the model obtained at the end of the forward stagewise selection process itself. This is due to the manner in which the coefficients are built in this method: at each step, the algorithm computes the simple linear regression coefficient of the residual on the variable having the largest correlation with the residual, and adds it to the current coefficient for that variable. Note that there is no such difference in the other two methods, because in both best-subset and forward stepwise selection, at each step of adding/removing a feature, linear regression is performed on the retained subset of features to learn the coefficients.
[source: Introduction to Machine Learning, IITM]
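The coefficient-building rule described above can be written down directly. Below is a minimal sketch of forward stagewise regression, assuming standardized predictors and a centered response; the function name and step count are illustrative choices, not from the source.

import numpy as np

def forward_stagewise(X, y, n_steps=100):
    """Forward stagewise regression on standardized X and centered y."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.copy()
    for _ in range(n_steps):
        # Pick the variable most correlated with the current residual.
        corr = X.T @ residual
        j = np.argmax(np.abs(corr))
        # Simple linear regression coefficient of the residual on X[:, j] ...
        delta = corr[j] / (X[:, j] @ X[:, j])
        # ... added to the current coefficient for that variable.
        beta[j] += delta
        residual -= delta * X[:, j]
    return beta

Comparing the returned beta with an ordinary least squares fit on the same selected variables makes the difference described in the answer visible.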
3. You trained a binary classifier model which gives very high accuracy on the training data, but much lower accuracy on validation data. Which of the following may be true?
a) This is an instance of overfitting.
b) This is an instance of underfitting.
c) The training was not well regularized.
d) The training and testing examples are sampled from different distributions.
Answer: (a), (c) and (d)
All three are valid reasons for the lower accuracy on the validation data. High training accuracy paired with much lower validation accuracy is the classic symptom of overfitting, which insufficient regularization permits; a distribution mismatch between the training and validation samples can produce the same gap.
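The symptom in this question is easy to reproduce. The sketch below uses illustrative assumptions (synthetic data and an unregularized decision tree) rather than the quiz's own setup; it shows training accuracy far above validation accuracy.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# An unregularized (unbounded-depth) tree memorizes the training set.
clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("train accuracy:", clf.score(X_tr, y_tr))         # typically 1.0
print("validation accuracy:", clf.score(X_val, y_val))  # noticeably lower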
4. What are support vectors?
a) The examples farthest from the decision boundary.
b) The only examples necessary to compute f(x) in an SVM.
c) The class centroids.
d) All the examples that have a non-zero weight αk in an SVM.
Answer: (b) and (d)
Only the support vectors (the points lying on the margin) have nonzero weights αk; this reduces the dimensionality of the solution. A support vector machine attempts to find the line that "best" separates two classes of points, where "best" means the line resulting in the largest margin between the two classes. The points that lie on this margin are the support vectors. More generally, an SVM performs classification by finding the hyperplane that maximizes the margin between the two classes, and the vectors that define this hyperplane are the support vectors.
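A quick way to see options (b) and (d) concretely is to inspect a fitted SVM. This minimal sketch uses a toy, linearly separable dataset (an illustrative assumption); scikit-learn exposes the support vectors and their dual weights directly.

import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data (illustrative).
X = np.array([[0., 0.], [1., 1.], [2., 2.], [5., 5.], [6., 6.], [7., 7.]])
y = np.array([0, 0, 0, 1, 1, 1])

svm = SVC(kernel="linear", C=1.0).fit(X, y)
print(svm.support_vectors_)  # the only examples needed to compute f(x)
print(svm.dual_coef_)        # nonzero alpha_k * y_k, one per support vector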
5. Which of the following is the joint probability of H, U, P, and W described by the given Bayesian network? [note: expressed as the product of the conditional probabilities]
a) P(H, U, P, W) = P(H) * P(W) * P(P) * P(U)
b) P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(W | H, P)
c) P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(U | H, P)
d) None of the above
Answer: (c) P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(U | H, P)
In the given Bayesian network, H and W do not depend on any other nodes, so we take the prior probabilities P(H) and P(W). Node P has an incoming edge from W, hence the conditional probability P(P | W). Node U has incoming edges from H and P, hence the conditional probability P(U | H, P). For a Bayesian network, the joint probability is the product of these probabilities.
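The factorization in answer (c) can be evaluated directly once the conditional probability tables are fixed. In the sketch below, all probability values are made-up assumptions, since the quiz does not give the tables; only the structure P(H) * P(W) * P(P | W) * P(U | H, P) comes from the question.

# Illustrative (made-up) probability tables for binary H, U, P, W.
P_H = {True: 0.3, False: 0.7}                   # P(H): no parents
P_W = {True: 0.4, False: 0.6}                   # P(W): no parents
P_P_given_W = {True: {True: 0.8, False: 0.2},   # P(P | W=w): keyed by w, then p
               False: {True: 0.1, False: 0.9}}
P_U_given_HP = {(True, True):   {True: 0.9,  False: 0.1},   # P(U | H=h, P=p)
                (True, False):  {True: 0.5,  False: 0.5},
                (False, True):  {True: 0.4,  False: 0.6},
                (False, False): {True: 0.05, False: 0.95}}

def joint(h, u, p, w):
    # P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(U | H, P)
    return P_H[h] * P_W[w] * P_P_given_W[w][p] * P_U_given_HP[(h, p)][u]

print(joint(True, True, True, True))  # 0.3 * 0.4 * 0.8 * 0.9 = 0.0864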