Top 5 machine learning quiz questions with answers and explanations, useful for machine learning interviews, data science quizzes, and machine learning exams
Machine learning MCQ - Set 07
1. Which of the following can only be used when training data are linearly separable?
a) Linear hard-margin SVM.
b) Linear logistic regression.
c) Linear soft-margin SVM.
d) The centroid method.
Answer: (a) Linear hard-margin SVM
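A hard-margin SVM works only when the data are completely linearly separable, with no noise or outliers on the wrong side of the boundary. It is called hard-margin because its constraints are strict: every training point must be classified correctly and lie on or outside the margin. If the data are not separable, these constraints cannot all be satisfied and the optimization problem has no feasible solution. The other three methods can still be trained on data that are not linearly separable.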
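As a rough illustration of those strict constraints, the sketch below checks the usual hard-margin condition y_i * (w . x_i + b) >= 1 for a candidate hyperplane. The data, the hyperplane (w, b), and the function name are hypothetical and chosen only for this example; this is a constraint check, not an SVM solver.

```python
def satisfies_hard_margin(w, b, points, labels):
    """Return True if every point meets y_i * (w . x_i + b) >= 1.

    Labels are assumed to be -1 or +1.
    """
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * score < 1:
            return False
    return True


# Hypothetical, linearly separable toy data.
points = [(2.0, 0.0), (3.0, 1.0), (-2.0, 1.0), (-3.0, 0.0)]
labels = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0  # candidate hyperplane: x1 = 0

separable_ok = satisfies_hard_margin(w, b, points, labels)

# Add one positive-labeled point at the midpoint of two negative points.
# It lies inside the negative class, so no hyperplane can separate the classes.
noisy_points = points + [(-2.5, 0.5)]
noisy_labels = labels + [1]
noisy_ok = satisfies_hard_margin(w, b, noisy_points, noisy_labels)

print(separable_ok, noisy_ok)
```

With the single mislabeled point added, the check fails for this hyperplane, and because the point sits between two negative examples it would fail for any other hyperplane as well. A soft-margin SVM handles this case by allowing some constraint violations at a cost.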
2. Consider the Bayesian network given below. How many independent parameters would we need if we made no assumptions about independence or conditional independence?
a) 3
b) 4
c) 7
d) 15
Answer: (d) 15
A model that makes no conditional independence assumptions would need 2^4 − 1 = 15 parameters.
Parameter estimation:
A straightforward representation of the joint probability distribution over n binary variables requires us to represent the probability of every combination of states of these variables. For n binary variables there are 2^n such combinations. Because the probabilities must sum to 1, only 2^n − 1 of them are independent parameters.
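The count can be checked by brute force. This is a minimal sketch that enumerates every joint state of n binary variables and subtracts one for the sum-to-one constraint; the function name is made up for this example.

```python
from itertools import product


def full_joint_parameter_count(n):
    """Independent parameters of an unrestricted joint over n binary variables."""
    # One probability per joint state of the n variables.
    states = list(product([0, 1], repeat=n))
    # The probabilities sum to 1, so the last one is determined by the rest.
    return len(states) - 1


count = full_joint_parameter_count(4)
print(count)
```

For four binary variables this enumerates 16 states and reports 15 free parameters, matching answer (d).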
3. The K-means algorithm:
a) Requires the dimension of the feature space to be no bigger than the number of samples
b) Has the smallest value of the objective function when K = 1
c) Minimizes the within class variance for a given number of clusters
d) Converges to the global optimum if and only if the initial means are chosen as some of the samples themselves
Answer: (c) Minimizes the within class variance for a given number of clusters
The objective of K-means clustering is to minimize the total intra-cluster variance. Within-cluster variance is an easy-to-understand measure of compactness: K-means looks for tight clusters, that is, clusters that minimize the sum of the squared distances between each data point and the center (centroid) of its cluster. The standard algorithm only guarantees a local minimum of this objective, and the result depends on the initial means.
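To make the objective concrete, here is a minimal pure-Python sketch of the standard assign-then-update iteration (Lloyd's algorithm) that records the sum of squared distances after each round. The toy points and the initial means are hypothetical, and the helper names are not from any library.

```python
def squared_distance(p, q):
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep the centroid of an empty cluster
            continue
        new_centroids.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def objective(points, labels, centroids):
    """Sum of squared distances from each point to its cluster's centroid."""
    return sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids = [(0.0, 0.0), (1.0, 0.0)]  # hypothetical initial means (K = 2)

history = []
for _ in range(5):
    labels = assign(points, centroids)
    centroids = update(points, labels, centroids)
    history.append(objective(points, labels, centroids))

# For comparison: the objective with a single cluster (K = 1).
one_label = [0] * len(points)
one_centroid = update(points, one_label, [(0.0, 0.0)])
k1_objective = objective(points, one_label, one_centroid)

print(history, k1_objective)
```

The recorded objective never increases from one round to the next, and the K = 1 value is larger than the K = 2 value on this data, which is why option (b) is wrong.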
4. Which one of the following is equal to P(A, B, C) given Boolean random variables A, B and C, and no independence or conditional independence assumptions between any of them?
a) P(A | B) * P(B | C) * P(C | A)
b) P(C | A, B) * P(A) * P(B)
c) P(A, B | C) * P(C)
d) P(A | B, C) * P(B | A, C) * P(C | A, B)
Answer: (c) P(A, B | C) * P(C)
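By the product rule (the definition of conditional probability), P(A, B, C) = P(A, B | C) * P(C). This holds for any joint distribution, so it needs no independence assumptions. Option (b) is correct only if A and B are independent, and options (a) and (d) are not valid factorizations of the joint.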
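The identity can be checked numerically on any joint table. The sketch below uses a made-up joint distribution over A, B and C (weights 1 to 8, normalized) and exact fractions, then compares option (c) and option (b) against the joint for every outcome. The table and helper names are assumptions for illustration only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint table over Boolean A, B, C: weights 1..8 divided by 36.
outcomes = list(product([0, 1], repeat=3))
joint = {abc: Fraction(i + 1, 36) for i, abc in enumerate(outcomes)}


def prob(**fixed):
    """Marginal probability of a partial assignment, e.g. prob(a=1, c=0)."""
    total = Fraction(0)
    for (a, b, c), p in joint.items():
        values = {"a": a, "b": b, "c": c}
        if all(values[name] == v for name, v in fixed.items()):
            total += p
    return total


def option_c(a, b, c):
    # P(A, B | C) * P(C), with P(A, B | C) = P(A, B, C) / P(C)
    return prob(a=a, b=b, c=c) / prob(c=c) * prob(c=c)


def option_b(a, b, c):
    # P(C | A, B) * P(A) * P(B), with P(C | A, B) = P(A, B, C) / P(A, B)
    return prob(a=a, b=b, c=c) / prob(a=a, b=b) * prob(a=a) * prob(b=b)


c_matches = all(option_c(*abc) == joint[abc] for abc in outcomes)
b_matches = all(option_b(*abc) == joint[abc] for abc in outcomes)

print(c_matches, b_matches)
```

Option (c) reproduces the joint exactly for every outcome, while option (b) does not, because A and B are not independent in this table.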
5. For polynomial regression, which one of these structural assumptions is the one that most affects the trade-off between underfitting and overfitting:
a) The polynomial degree
b) Whether we learn the weights by matrix inversion or gradient descent
c) The assumed variance of the Gaussian noise
d) The use of a constant-term unit input
Answer: (a) the polynomial degree
Choosing the right polynomial degree plays a critical role in the fit of the regression. Arbitrarily high-order polynomials can be a serious abuse of regression analysis. If we choose too high a degree, the chance of overfitting increases significantly, and the model will fail to generalize to unseen data.
A high-degree polynomial can pass closely through more of the training points, so its bias is low but its variance is high. A low-degree polynomial lacks this expressivity, which leads to high bias and underfitting.
[Refer here for
more: Polynomial regression bias-variance tradeoff playground]
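The effect of the degree can be seen on a tiny synthetic example. This sketch assumes a true function y = 2x + 1 and a fixed, hypothetical alternating noise pattern on six training points. It compares a degree-1 least-squares line with a degree-5 polynomial. The degree-5 model is built by Lagrange interpolation, which coincides with the least-squares fit when there are six distinct x values and six coefficients.

```python
def fit_line(xs, ys):
    """Least-squares fit of a degree-1 polynomial; returns a predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Degree n-1 polynomial through all n points (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def true_f(x):
    return 2.0 * x + 1.0  # assumed ground truth for this toy example


train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]  # fixed, hypothetical noise
train_y = [true_f(x) + e for x, e in zip(train_x, noise)]

# Held-out points between the training inputs, without noise.
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [true_f(x) for x in test_x]

line = fit_line(train_x, train_y)
poly5 = fit_interpolant(train_x, train_y)

train_err_line, test_err_line = mse(line, train_x, train_y), mse(line, test_x, test_y)
train_err_poly, test_err_poly = mse(poly5, train_x, train_y), mse(poly5, test_x, test_y)

print("degree 1:", train_err_line, test_err_line)
print("degree 5:", train_err_poly, test_err_poly)
```

The degree-5 polynomial fits the noisy training points with essentially zero error but does worse than the line on the held-out points, which is the overfitting pattern described above.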