Saturday, December 5, 2020

Machine Learning TRUE or FALSE Questions with Answers 18

Machine learning exam questions, ML solved quiz questions, Machine Learning TRUE or FALSE questions, TOP 5 machine learning quiz questions with answers

Machine Learning TRUE / FALSE Questions - SET 18

1. For linearly separable data, a small slack penalty ("C") can hurt the training accuracy when using a linear SVM without a kernel.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

If the optimal values of the α's (say, in the dual formulation) are greater than C, we may end up with a sub-optimal decision boundary with respect to the training examples. Alternatively, a small C can allow large slacks, so the resulting classifier will have a small ‖w‖² but can have non-zero training error.

 

C is a regularization parameter that controls the trade-off between achieving a low training error and a low testing error, i.e., the ability of the classifier to generalize to unseen data. If C is too small, the objective function is free to use large slack values: ‖w‖ stays small, but the training error can become large.

The C parameter controls how outliers are handled: a low C tolerates more outliers (more margin violations), while a high C tolerates fewer.
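The effect is easy to reproduce. Below is a minimal sketch (using scikit-learn, an assumption of this illustration rather than anything the question prescribes) on a hypothetical separable but imbalanced 1-D dataset: with a tiny C the slack is so cheap that the decision boundary collapses onto the majority class.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Separable but imbalanced 1-D data: 50 positives near x = 1.5,
# 5 negatives near x = 0.5.
X = np.vstack([1.5 + 0.1 * rng.standard_normal((50, 1)),
               0.5 + 0.05 * rng.standard_normal((5, 1))])
y = np.array([1] * 50 + [-1] * 5)

for C in (1e-3, 1.0, 1000.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:g}  training accuracy = {clf.score(X, y):.3f}")

# With C = 1e-3, slack is cheap and the boundary is pushed past the
# minority class, so training accuracy drops below 1.0 even though the
# data are separable; with a large C the fit is error-free.
```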

 

2. Ridge regression, weight decay, and Gaussian processes use the same regularizer.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

Ridge regression, weight decay, and Gaussian processes use the same regularizer, ‖w‖².

Regularization

In the context of machine learning, regularization is the process that regularizes or shrinks the coefficients towards zero. In simple words, regularization discourages learning a more complex or flexible model, in order to prevent overfitting.

Regularization may be defined as any change we make to the training algorithm in order to reduce the generalization error but not the training error.

Ridge regression is least-squares regression with an additional penalty term ‖w‖².

Weight decay means shrinking the weights at every learning step; for gradient descent on the squared error, this is equivalent to adding a ‖w‖² penalty to the objective.

A Gaussian process is a generative model in which (for a linear model) the weights of the target function are drawn according to a Gaussian distribution; the negative log of that Gaussian prior again contributes a ‖w‖² term.
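The ridge/weight-decay connection can be checked numerically. The sketch below (plain NumPy; the data and hyperparameters are hypothetical choices) fits ridge regression in closed form and then runs gradient descent with weight decay on the same squared-error objective; both minimize the same ‖w‖²-penalized loss and agree on the weights.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

lam = 1.0  # strength of the ||w||^2 penalty

# Ridge regression, closed form: w = (X^T X + lam I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Weight decay: gradient descent on the squared error, shrinking w each step.
w = np.zeros(3)
lr = 0.001
for _ in range(10000):
    grad = X.T @ (X @ w - y)    # gradient of the squared-error term
    w -= lr * (grad + lam * w)  # "+ lam * w" is the weight-decay shrinkage

print(w_ridge)  # the two solutions agree up to optimization tolerance
print(w)
```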

 

3. Linear soft-margin SVM can only be used when training data are linearly separable.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

A hard-margin SVM works only when the data are completely linearly separable, without any errors (noise or outliers); in the presence of such errors, either the margin shrinks or the hard-margin SVM fails altogether. The soft-margin SVM was proposed to solve this problem by introducing slack variables; it is an extended version of the hard-margin SVM and can be trained even when the data are not linearly separable.
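As a quick illustration, here is a minimal sketch (scikit-learn assumed; the overlapping blobs are hypothetical data) of a soft-margin linear SVM fitted to data that no hyperplane can separate perfectly:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Two overlapping Gaussian blobs: not linearly separable.
X = np.vstack([rng.normal(loc=-1.0, size=(100, 2)),
               rng.normal(loc=+1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Slack variables absorb the overlapping points; the fit still succeeds.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(f"training accuracy = {clf.score(X, y):.3f}")  # below 1.0, by design
```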

 

4. In linear regression, using an L2 regularization penalty term results in sparser solutions than using an L1 regularization penalty term.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

In linear regression, using an L1 regularization penalty term results in sparser solutions than using an L2 regularization penalty term.

 

L1 regularization adds a penalty equal to the sum of the absolute values of the coefficients. In other words, it limits the size of the coefficients. L1 can yield sparse models, i.e., models in which many coefficients are exactly zero.

L2 regularization adds a penalty equal to the sum of the squares of the coefficients. L2 will not yield sparse models: the coefficients are shrunk towards zero, but they almost never become exactly zero.
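The contrast shows up in a small experiment. The sketch below (scikit-learn assumed; the data shape and the alpha value are hypothetical choices) fits Lasso (L1) and Ridge (L2) to data with only two informative features out of ten:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))
w_true = np.array([3.0, -2.0] + [0.0] * 8)  # only 2 informative features
y = X @ w_true + 0.1 * rng.normal(size=100)

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)

# L1 typically zeroes out the uninformative coefficients exactly;
# L2 only shrinks them, leaving them small but non-zero.
print("exact zeros with L1:", int(np.sum(lasso.coef_ == 0.0)))
print("exact zeros with L2:", int(np.sum(ridge.coef_ == 0.0)))
```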

 

5. Maximum likelihood estimation gives us not only a point estimate, but a distribution over the parameters that we are estimating.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data are most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate.

MLE is a method of estimating the parameters of a statistical model by picking the single parameter vector that maximizes the likelihood function. A distribution over the parameters is what a Bayesian posterior provides; MLE returns only a point estimate.
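A minimal NumPy sketch (the Gaussian sample is hypothetical) makes this concrete: the MLE for a Gaussian is the single pair (mu_hat, sigma_hat), not a distribution over the parameters.

```python
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(loc=5.0, scale=2.0, size=1000)

# Gaussian MLE: closed-form point estimates.
mu_hat = data.mean()                                # MLE of the mean
sigma_hat = np.sqrt(np.mean((data - mu_hat) ** 2))  # MLE of the std (1/n form)

print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
# A Bayesian treatment would instead return a posterior distribution
# over (mu, sigma); MLE returns only this single point.
```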

 

*********************

Related links:

 

Maximum Likelihood Estimation

L1 and L2 regularization

Difference between hard-margin and soft-margin SVM

Regularization in ridge regression

What is slack variable

Differentiate between L1 and L2 regularization 
