Multiple Choice Questions (MCQ) in Natural Language Processing (NLP) with answers, NLP Quiz questions with answers

NLP MCQ with answers

1. An N-gram is a contiguous sequence of N words. How many bigrams can be generated from the given sentence: Gandhiji is the father of our nation

a) 7

b) 6

c) 8

d) 9

Answer: (b)

Bigrams are sequences of two words that appear next to each other in a sentence. A sentence of n words yields n - 1 bigrams.

In the given sentence, we have 6 bigrams, ‘Gandhiji is’, ‘is the’, ‘the father’, ‘father of’, ‘of our’, and ‘our nation’.
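The count can be checked with a few lines of Python. This is a minimal sketch that assumes simple whitespace tokenization; the helper name `ngrams` is chosen here for illustration and is not taken from any particular library.

```python
def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "Gandhiji is the father of our nation"
tokens = sentence.split()      # assumption: split on whitespace only
bigrams = ngrams(tokens, 2)

for pair in bigrams:
    print(" ".join(pair))
print(len(bigrams))            # 7 tokens give 7 - 1 = 6 bigrams
```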

2. Which of the following techniques can be used for the purpose of keyword normalization, the process of converting a keyword into its meaningful base form?

a) Lemmatization

b) Levenshtein distance

c) Morphing

d) Stemming

Answer: (a)

Lemmatization is the process of mapping an inflected or derived word to its base form (the lemma). The base form is itself a meaningful word.

Stemming is a similar process, but the stem it produces need not be a meaningful word. Levenshtein distance measures the edit distance between two strings and does not normalize a word.
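The difference can be illustrated with a toy sketch. Both the suffix list and the lookup table here are hypothetical and far simpler than real stemmers or lemmatizers; they only show that suffix stripping may leave a non-word, while a dictionary-based mapping returns a valid base form.

```python
# Hypothetical suffix list for a crude stemmer (not the Porter algorithm).
SUFFIXES = ("ies", "ing", "es", "ed", "s")

# Hypothetical lemma table; a real lemmatizer uses a full lexicon.
TOY_LEMMAS = {"studies": "study", "studying": "study", "better": "good"}


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word


def toy_lemmatize(word):
    """Look the word up in the lemma table; return it unchanged if absent."""
    return TOY_LEMMAS.get(word, word)


for w in ("studies", "better"):
    print(w, naive_stem(w), toy_lemmatize(w))
```

With these toy rules, "studies" is stemmed to "stud", which is not its base form, while the lookup returns "study".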

3. In which of the following areas can NLP be useful?

a) Automatic text summarization

b) Automatic question answering systems

c) Information retrieval

d) All of the above

Answer: (d)

All of the given options are common applications of natural language processing, and they apply to any natural language.

4. Which of the following segmentations is produced by the Maximum Matching algorithm (greedy, forward pass only) for the string thetabledownthere?

a) the table down there

b) theta bled own there

c) both (a) and (b)

d) None of the above

Answer: (b)

The maximum matching algorithm is a greedy algorithm that requires a dictionary of the language. It starts by pointing at the beginning of the string and chooses the longest dictionary word that matches the input at the current position. The pointer is then advanced past every character of that word. If no word matches, the pointer is instead advanced by one character (creating a one-character word). The algorithm is then applied again from the new pointer position. For thetabledownthere, assuming the dictionary contains theta, the longest match at the start is theta rather than the, so the remainder is segmented as bled, own, and there.

[Source: Speech and Language Processing by Jurafsky and Martin]
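The procedure described above can be sketched as follows. The dictionary is a small hypothetical word set chosen only to cover this example; a real segmenter would use a full lexicon.

```python
def max_match(text, dictionary):
    """Greedy forward maximum matching over an unsegmented string."""
    words = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for end in range(len(text), pos, -1):
            if text[pos:end] in dictionary:
                words.append(text[pos:end])
                pos = end
                break
        else:
            # No dictionary word starts here: emit a one-character word.
            words.append(text[pos])
            pos += 1
    return words


# Hypothetical toy dictionary, not taken from the source.
TOY_DICTIONARY = {"the", "theta", "table", "bled", "down", "own", "there"}

segmented = max_match("thetabledownthere", TOY_DICTIONARY)
print(" ".join(segmented))  # theta bled own there
```

Because the algorithm never backtracks, the early choice of theta rules out the more natural reading "the table down there".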

5. You have collected about 10,000 rows of tweet text and no other information. You want to build a model that classifies each tweet into one of three buckets: positive, negative, and neutral. Which of the following models can perform this classification given only the data described above?

a) Naïve Bayes

b) Support Vector Machine

c) Language model

d) None of the above

Answer: (d)

The question gives us only the tweet text, not a class label for each row. Without a corpus labeled with the categories positive, negative, and neutral, we cannot train a tweet classifier.
