✅ Short Description:
Explore Week 2 assignment answers for the Natural Language Processing NPTEL course. This week covers concepts like Zipf’s Law, n-gram models, Levenshtein distance, smoothing techniques, and perplexity. Each question is accompanied by an explanation to support your learning.
1. According to Zipf’s law, which statement(s) is/are correct?
(i) A small number of words occur with high frequency.
(ii) A large number of words occur with low frequency.
Options:
a. Both (i) and (ii) are correct
b. Only (ii) is correct
c. Only (i) is correct
d. Neither (i) nor (ii) is correct
Answer:✅ a
Explanation: Zipf’s Law states that in a corpus, a few words dominate in frequency, while the majority occur rarely. Hence, both (i) and (ii) are true.
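The rank–frequency pattern is easy to see empirically. Here is a hedged toy sketch (the mini-corpus below is made up purely for illustration): ranking words by frequency shows a few dominant words at the top and many rare words in the tail, with frequency × rank staying roughly constant.

```python
from collections import Counter

# Toy corpus (illustrative only, not from the assignment).
text = ("the cat sat on the mat and the dog sat on the log "
        "the cat and the dog played near the mat").split()

counts = Counter(text)
for rank, (word, freq) in enumerate(counts.most_common(), start=1):
    # Under Zipf's law, freq * rank is roughly constant.
    print(f"rank={rank:2d}  word={word:8s}  freq={freq}  freq*rank={freq * rank}")
```

Even in this tiny sample, one word ("the") dominates while several words occur only once, matching both statements (i) and (ii).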
2. (Question text not included in the source.)
Answer:✅ a
Explanation: (No question text is available to explain.)
3. A 4-gram model is a ____________ order Markov Model.
Options:
a. Two
b. Five
c. Four
d. Three
Answer:✅ d
Explanation: A 4-gram model depends on the previous three words, which makes it a third-order Markov model.
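This correspondence (an n-gram model is an (n−1)-order Markov model) can be made concrete with a short sketch: a 4-gram model predicts each word from a context of exactly three preceding words.

```python
def ngram_contexts(tokens, n=4):
    # For an n-gram model, each prediction conditions on the
    # previous n-1 words (here n-1 = 3, i.e. third-order Markov).
    return [(tuple(tokens[i:i + n - 1]), tokens[i + n - 1])
            for i in range(len(tokens) - n + 1)]

tokens = "they play in a big garden".split()
for context, target in ngram_contexts(tokens):
    print(context, "->", target)
```

Each printed context has length 3, confirming that the 4-gram model "remembers" only the previous three words.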
4. Which of these is/are valid Markov assumptions?
Options:
a. Depends only on the current word
b. Depends only on the previous word
c. Depends only on the next word
d. Depends only on the current and the previous word
Answer:✅ b
Explanation: The standard Markov assumption in NLP states that the next word depends only on the previous word(s); a first-order model conditions on exactly the previous word, matching option (b).
5. For the string ‘mash’, which of the following strings have Levenshtein distance 1?
Options:
a. smash, mas, lash, mushy, hash
b. bash, stash, lush, flash, dash
c. smash, mas, lash, mush, ash
d. None of the above
Answer:✅ c
Explanation: Levenshtein distance 1 means exactly one character edit (an insertion, deletion, or substitution). Every word in option (c) differs from ‘mash’ by a single edit.
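The claim is easy to verify with the standard dynamic-programming edit distance (a minimal sketch with unit costs for all three operations):

```python
def levenshtein(a, b):
    # Single-row dynamic programming: dp[j] holds the edit distance
    # between a prefix of `a` and the first j characters of `b`.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

for word in ["smash", "mas", "lash", "mush", "ash"]:
    print(word, levenshtein("mash", word))  # each prints 1
```

Running this shows distance 1 for every word in option (c): smash (insert ‘s’), mas (delete ‘h’), lash (substitute ‘m’→‘l’), mush (substitute ‘a’→‘u’), ash (delete ‘m’).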
6. Which strings have edit distance 1 from ‘clash’ if insertion/deletion = 1, substitution = 2?
Options:
a. ash, slash, clash, flush
b. flash, stash, lush, blush
c. slash, last, bash, ash
d. None of the above
Answer:✅ d
Explanation: With substitution costing 2, none of these options yield a total edit distance of exactly 1 from ‘clash’.
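This can be checked by generalizing the edit-distance recurrence to weighted operations (a hedged sketch; with substitution at cost 2, a substitution is never cheaper than a delete plus an insert):

```python
def weighted_edit(a, b, ins=1, delete=1, sub=2):
    # Full DP table with configurable operation costs.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * delete
    for j in range(1, n + 1):
        dp[0][j] = j * ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(dp[i - 1][j] + delete,
                           dp[i][j - 1] + ins,
                           dp[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else sub))
    return dp[m][n]

for word in ["ash", "slash", "flash", "bash", "lush"]:
    print(word, weighted_edit("clash", word))
```

Every candidate comes out at cost 2 or more (e.g. ‘slash’ needs one substitution at cost 2, ‘ash’ needs two deletions), so no option achieves exactly 1.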
7. Given MLE = 0.45 for “dried berries”, count(“dried”) = 720, smoothed likelihood = 0.05. Vocabulary size?
Options:
a. 4780
b. 3795
c. 4955
d. 5780
Answer:✅ d
Explanation: Using the add-one smoothing formula:
P(w2 | w1) = (count(w1 w2) + 1) / (count(w1) + V)
From the MLE, count(“dried berries”) = 0.45 × 720 = 324. Solving 0.05 = (324 + 1) / (720 + V) gives V = 5780.
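The arithmetic can be spelled out step by step (a worked check of the add-one smoothing equation above):

```python
# Given values from the question.
mle = 0.45            # count("dried berries") / count("dried")
count_w1 = 720        # count("dried")
smoothed = 0.05       # (count("dried berries") + 1) / (count("dried") + V)

bigram_count = mle * count_w1            # 0.45 * 720 = 324
V = (bigram_count + 1) / smoothed - count_w1  # (325 / 0.05) - 720
print(int(V))  # 5780
```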
8. Calculate P(they play in a big garden) using a bi-gram model.
Options:
a. 1/8
b. 1/12
c. 1/24
d. None of the above
Answer:✅ b
Explanation: The sentence probability is the product of its component bi-gram probabilities (chain rule with a first-order Markov assumption), which multiplies out to 1/12.
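The chain-rule computation looks like the sketch below. Note the bi-gram probabilities here are placeholders: the training corpus for this question is not reproduced in this post, so these values are chosen only to illustrate the mechanics (they happen to multiply to the stated answer of 1/12).

```python
from math import prod

sentence = "<s> they play in a big garden </s>".split()

# Hypothetical P(w_i | w_{i-1}) values -- NOT the actual corpus counts,
# which are not shown in this answer summary.
probs = {("<s>", "they"): 1/2, ("they", "play"): 1/2,
         ("play", "in"): 1, ("in", "a"): 1,
         ("a", "big"): 1/3, ("big", "garden"): 1,
         ("garden", "</s>"): 1}

# Multiply the conditional probability of each adjacent word pair.
p = prod(probs[(w1, w2)] for w1, w2 in zip(sentence, sentence[1:]))
print(p)  # 1/12 under these placeholder values
```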
9. Perplexity of sequence: <s> they play in a big garden </s>?
Options:
a. 2.289
b. 1.426
c. 1.574
d. 2.178
Answer:✅ b
Explanation: Perplexity = P^(−1/N), equivalently 2^(−average log₂ probability), where N is the number of predicted tokens. Lower perplexity indicates a better language-model fit; here it works out to ≈ 1.426.
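Taking the sentence probability P = 1/12 from Q8, the stated answer falls out directly if N = 7 predicted tokens are assumed (the six words plus the end-of-sentence token):

```python
# Perplexity as inverse probability normalized by sequence length:
#   PP = P ** (-1/N)
# Assumption: N = 7 (six words plus </s> are each predicted once).
P = 1 / 12
N = 7
pp = P ** (-1 / N)
print(round(pp, 3))  # 1.426
```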
10. Bi-gram model with add-one smoothing: P(they play in a beautiful garden)?
Options:
a. 4.472 × 10⁻⁶
b. 2.236 × 10⁻⁶
c. 3.135 × 10⁻⁶
d. None of the above
Answer:✅ b
Explanation: Add-one smoothing reduces zero probabilities for unseen n-grams. Calculated result ≈ 2.236 × 10⁻⁶.
✅ Conclusion:
This week’s NLP assignment focused on statistical language modeling concepts. You learned how to apply Zipf’s law, Markov assumptions, bi-gram models, and smoothing techniques effectively to language tasks.
👉 For all updated answers weekly, visit the Natural Language Processing Week 2 NPTEL Answers page and stay ahead in your NPTEL journey.
