Introduction to Large Language Models Week 3 Answers

Are you looking for Introduction to Large Language Models Week 3 Answers? Answers for all weeks of Introduction to Large Language Models are available here.



Introduction to Large Language Models Week 3 Answers (July-Dec 2025)

Course link: Click here


Question 1. In backpropagation, which method is used to compute the gradients?
a) Gradient descent
b) Chain rule of derivatives
c) Matrix factorization
d) Linear regression

Answer: b) Chain rule of derivatives
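Backpropagation computes each parameter's gradient by applying the chain rule through the composed layers. As a quick sanity check (a minimal sketch, not from the course materials; the function `f` is just an illustrative composite), the analytic chain-rule gradient of f(x) = σ(x²) can be compared against a central finite difference:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def f(x):
    # A small composite function: f(x) = sigmoid(x^2)
    return sigmoid(x * x)

def grad_f(x):
    # Chain rule: df/dx = sigma'(u) * du/dx with u = x^2,
    # where sigma'(u) = sigma(u) * (1 - sigma(u))
    s = sigmoid(x * x)
    return s * (1.0 - s) * 2.0 * x

x = 0.7
eps = 1e-6
numeric = (f(x + eps) - f(x - eps)) / (2.0 * eps)  # central difference
print(abs(numeric - grad_f(x)) < 1e-6)
```

The numeric and analytic gradients agree to high precision, which is exactly what backpropagation exploits at every layer.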


Question 2. Which of the following functions is not differentiable at zero?
a) Sigmoid
b) Tanh
c) ReLU
d) Linear

Answer: c) ReLU
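ReLU has a kink at zero: its left derivative is 0 and its right derivative is 1, so no single derivative exists there. A short numerical illustration (sketch) using one-sided difference quotients:

```python
def relu(x):
    return max(0.0, x)

eps = 1e-6
# One-sided difference quotients at x = 0
left_slope = (relu(0.0) - relu(-eps)) / eps    # approaches 0
right_slope = (relu(eps) - relu(0.0)) / eps    # approaches 1
print(left_slope, right_slope)  # 0.0 1.0 -- they disagree, so no derivative at 0
```

In practice frameworks simply pick a subgradient (usually 0 or 1) at that single point.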


Question 3. In the context of regularization, which of the following statements is true?
a) L2 regularization tends to produce sparse weights
b) Dropout is applied during inference to improve accuracy
c) L1 regularization adds the squared weight penalties to the loss function
d) Dropout prevents overfitting by randomly disabling neurons during training

Answer: d) Dropout prevents overfitting by randomly disabling neurons during training
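Dropout zeroes random activations during training only; at inference it is a no-op. A minimal sketch of "inverted" dropout (function and parameter names here are illustrative), where surviving units are scaled by 1/(1−p) so the expected activation matches inference:

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each unit with probability p during training,
    scaling survivors by 1/(1-p); identity function at inference."""
    if not training:
        return list(activations)  # dropout is disabled at inference
    return [0.0 if random.random() < p else a / (1.0 - p)
            for a in activations]

acts = [0.2, 0.8, 1.5, 0.1]
print(dropout(acts, p=0.5, training=False))  # [0.2, 0.8, 1.5, 0.1] -- unchanged
```

This is why option b) is wrong: applying dropout at inference would just add noise, not improve accuracy.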


Question 4. Which activation function is least likely to suffer from vanishing gradients?
a) Tanh
b) Sigmoid
c) ReLU

Answer: c) ReLU
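The sigmoid's derivative never exceeds 0.25, so a gradient passed through many sigmoid layers shrinks geometrically, while ReLU's derivative is exactly 1 on positive inputs. A back-of-the-envelope comparison (a sketch of the best case for each activation):

```python
depth = 20  # number of stacked layers

# Best-case per-layer gradient factors:
# sigmoid'(x) peaks at 0.25 (at x = 0); relu'(x) = 1 for x > 0
sigmoid_gradient = 0.25 ** depth
relu_gradient = 1.0 ** depth

print(sigmoid_gradient)  # ~9.1e-13: effectively vanished after 20 layers
print(relu_gradient)     # 1.0: fully preserved
```

Even in the sigmoid's best case the gradient collapses by twelve orders of magnitude over 20 layers, which is the vanishing-gradient problem in a nutshell.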


Question 5. Which of the following equations correctly represents the derivative of the sigmoid function?
a) σ(x) · (1 + σ(x))
b) σ(x)²
c) σ(x) · (1 − σ(x))
d) 1 / (1 + e^x)

Answer: c) σ(x) · (1 − σ(x))
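The identity σ'(x) = σ(x)(1 − σ(x)) can be verified numerically at a few points (a minimal sketch):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

eps = 1e-6
for x in (-2.0, 0.0, 1.5):
    analytic = sigmoid(x) * (1.0 - sigmoid(x))             # option c)
    numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2.0 * eps)
    print(x, abs(analytic - numeric) < 1e-8)  # True at every point
```

This identity is also why the sigmoid gradient peaks at 0.25 (at x = 0), which feeds directly into the vanishing-gradient discussion in Question 4.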


Question 6. What condition must be met for the Perceptron learning algorithm to converge?
a) Learning rate must be zero
b) Data must be non-linearly separable
c) Data must be linearly separable
d) Activation function must be sigmoid

Answer: c) Data must be linearly separable
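The perceptron convergence theorem guarantees convergence only on linearly separable data. A small demonstration (a sketch; the training loop and names are illustrative): AND is linearly separable and converges, XOR is not and never does:

```python
def perceptron_train(X, y, lr=1.0, max_epochs=100):
    """Train a perceptron; returns (weights, bias, converged)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            if pred != yi:  # update weights only on mistakes
                mistakes += 1
                w = [wj + lr * (yi - pred) * xj for wj, xj in zip(w, xi)]
                b += lr * (yi - pred)
        if mistakes == 0:  # a full error-free pass means convergence
            return w, b, True
    return w, b, False

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(perceptron_train(X, [0, 0, 0, 1])[2])  # AND: linearly separable -> True
print(perceptron_train(X, [0, 1, 1, 0])[2])  # XOR: not separable -> False
```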


Question 7. Which of the following logic functions requires a network with at least one hidden layer to model?
a) AND
b) OR
c) NOT
d) XOR

Answer: d) XOR
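XOR is not linearly separable, so no single-layer perceptron can compute it, but one hidden layer suffices. A hand-wired sketch (the particular weights and thresholds are just one illustrative choice) using XOR = AND(OR, NAND):

```python
def step(z):
    # Heaviside step activation
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    """XOR with one hidden layer: XOR = AND(OR, NAND)."""
    h1 = step(x1 + x2 - 0.5)      # hidden unit 1 computes OR
    h2 = step(-x1 - x2 + 1.5)     # hidden unit 2 computes NAND
    return step(h1 + h2 - 1.5)    # output unit ANDs the two

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_net(a, b))  # outputs 0, 1, 1, 0
```

AND, OR, and NOT each have a separating hyperplane, which is why they need no hidden layer.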


Question 8. Why is it necessary to include non-linear activation functions between layers in an MLP?
a) Without them, the network is just a linear function
b) They prevent overfitting
c) They allow backpropagation to work

Answer: a) Without them, the network is just a linear function
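Without a non-linearity, stacking layers buys nothing: two linear maps compose into a single linear map. A small sketch (matrices chosen arbitrarily for illustration) showing W₂(W₁x) = (W₂W₁)x:

```python
def matvec(W, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

W1 = [[1.0, 2.0], [3.0, 4.0]]    # first "layer" (no activation)
W2 = [[0.5, -1.0], [2.0, 0.0]]   # second "layer" (no activation)
x = [1.0, -2.0]

# Two stacked linear layers...
two_layers = matvec(W2, matvec(W1, x))
# ...equal one linear layer whose weights are the product W2 @ W1
one_layer = matvec(matmul(W2, W1), x)
print(two_layers, one_layer, two_layers == one_layer)
```

Inserting a non-linear activation between the layers breaks this collapse and lets the MLP represent functions like XOR.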


Question 9. What is typically the output activation function for an MLP solving a binary classification task?
a) Tanh
b) ReLU
c) Sigmoid
d) Softmax

Answer: c) Sigmoid
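The sigmoid squashes the output logit into (0, 1), which can be read directly as the probability of the positive class and thresholded at 0.5. A minimal sketch (the `predict` helper is illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(logit, threshold=0.5):
    p = sigmoid(logit)  # probability of the positive class, always in (0, 1)
    return (1 if p >= threshold else 0), p

label, prob = predict(2.0)
print(label, round(prob, 3))  # class 1 with probability ~0.881
```

Softmax plays the analogous role for multi-class outputs; for a single binary output, sigmoid is the standard choice.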


Question 10. Which type of regularization encourages sparsity in the weights?
a) L1 regularization
b) L2 regularization
c) Dropout
d) Early stopping

Answer: a) L1 regularization
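The mechanism behind the sparsity is visible in the per-weight update each penalty induces: L1's proximal step (soft-thresholding) snaps small weights exactly to zero, while L2's shrinkage only rescales them. A sketch (function names are illustrative; `lam` is the regularization strength):

```python
def l1_step(w, lam):
    # Soft-thresholding: the proximal operator of lam * |w|
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0  # weights smaller than lam become exactly zero

def l2_step(w, lam):
    # L2 shrinkage only scales the weight; it never reaches zero
    return w / (1.0 + lam)

w = 0.05
print(l1_step(w, lam=0.1))  # 0.0    -> sparse
print(l2_step(w, lam=0.1))  # ~0.045 -> small but nonzero
```

This is also why option a) in Question 3 is false: L2 keeps all weights small and dense rather than sparse.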



Click here for all NPTEL assignment answers
