Deep Learning Week 9 Nptel Assignment Answers
Are you looking for Deep Learning Week 9 Nptel Assignment Answers? You’ve come to the right place! Access the most accurate answers at Progiez.
NPTEL Deep Learning Week 9 Assignment 9 Answers (Jan-Apr 2025)
Course Link: Click Here
1) What can be a possible consequence of choosing a very small learning rate?
a. Slow convergence
b. Overshooting minima
c. Oscillations around the minima
d. All of the above
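For intuition, here is a minimal plain-Python sketch (illustrative numbers, not from the course) showing that a very small learning rate still converges, just very slowly, rather than overshooting or oscillating:

```python
# Minimize J(theta) = theta^2, whose gradient is dJ/dtheta = 2*theta.
def steps_to_converge(lr, theta=10.0, tol=1e-3, max_steps=100_000):
    for step in range(max_steps):
        if abs(theta) < tol:            # close enough to the minimum at 0
            return step
        theta = theta - lr * 2 * theta  # gradient descent update
    return max_steps

print(steps_to_converge(lr=0.1))     # moderate rate: roughly 40 steps
print(steps_to_converge(lr=0.0001))  # very small rate: tens of thousands of steps
```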
2) The following is the equation of the update vector for the momentum optimizer. Which of the following is true for γ?
$v_t = \gamma v_{t-1} + \eta \nabla_\theta J(\theta)$
a. γ is the momentum term, which indicates acceleration
b. γ is the step size
c. γ is the first-order moment
d. γ is the second-order moment
3) Which of the following is true about momentum optimizer?
a. It helps accelerating Stochastic Gradient Descent in the right direction
b. It helps prevent unwanted oscillations
c. It helps to know the direction of the next step with knowledge of the previous step
d. All of the above
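For intuition, a minimal plain-Python sketch of the momentum update from question 2 (illustrative γ and η, not any framework's API): the velocity term speeds up movement along a consistent direction while damping oscillations, which is what question 3 describes.

```python
# Momentum update: v_t = gamma * v_{t-1} + eta * grad;  theta_t = theta_{t-1} - v_t
def momentum_descent(grad_fn, theta, gamma=0.9, eta=0.1, steps=200):
    v = 0.0  # velocity: decaying accumulation of past gradients
    for _ in range(steps):
        v = gamma * v + eta * grad_fn(theta)  # gamma is the momentum term
        theta = theta - v                     # step along the smoothed direction
    return theta

# Minimize J(theta) = theta^2 (gradient 2*theta); minimum at theta = 0.
print(momentum_descent(lambda t: 2 * t, theta=10.0))  # prints a value near 0
```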
4) Let J(θ) be the cost function and let the gradient descent update rule for θ be $\theta_{t+1} = \theta_t - \alpha\,\Delta\theta$, where α is the learning rate. What is the correct expression for $\Delta\theta$?
a. $\frac{\partial J(\theta)}{\partial \theta}$
b. $\alpha \frac{\partial J(\theta)}{\partial \theta}$
c. $\frac{\partial J(\theta)}{\partial \theta}\,\alpha$
d. None of the above
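For reference, standard gradient descent steps along the cost gradient scaled by the learning rate, so in this notation the update expands to:

$$\theta_{t+1} = \theta_t - \alpha\,\frac{\partial J(\theta)}{\partial \theta}$$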
5) A given cost function is of the form $J(\theta) = \theta^2 - 6\theta + 6$. What is the weight update rule for gradient descent optimization at step t+1? Consider α to be the learning rate.
a. $\theta_{t+1} = \theta_t - \alpha(2\theta - 1)$
b. $\theta_{t+1} = \theta_t + \alpha(2\theta)$
c. $\theta_{t+1} = \theta_t - \alpha(12\theta - 6 + 6)$
d. $\theta_{t+1} = \theta_t - \alpha(2\theta + 1)$
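As a sanity check, the gradient of this cost is $\frac{\partial J}{\partial \theta} = 2\theta - 6$, so the minimum sits at θ = 3; a few lines of plain Python (illustrative α) confirm the descent settles there:

```python
# J(theta) = theta**2 - 6*theta + 6  =>  dJ/dtheta = 2*theta - 6, minimum at theta = 3
theta, alpha = 0.0, 0.1
for _ in range(100):
    theta = theta - alpha * (2 * theta - 6)  # theta_{t+1} = theta_t - alpha * dJ/dtheta
print(theta)  # approaches 3.0
```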
6) If the first few iterations of gradient descent cause the function J(θ₀, θ₁) to increase rather than decrease, then what could be the most likely cause for this?
a. We have set the learning rate to too large a value
b. We have set the learning rate to zero
c. We have set the learning rate to a very small value
d. Learning rate is gradually decreased by a constant value after every epoch
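To see this concretely, here is a minimal sketch (plain Python, illustrative rates) where an oversized learning rate makes every step overshoot so badly that J increases instead of decreasing:

```python
# J(theta) = theta^2: with this cost, any learning rate above 1.0 diverges.
def run(lr, theta=1.0, steps=5):
    for step in range(steps):
        theta = theta - lr * 2 * theta  # gradient descent update
        print(f"lr={lr}: step {step}, J={theta ** 2:.3f}")

run(lr=0.1)  # J shrinks toward 0
run(lr=1.5)  # J grows every step: the rate is too large
```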
7) For a function J(θ₀, θ₁), if θ₀ and θ₁ are initialized at a global minimum, then what should be the values of θ₀ and θ₁ after a single iteration of gradient descent?
a. $\theta_0$ and $\theta_1$ will update as per the gradient descent rule
b. $\theta_0$ and $\theta_1$ will remain the same
c. Depends on the values of $\theta_0$ and $\theta_1$
d. Depends on the learning rate
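At a global minimum the gradient is zero, so the update subtracts α · 0 and leaves the parameters where they are; a quick check using the cost from question 5:

```python
theta, alpha = 3.0, 0.1       # theta = 3 minimizes theta**2 - 6*theta + 6
grad = 2 * theta - 6          # gradient at the minimum is 0
theta = theta - alpha * grad  # the update is a no-op
print(theta)                  # still 3.0
```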
8) What can be one of the practical problems of exploding gradient?
a. Too large update of weight values leading to an unstable network
b. Too small update of weight values inhibiting the network to learn
c. Too large update of weight values leading to faster convergence
d. Too small update of weight values leading to slower convergence
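A standard remedy worth knowing here is gradient clipping, which caps the gradient magnitude so that one huge gradient cannot blow up the weights; a minimal plain-Python sketch (illustrative threshold, not any specific framework's API):

```python
# Clip a scalar gradient to a maximum magnitude before the update.
def clipped_update(theta, grad, lr=0.01, clip=5.0):
    grad = max(-clip, min(clip, grad))  # cap the gradient at +/- clip
    return theta - lr * grad

print(clipped_update(theta=1.0, grad=1e9))  # update stays bounded: 1.0 - 0.01*5
```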
9) What are the steps for using a gradient descent algorithm?
1. Calculate the error between the actual value and the predicted value
2. Update the weights and biases using the gradient descent formula
3. Pass an input through the network and get values from the output layer
4. Initialize the weights and biases of the network with random values
5. Calculate the gradient value corresponding to each weight and bias
a. 1, 5, 2, 3, 4
b. 5, 1, 3, 2
c. 3, 2, 1, 5, 4
d. 4, 3, 1, 5, 2
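Putting the natural order into code, initialize (4), forward pass (3), error (1), gradients (5), update (2), here is a minimal single-neuron sketch in plain Python (made-up data and rate, purely illustrative):

```python
import random

random.seed(0)
w, b = random.random(), random.random()  # step 4: random initialization
x, y_true, lr = 2.0, 7.0, 0.05           # toy task: fit y = w*x + b to one point

for _ in range(500):
    y_pred = w * x + b                       # step 3: forward pass
    error = y_pred - y_true                  # step 1: error vs. actual value
    grad_w, grad_b = error * x, error        # step 5: gradients of 0.5 * error**2
    w, b = w - lr * grad_w, b - lr * grad_b  # step 2: gradient descent update

print(w * x + b)  # close to 7.0
```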
10) You run gradient descent for 15 iterations with learning rate α = 0.3 and compute the error after each iteration. You find that the error decreases very slowly. Based on this, which of the following conclusions seems most plausible?
a. Rather than using the current value of α, use a larger value of α
b. Rather than using the current value of α, use a smaller value of α
c. Keep α = 0.3
d. None of the above
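Reproducing the scenario in miniature: on a flat, poorly scaled cost, 15 iterations at α = 0.3 barely move the error, while a larger α makes visible progress (plain Python, illustrative cost and rates):

```python
# A poorly scaled quadratic: J(theta) = 0.001 * theta**2, gradient 0.002 * theta.
def final_error(alpha, theta=100.0, iters=15):
    for _ in range(iters):
        theta = theta - alpha * 0.002 * theta  # gradient descent update
    return 0.001 * theta ** 2

print(final_error(alpha=0.3))   # error barely decreases from 10.0
print(final_error(alpha=30.0))  # a larger alpha drops the error noticeably
```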
These are NPTEL Deep Learning Week 9 Assignment 9 Answers
More weeks of Deep Learning: Click Here
More Nptel Courses: https://progiez.com/nptel