Deep Learning IIT Ropar Week 2 Nptel Assignment Answers 2026
Are you looking for the Deep Learning IIT Ropar Week 2 NPTEL Assignment Answers 2026? You’ve come to the right place! Access the most accurate and up-to-date solutions for your Week 2 assignment in the Deep Learning course offered by IIT Ropar.

Deep Learning IIT Ropar Week 2 Nptel Assignment Answers (Jan-Apr 2026)
Que.1
Which property of the sigmoid function makes it suitable for modeling fraud probability?
a) It produces only binary outputs
b) It is non-monotonic
c) It is continuous and differentiable
d) It does not depend on model parameters
Que.2
For transaction T2, compute the value of z.
a) Option a
b) Option b
c) Option c
d) Option d
Que.3
The decision rule p̂ ≥ 0.5 is equivalent to which condition?
a) z ≥ 1
b) z ≥ 0
c) z ≤ 0
d) z ≤ −1
Que.4
Which transactions trigger step-up authentication?
(Select all that apply.)
a) T1
b) T2
c) T3
d) T4
Que.5
Holding x₂ constant, increasing b shifts the sigmoid activation curve along the x₁-axis in which direction?
a) Right
b) Left
c) Up
d) Down
Que.6
What is the objective of training this model?
a) Maximize the loss
b) Minimize the loss
c) Keep the loss constant
d) None of the above
Que.7
If y = 1 and p̂ = 0.2, what is the loss?
a) 0.4
b) 0.5
c) 0.32
d) 0.8
Que.8
Which statement about the sigmoid output is correct?
a) It can be interpreted as a probability
b) It always equals 0 or 1
c) It is undefined for negative inputs
d) It is non-differentiable
Que.9
Fill in the blank:
If wᵀx + b = 0, then p̂ = ______.
a) Option a
b) Option b
c) Option c
d) Option d
Que.10
Which statements are correct for this learning setup?
a) Model parameters are learned from labeled data
b) Gradient descent can be used
c) Sigmoid activation is differentiable
d) Outputs are always binary
e) Loss is minimized during training
Que.11
How many trainable weights connect the input layer to the hidden layer (excluding biases)?
a) 9
b) 20
c) 25
d) 10
Que.12
How many bias parameters exist in the hidden layer?
a) 1
b) 4
c) 5
d) 0
Que.13
How many trainable weights connect the hidden layer to the output neuron (excluding bias)?
a) 1
b) 4
c) 5
d) 20
Que.14
Why is a sigmoid activation function used in the hidden layer?
a) It produces binary outputs
b) It guarantees exact rank prediction
c) It is differentiable and supports gradient-based learning
d) It removes the need for a loss function
Que.15
Which statements about representational capacity are correct?
a) A perceptron network with one hidden layer can represent any Boolean function
b) A single sigmoid neuron can represent XOR exactly
c) A sigmoid network with one hidden layer can approximate any continuous function (given enough neurons)
d) The approximation guarantee holds with a fixed small number of neurons
e) Increasing hidden neurons can increase approximation accuracy
Deep Learning IIT Ropar Week 2 Nptel Assignment Answers (July-Dec 2025)
Question 1. Consider a single perceptron shown below. w = 1, b = −0.5. The perceptron uses a step activation function defined as f(x) = {1 if wx + b ≥ 0, 0 otherwise}. Predict the output for input values 0.51 and 0.49.
a) 1, 0
b) 0, 1
c) 1, 1
d) 0, 0
Question 2. You are given a Boolean function that is not linearly separable. Which of the following is true regarding its representation using a perceptron-based network?
a) It can be represented using a single-layer perceptron if you increase the number of perceptrons.
b) It requires at least one hidden layer in the network.
c) It cannot be represented by any feedforward neural network.
d) It can only be represented by a network with more than 2ⁿ perceptrons.
Question 3. As n increases, representing all Boolean functions using a 2-layer perceptron becomes impractical due to:
a) Increase in training data size
b) Exponential increase in required hidden layer neurons
c) Limitation in backpropagation algorithm
d) Decrease in classification accuracy
Question 4. Which of the following statements are true?
- I. A single-layer perceptron can represent all linearly separable Boolean functions.
- II. XOR requires at least one hidden layer to be represented.
- III. A network with 2ⁿ hidden neurons and one output neuron can represent all Boolean functions over n inputs.
- IV. A single-layer perceptron can represent the XOR function if the learning rate is set appropriately.
Correct options:
✔️ I, II, III
Question 5. Which of the following are true about such a network?
a) It can represent linearly non-separable functions like XOR.
b) The network uses hidden neurons to convert a non-linearly separable function into linearly separable subproblems.
c) Removing any one hidden neuron will not affect the network’s ability to represent XOR.
d) This network must use sigmoid activation in the hidden layer to implement XOR.
Question 6. You are designing a spam filter using a perceptron. Some input features (like the presence of the word “FREE”) are not linearly separable from others. Which architecture is most appropriate for learning from such data?
a) Single-layer perceptron with more training data
b) Multi-layer perceptron with hidden neurons
c) Removing the non-linearly separable features
d) Output layer with more neurons
Question 7. You are given an arbitrary Boolean function defined over 4 binary inputs. Which of the following neural network architectures is guaranteed to represent this function?
a) One perceptron
b) A network with 4 hidden neurons
c) A network with 16 hidden neurons and one output perceptron
d) A network with 5 output neurons
Question 8. For a single input value x = 1.5, w = 2, b = −1, compute the output of the sigmoid neuron up to 2 decimal places.
a) 0.73
b) 0.82
c) 0.67
d) 0.91
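As a quick numerical check for sigmoid-evaluation questions like this one (assuming the usual pre-activation z = wx + b), a couple of lines suffice:

```python
import math

# Sigmoid neuron output for x = 1.5, w = 2, b = -1 (assuming z = wx + b).
w, b, x = 2.0, -1.0, 1.5
z = w * x + b
p = 1 / (1 + math.exp(-z))
print(round(p, 2))
```

The same two lines work for any of the sigmoid computations in this assignment.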
Question 9. In a sigmoid neuron defined as f(x) = σ(wx + b), where σ(z) = 1 / (1 + e^−z). Suppose the weight w is positive. If the bias b is increased, in which direction does the sigmoid curve shift along the x-axis?
a) Upwards
b) Leftwards
c) Downwards
d) Rightwards
Question 10. Which of the following statements are true?
- I. Logistic function is smooth and continuous
- II. Logistic function is differentiable.
a) Only Statement I is true
b) Only Statement II is true
c) Both statements I and II are true
d) None of the above
Question 11. Which of the following statements are true about learning algorithms?
- I. Learning algorithms always maximize a loss function
- II. Learning algorithms learn parameters from data
a) Only Statement I is true
b) Only Statement II is true
c) Both statements I and II are true
d) None of the above
Question 12. Consider a neural network with 12 input features, a hidden layer with 8 neurons, and a single output neuron. All layers are fully connected, and biases are included in both the hidden and output layers. How many gradients must be computed during backpropagation?
a) 101
b) 113
c) 110
d) 105
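Parameter counts like this are easy to get wrong by forgetting a bias term; a minimal sketch of the count (backpropagation computes one gradient per trainable parameter):

```python
# Fully connected 12 -> 8 -> 1 network with biases in both layers.
n_in, n_hidden, n_out = 12, 8, 1

hidden_weights = n_in * n_hidden      # 96
hidden_biases = n_hidden              # 8
output_weights = n_hidden * n_out     # 8
output_bias = n_out                   # 1

total = hidden_weights + hidden_biases + output_weights + output_bias
print(total)  # 113
```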
Question 13. You are evaluating a regression model on a dataset of 3 points. The actual target values and predicted outputs from your model are given. What is the MSE for this model?
a) 1.00
b) 0.67
c) 0.33
d) 2.00
Question 14. You are given a model defined as ŷ = f(x) = 2x + 3. For three input-output pairs, the inputs are x = 0,1,2 and targets y = 4,6,9. What is the MSE?
a) 1.00
b) 1.33
c) 2.00
d) 3.00
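The MSE here can be checked directly by plugging the given points into ŷ = 2x + 3 (a sketch):

```python
xs = [0, 1, 2]
ys = [4, 6, 9]

preds = [2 * x + 3 for x in xs]                              # [3, 5, 7]
mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(xs)  # (1 + 1 + 4) / 3
print(mse)  # 2.0
```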
Question 15. A regression model is tested on 4 data points. Given actual vs predicted values, compute MAE.
a) 0.75
b) 1.00
c) 1.25
d) 1.00
Question 16. You are comparing two models for different function learning tasks. Model A: multilayer perceptrons. Model B: multilayer sigmoid neurons. Task 1: Learn Boolean function. Task 2: Learn sin(x). Which is most appropriate?
a) Model A can represent both tasks with high precision
b) Model A is better for Task 1, Model B is better for Task 2
c) Model B can approximate both Task 1 and Task 2 outputs, but not represent Task 1 exactly
d) Both models are equivalent in their representation abilities

Question 17. A trained model for customer churn shows a near-zero weight for monthly charges. What is the most reasonable inference?
a) Monthly charges had missing values
b) Monthly charges were not normalized
c) Monthly charges may not have contributed significantly
d) Learning rate was too high for that feature
Question 18. In a fraud detection system using a sigmoid neuron with weights: w1 = 3.2, w2 = 0.05, w3 = –0.02. Which feature is most influential?
a) Time of transaction
b) Number of transactions
c) Transaction amount
d) Sigmoid is not functioning properly
Question 19. You’re optimizing f(x) = x² – x + 2 using gradient descent with η = 0.01. What is the update rule?
a) xt+1 = xt − 0.01(2xt − 1)
b) xt+1 = xt + 0.01(2xt)
c) xt+1 = xt − (2xt − 1)
d) xt+1 = xt − 0.01(xt − 1)
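The update rule follows from f′(x) = 2x − 1; a one-step illustration (the starting point x = 0 is an arbitrary assumption for the demo):

```python
eta = 0.01

def grad(x):
    # Derivative of f(x) = x**2 - x + 2.
    return 2 * x - 1

x = 0.0
x = x - eta * grad(x)  # 0 - 0.01 * (-1)
print(x)  # 0.01
```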
Question 20. For f(x) = x³ – 4x + 1, using η = 0.1, what is the update rule in gradient descent?
a) xt+1 = xt − 0.1(3x²t − 4)
b) xt+1 = xt − 0.1(3x²t + 4)
c) xt+1 = xt + 0.1(3x²t − 4)
d) xt+1 = xt + 0.1(3x²t + 4)
Question 21. For f(T, x) = T² + 5x + 20, using gradient descent with η = 1 from (T, x) = (0, 0), what is the value of T after 10 iterations?
a) 50
b) –10
c) 5
d) 0
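Tracing the updates shows why T never moves: ∂f/∂T = 2T, which is zero at the starting point. A minimal sketch:

```python
# Gradient descent on f(T, x) = T**2 + 5*x + 20 with eta = 1 from (0, 0).
eta = 1.0
T, x = 0.0, 0.0
for _ in range(10):
    dT, dx = 2 * T, 5.0          # partial derivatives
    T, x = T - eta * dT, x - eta * dx
print(T)  # 0.0  (gradient w.r.t. T is zero at T = 0, so T stays put)
```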
Question 22. Minimize f(x₁, x₂) = 4x₁² + 5x₂ + 9 using η = 0.5 from (0, 0). What is x₂ after 5 iterations?
a) –2.5
b) –12.5
c) –1
d) –0.5
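Since ∂f/∂x₂ = 5 is constant, each update subtracts η·5 = 2.5 from x₂; five updates give −12.5. Sketch:

```python
# Gradient descent on f(x1, x2) = 4*x1**2 + 5*x2 + 9, eta = 0.5, from (0, 0).
eta = 0.5
x1, x2 = 0.0, 0.0
for _ in range(5):
    g1, g2 = 8 * x1, 5.0         # partial derivatives
    x1, x2 = x1 - eta * g1, x2 - eta * g2
print(x2)  # -12.5
```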
Question 23. Apply gradient descent on f(x₁, x₂) = x₁² + x₂² with η = 0.1 from (1, 2). What’s the updated (x₁, x₂)?
a) (0.9, 1.9)
b) (0.8, 1.6)
c) (1.1, 2.1)
d) (1.0, 2.0)
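One step with ∇f = (2x₁, 2x₂):

```python
# Single gradient-descent step on f(x1, x2) = x1**2 + x2**2 from (1, 2).
eta = 0.1
x1, x2 = 1.0, 2.0
g1, g2 = 2 * x1, 2 * x2
x1, x2 = x1 - eta * g1, x2 - eta * g2
print((round(x1, 2), round(x2, 2)))  # (0.8, 1.6)
```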
Question 24. A logistic regression model gives wᵀx = 2.5. What does this imply?
a) Predicted probability of class 1 > 0.5
b) Predicted label is 0
c) wᵀx is irrelevant
d) Prediction is not possible
Question 25. Logistic regression model with w = [–3, 4], no bias, input x = [1, 1]. What is the output prediction?
a) Predicted label is 1
b) Predicted label is 0
c) Output can’t be determined
d) Output undefined
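With no bias, wᵀx = −3 + 4 = 1 > 0, so the sigmoid output exceeds 0.5. A sketch (thresholding at 0.5 is the usual convention):

```python
import math

w = [-3.0, 4.0]
x = [1.0, 1.0]
z = sum(wi * xi for wi, xi in zip(w, x))  # w^T x = 1.0
p = 1 / (1 + math.exp(-z))                # sigmoid(1) ~ 0.731
label = 1 if p >= 0.5 else 0
print(label)  # 1
```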
Question 26. Which output function is least appropriate for predicting housing prices?
a) ŷ = wᵀx
b) ŷ = log(1 + e^wᵀx)
c) ŷ = sin(wᵀx)
d) ŷ = 1 / (1 + e^–wᵀx)
Question 27. The logistic curve transitions sharply from 0 to 1. What does this indicate about w and b?
a) w is close to 0, b is large
b) w is large, b is small
c) w is large, b is large
d) w is small, b is negative
Question 28. You observe the sigmoid curve transitions very gradually. What change makes it sharper?
a) Increase b
b) Increase w
c) Decrease w
d) Set b = 0
Question 29. Why is SSE (Sum of Squared Errors) preferred over SE (Sum of Errors)?
a) Prevents error cancellation
b) Emphasizes large errors
c) Simple derivative
d) All of the above
Question 30. Which statement(s) are correct?
- I. Any linearly separable function can be represented using a single-layer perceptron.
- II. A single sigmoid neuron can approximate any Boolean function with zero error.
a) Only I
b) Only II
c) Both I and II
d) None
Question 31. You are given a multi-layer perceptron with one hidden layer consisting of 8 perceptrons and a single output neuron. Each perceptron in the hidden layer outputs either 0 or 1 based on its input. Which of the following statements is true about the function capacity of this network?
a) The network is capable of implementing 2⁸ Boolean functions
b) The network is capable of implementing 2⁶⁴ Boolean functions
c) The output neuron receives a continuous-valued input
d) Each hidden neuron produces 64 possible outputs
Deep Learning IIT Ropar Week 2 Nptel Assignment Answers (Jan-Apr 2025)
1) Which of the following statements is(are) true about the following function, σ(z) = 1/(1 + e^(−z))?
Options:
[a] The function is monotonic.
[b] The function is continuously differentiable.
[c] The function is bounded between 0 and 1.
[d] The function attains its maximum when z → ∞.
2) How many weights does a neural network have if it consists of an input layer with 2 neurons, two hidden layers each with 5 neurons, and an output layer with 2 neurons?
Assume there are no bias terms in the network.
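The weight count is the sum of products of consecutive layer sizes; a one-liner to check:

```python
layers = [2, 5, 5, 2]  # input, hidden1, hidden2, output
weights = sum(a * b for a, b in zip(layers, layers[1:]))  # 10 + 25 + 10
print(weights)  # 45
```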
3) A function f(z) is approximated using 100 tower functions. What is the minimum number of neurons required to construct the network that approximates the function?
Options:
[a] 99
[b] 100
[c] 101
[d] 200
[e] 201
[f] 251
4) Suppose we have a Multi-layer Perceptron with an input layer, one hidden layer, and an output layer. The hidden layer contains 32 perceptrons. The output layer contains one perceptron. Choose the statement(s) that are true about the network.
Options:
[a] Each perceptron in the hidden layer can take in only 32 Boolean inputs.
[b] Each perceptron in the hidden layer can take in only 5 Boolean inputs.
[c] The network is capable of implementing 2^5 Boolean functions.
[d] The network is capable of implementing 2^32 Boolean functions.
5) Consider a function f(z) = -5z² + 5. What is the updated value of z after the 2nd iteration of the gradient descent update if the learning rate is 0.1 and the initial value of z is 5?
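The updates can be traced numerically; note that f is concave, so plain gradient descent moves away from the maximum (f′(z) = −10z):

```python
# Two gradient-descent updates on f(z) = -5*z**2 + 5, eta = 0.1, z0 = 5.
eta = 0.1
z = 5.0
for _ in range(2):
    grad = -10 * z       # f'(z) = -10z
    z = z - eta * grad   # each step doubles z: 5 -> 10 -> 20
print(z)  # 20.0
```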
6) Consider the sigmoid function f(z) = 1 / (1 + e^(−(wz + b))), where w is a positive value. Select all the correct statements regarding this function.
Options:
[a] Increasing the value of b shifts the sigmoid function to the right (i.e., towards positive infinity).
[b] Increasing the value of b shifts the sigmoid function to the left (i.e., towards negative infinity).
[c] Increasing the value of w decreases the slope of the sigmoid function.
[d] Increasing the value of w increases the slope of the sigmoid function.
7) You are training a model using the gradient descent algorithm and notice that the loss decreases and then increases after each successive epoch (pass through the data). Which of the following techniques would you employ to enhance the likelihood of the gradient descent algorithm converging?
Options:
[a] Set η = 1
[b] Set η = 0
[c] Decrease the value of η
[d] Increase the value of η
8) The diagram below shows three functions f, g, and h. The function h is obtained by combining the functions f and g. Choose the right combination that generated h.
Options:
[a] h = f − g
[b] h = 0.5 * (f + g)
[c] h = 0.5 * (f − g)
[d] h = 0.5 * (g − f)
9) Consider the data points as shown in the figure below:
(0.4, 0.95),
(0.06, 0.4)
Suppose that the sigmoid function is used to fit these data points. Compute the Mean Square Error (MSE) loss L(w, b).
Options:
[a] 0
[b] 0.126
[c] 1.23
[d] 1.0
10) Suppose that we implement the XOR Boolean function using the network shown below. Consider the statement that “A hidden layer with two neurons is sufficient to implement XOR.” The statement is
Options:
[a] True
[b] False
Session: JULY – DEC 2024
Deep Learning IIT Ropar Week 2 Nptel Assignment Answers
Q1. Which of the following statements is(are) true about the following function?
σ(z) = 1/(1 + e^(−z)).
The function is bounded between 0 and 1
The function attains its maximum when z→∞
The function is continuously differentiable
The function is monotonic
Answer: Updating Soon (in progress)
Q2. How many weights does a neural network have if it consists of an input layer with 2 neurons, three hidden layers each with 4 neurons, and an output layer with 2 neurons? Assume there are no bias terms in the network.
Answer: 48 (2×4 + 4×4 + 4×4 + 4×2 = 8 + 16 + 16 + 8)
For answers or latest updates join our telegram channel: Click here to join
These are Deep Learning IIT Ropar Week 2 Nptel Assignment Answers
Q3. Suppose we have a Multi-layer Perceptron with an input layer, one hidden layer, and an output layer. The hidden layer contains 64 perceptrons. The output layer contains one perceptron. Choose the statement(s) that are true about the network.
The network is capable of implementing 2⁶ Boolean functions
The network is capable of implementing 2⁶⁴ Boolean functions
Each perceptron in the hidden layer can take in only 64 Boolean inputs
Each perceptron in the hidden layer can take in only 6 Boolean inputs
Answer: Updating Soon (in progress)
Q4. Consider a function f(x) = x³ − 4x² + 7m. What is the updated value of x after the 2nd iteration of the gradient descent update, if the learning rate is 0.1 and the initial value of x is 5?
Answer: Updating Soon (in progress)
Q5. You are training a model using the gradient descent algorithm and notice that the loss decreases and then increases after each successive epoch (pass through the data). Which of the following techniques would you employ to enhance the likelihood of the gradient descent algorithm converging? (Here, η refers to the step size.)
Decrease the value of η
Increase the value of η
Set η=1
Set η=0
Answer: Decrease the value of η
Q6. Which of the following statements is true about the representation power of a multilayer network of perceptrons?
A multilayer network of perceptrons can represent any function.
A multilayer network of perceptrons can represent any linear function.
A multilayer network of perceptrons can represent any Boolean function.
A multilayer network of perceptrons can represent any continuous function.
Answer: A multilayer network of perceptrons can represent any Boolean function.
Q7. We have a function that we want to approximate using 150 rectangles (towers). How many neurons are required to construct the required network?
301
451
150
500
Answer: 301
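Assuming the standard construction from the lectures, where each tower (rectangle) is built from two shifted sigmoid neurons and a single output neuron sums all the towers, the count is:

```python
# Each tower needs 2 sigmoid neurons; one output neuron sums the towers.
towers = 150
neurons = 2 * towers + 1
print(neurons)  # 301
```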
Q8. We have a classification problem with labels 0 and 1. We train a logistic model and find that the ω₀ learned by our model is −17. We are to predict the label of a new test point x using this trained model. If ωᵀx = 1, which of the following statements is true?
We cannot make any prediction as the value of ωᵀx does not make sense
The label of the test point is 0.
The label of the test point is 1.
We cannot make any prediction as we do not know the value of x.
Answer: The label of the test point is 0.
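Assuming the prediction is σ(ωᵀx + ω₀) thresholded at 0.5, σ(1 − 17) = σ(−16) is essentially 0, hence label 0. Sketch:

```python
import math

w0 = -17.0
wTx = 1.0
p = 1 / (1 + math.exp(-(wTx + w0)))  # sigmoid(-16), essentially 0
label = 1 if p >= 0.5 else 0
print(label)  # 0
```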
Q9. Suppose we have a function f(x₁, x₂) = x₁² + 3x₂ + 25 which we want to minimize using the gradient descent algorithm. We initialize (x₁, x₂) = (0, 0). What will be the value of x₁ after ten updates in the gradient descent process? (Let η be 1.)
0
−3
−4.5
−3
Answer: 0
Q10. What is the purpose of the gradient descent algorithm in machine learning?
To minimize the loss function
To maximize the loss function
To minimize the output function
To maximize the output function
Answer: Updating Soon (in progress)
Check here all Deep Learning IIT Ropar Nptel Assignment Answers : Click here
For answers to additional Nptel courses, please refer to this link: NPTEL Assignment Answers