NPTEL Deep Learning Week 8 Assignment 8 Answers

**Q1. Which of the following functions can be used as an activation function in the output layer if we wish to predict the probabilities of n classes such that the sum of p over all n equals 1?**

a. Softmax

b. ReLU

c. Sigmoid

d. Tanh

**Answer: a. Softmax**
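
As a quick check of why softmax fits here, the sketch below computes it for a list of example scores. The function name `softmax` and the sample logits are illustrative assumptions, not part of the assignment.

```python
import math

def softmax(logits):
    """Return softmax probabilities for a list of raw scores."""
    # Subtracting the maximum keeps exp() from overflowing and
    # does not change the resulting probabilities.
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for 3 classes
print(probs)
print(sum(probs))  # sums to 1 up to floating-point rounding
```

Sigmoid, tanh and ReLU are applied element-wise, so nothing forces their outputs to sum to 1 across the n classes.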

**Q2. The input image has been converted into a matrix of size 256×256 and is convolved with a kernel/filter of size 5×5 with a stride of 1 and no padding. What will be the size of the convolved matrix?**

a. 252×252

b. 3×3

c. 254×254

d. 256×256

**Answer: a. 252×252**
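
The size follows from the standard output-size formula for a convolution, floor((N − F + 2P) / S) + 1. A minimal sketch, with `conv_output_size` as a hypothetical helper name:

```python
def conv_output_size(n, f, stride=1, padding=0):
    """Spatial output size for an n x n input and an f x f kernel."""
    return (n - f + 2 * padding) // stride + 1

# 256 x 256 input, 5 x 5 kernel, stride 1, no padding
size = conv_output_size(256, 5, stride=1, padding=0)
print(f"{size}x{size}")
```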

**Q3. What will be the range of output if we apply ReLU non-linearity and then Sigmoid Nonlinearity subsequently after a convolution layer?**

a. [1,1]

b. [0,1]

c. [0.5,1]

d. [1,-0.5]

**Answer: c. [0.5,1]**
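
The range can be checked numerically: ReLU maps every input to a value of at least 0, and the sigmoid of a non-negative value is at least 0.5 and approaches 1. The sketch below uses a handful of arbitrary sample inputs chosen for illustration.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Arbitrary sample pre-activations, from very negative to very positive
samples = [-100.0, -1.0, 0.0, 1.0, 100.0]
outputs = [sigmoid(relu(x)) for x in samples]
print(min(outputs), max(outputs))
```

Every negative input is clipped to 0 by ReLU, so it lands at sigmoid(0) = 0.5, the lower end of the range.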

**Q4. The figure below shows an image of a face which is input to a convolutional neural net, and the other three images show different levels of features extracted from the network. Can you identify from the following options which one is correct?**

a. Label 3: Low-level features, Label 2: High-level features, Label 1: Mid-level features

b. Label 1: Low-level features, Label 3: High-level features, Label 2: Mid-level features

c. Label 2: Low-level features, Label 1: High-level features, Label 3: Mid-level features

d. Label 3: Low-level features, Label 1: High-level features, Label 2: Mid-level features

**Answer: a. Label 3: Low-level features, Label 2: High-level features, Label 1: Mid-level features**

**Q5. Suppose you have 8 convolutional kernels of size 5 x 5 with no padding and stride 1 in the first layer of a convolutional neural network. You pass an input of dimension 228 x 228 x 3 through this layer. What are the dimensions of the data which the next layer will receive?**

a. 224x224x3

b. 224x224x8

c. 226x226x8

d. 225x225x3

**Answer: b. 224x224x8**
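
A worked version of the arithmetic, using the usual symbols W (input width), F (kernel size), P (padding) and S (stride), which are notation introduced here rather than in the question:

$$
W_{\text{out}} = \left\lfloor \frac{W - F + 2P}{S} \right\rfloor + 1 = \frac{228 - 5 + 0}{1} + 1 = 224
$$

Each kernel spans all 3 input channels and produces one output channel, so the output depth equals the number of kernels, 8, giving 224 x 224 x 8.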

**Q6. What is the mathematical form of the Leaky ReLU layer?**

a. f(x)=max(0,x)

b. f(x)=min(0,x)

c. f(x)=min(0, ax), where a is a small constant

d. f(x)=1(x<0)(ax)+1(x>=0)(x), where a is a small constant

**Answer: d. f(x)=1(x<0)(ax)+1(x>=0)(x), where a is a small constant**
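
Leaky ReLU passes non-negative inputs through unchanged and scales negative inputs by a small constant a. A minimal sketch, where the default `a=0.01` is a commonly used value assumed here for illustration:

```python
def leaky_relu(x, a=0.01):
    """Leaky ReLU: x for x >= 0, a * x for x < 0."""
    return x if x >= 0 else a * x

for x in (-2.0, 0.0, 3.0):
    print(x, leaky_relu(x))
```

By contrast, min(0, ax) with a small positive a would output 0 for every positive input, which discards exactly the part of the signal that ReLU-style activations keep.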

**Q7. The input image has been converted into a matrix of size 224 x 224 and convolved with a kernel/filter of size FxF with a stride of s and padding P to produce a feature map of dimension 222×222. Which among the following is true?**

a. F=3×3,s=1,P=1

b. F=3×3,s=0, P=1

c. F=3×3,s=1,P=0

d. F=2×2,s=0, P=0

**Answer: c. F=3×3,s=1,P=0**
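
Each option can be checked against the output-size formula floor((N − F + 2P) / S) + 1. The sketch below does this for a 224×224 input; the helper name and the choice to treat a stride of 0 as invalid are assumptions made for illustration.

```python
def conv_output_size(n, f, stride, padding):
    """Spatial output size, or None when the stride is not a positive integer."""
    if stride < 1:
        return None  # a stride of 0 never moves the kernel, so it is invalid
    return (n - f + 2 * padding) // stride + 1

# (F, s, P) for each option in the question
options = {"a": (3, 1, 1), "b": (3, 0, 1), "c": (3, 1, 0), "d": (2, 0, 0)}

results = {label: conv_output_size(224, f, s, p)
           for label, (f, s, p) in options.items()}
matching = [label for label, size in results.items() if size == 222]

print(results)
print(matching)
```

With F=3, s=1, P=1 the output stays at 224×224, which is why padding of 1 is often paired with 3×3 kernels to preserve size.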

**Q8. Statement 1: For a transfer learning task, lower layers are more generally transferred to another task. Statement 2: For a transfer learning task, last few layers are more generally transferred to another task. Which of the following options is correct?**

a. Statement 1 is correct and Statement 2 is incorrect

b. Statement 1 is incorrect and Statement 2 is correct

c. Both Statement 1 and Statement 2 are correct

d. Both Statement 1 and Statement 2 are incorrect

**Answer: a. Statement 1 is correct and Statement 2 is incorrect**

**Q9. Statement 1: Adding more hidden layers will solve the vanishing gradient problem for a 2-layer neural network. Statement 2: Making the network deeper will increase the chance of vanishing gradients.**

a. Statement 1 is correct

b. Statement 2 is correct

c. Neither Statement 1 nor Statement 2 is correct

d. Vanishing gradient problem is independent of number of hidden layers of the neural network.

**Answer: b. Statement 2 is correct**
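
A rough illustration of how depth affects gradient size, assuming sigmoid activations and ignoring the weights (both simplifications made here, not stated in the question): the sigmoid derivative is at most 0.25, so backpropagating through n such layers multiplies the gradient by at most 0.25 per layer.

```python
def max_sigmoid_gradient_factor(num_layers):
    """Upper bound on the product of sigmoid derivatives across num_layers layers."""
    max_derivative = 0.25  # sigmoid'(x) peaks at x = 0 with value 0.25
    return max_derivative ** num_layers

for depth in (2, 5, 10):
    print(depth, max_sigmoid_gradient_factor(depth))
```

The bound shrinks geometrically as layers are added, so extra depth makes vanishing gradients more likely rather than less.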

**Q10. How many convolution layers are there in a LeNet-5 architecture?**

a. 2

b. 3

c. 4

d. 5

**Answer: a. 2**

