# An Introduction to Artificial Intelligence Week 12

Course Name: An Introduction to Artificial Intelligence

#### Q1. Consider the following perceptron network. The number on each edge is the weight of the corresponding input, and the number beside each node is the threshold for that node. x1, x2 and x3 are Boolean variables. What is the representation of y in terms of the inputs?

a. (x1 ∨ x2 ∨ x3)
b. (x1 -> ¬x2) -> x3
c. (x1 -> x2) -> x3
d. (x1 ∧ x2 ∧ x3)

Answer: b. (x1 -> ¬x2) -> x3
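The stated answer can be rewritten in a simpler form. The sketch below enumerates the truth table of option b and checks it against `(x1 ∧ x2) ∨ x3`, the form obtained by expanding each implication as `¬p ∨ q`. The helper names `implies` and `option_b` are introduced here for illustration only, and the check says nothing about the network's weights, since the figure is not reproduced in the text.

```python
from itertools import product


def implies(p, q):
    """Material implication: p -> q is (not p) or q."""
    return (not p) or q


def option_b(x1, x2, x3):
    """(x1 -> not x2) -> x3, as written in option b."""
    return implies(implies(x1, not x2), x3)


# Print the full truth table next to the simplified form (x1 and x2) or x3.
for x1, x2, x3 in product([False, True], repeat=3):
    simplified = (x1 and x2) or x3
    print(int(x1), int(x2), int(x3), int(option_b(x1, x2, x3)), int(simplified))
```

The last two columns agree on all eight rows, so y is true exactly when x3 is true or both x1 and x2 are true.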

Q2. Which of the following are correct regarding the various non-linear activation functions?
a. For sigmoid activation, strongest signal will flow to the weights of a neuron when the activation is near 0 or 1.
b. Tanh activation is a scaled transformation of the sigmoid activation.
c. ReLU activation can output a larger range of values than sigmoid activation.
d. ReLU activation can accelerate the convergence of Stochastic Gradient Descent.
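Some of these statements can be probed numerically. The following is a minimal sketch using only the standard library: it checks the identity tanh(z) = 2·sigmoid(2z) − 1, prints the sigmoid gradient at a few points to show how small it becomes once the activation saturates near 0 or 1, and shows that ReLU outputs are unbounded above. The function names are illustrative choices, not taken from the course material.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid, written in terms of its output."""
    s = sigmoid(z)
    return s * (1.0 - s)


def relu(z):
    return max(0.0, z)


for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    scaled_sigmoid = 2.0 * sigmoid(2.0 * z) - 1.0  # should match tanh(z)
    print(
        f"z={z:6.1f}  sigmoid={sigmoid(z):.5f}  grad={sigmoid_grad(z):.5f}  "
        f"tanh={math.tanh(z):.5f}  2*sigmoid(2z)-1={scaled_sigmoid:.5f}  relu={relu(z):.1f}"
    )
```

The gradient column peaks at z = 0, where the sigmoid output is 0.5, and shrinks toward zero at both extremes.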

These are An Introduction to Artificial Intelligence Answers Week 12

Q3. Assume that an input image of size 64 x 64 is passed through a convolution filter of size 6 x 6 and stride 2 followed by a max pooling filter of size 2 x 2 and stride 2. If the size of the resulting image is d x d, what is the value of d?
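The arithmetic can be sketched with the usual output-size formula, floor((n + 2p − k) / s) + 1. The question does not mention padding, so the sketch assumes zero padding for both the convolution and the pooling step; `output_size` is a hypothetical helper name.

```python
def output_size(n, kernel, stride, padding=0):
    """Spatial size after a convolution or pooling window (padding assumed 0 by default)."""
    return (n + 2 * padding - kernel) // stride + 1


after_conv = output_size(64, kernel=6, stride=2)          # 6 x 6 convolution, stride 2
after_pool = output_size(after_conv, kernel=2, stride=2)  # 2 x 2 max pooling, stride 2
print(after_conv, after_pool)
```

Under the no-padding assumption this gives 30 after the convolution and d = 15 after pooling.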

Q4. Which of the following statements are correct?
a. As long as a neural network has at least one hidden layer and sufficient number of parameters, it can represent any continuous function to arbitrary accuracy.
b. If we are given a fixed number of parameters, the optimal network in terms of learnability will have one neuron per hidden layer and |number of parameters| hidden layers.
c. Fat and short networks make better use of the compositionality of the classification task as compared to tall and thin networks.
d. Fat and short networks may struggle more when some of the desired output classes are under-represented in the training data.

Q5. Which of the following are correct regarding some of the issues faced by AI systems?
a. The robustness issue discussed in class characterises the tendency of the system to give completely incorrect outputs when tiny adversarial perturbations are applied to the weights of the network
b. The transparency issue discussed in class characterises the opaqueness of the system regarding why a particular network configuration fails or succeeds at a particular task
c. The bias issue discussed in class characterises the tendency of the system to amplify racial or gender bias present in the training data
d. The privacy issue discussed in class characterises the ability of a system to infer private information from the training data provided to it.

Q6. Consider one layer of a neural network with the weight and bias matrix as

Suppose the activation function used is ReLU. What will be the output for the input x?

a.

b.

c.

d.
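The weight matrix, bias, input and answer options for this question are not reproduced in the text, so the numbers below are purely hypothetical and are not the values from the question. The sketch only illustrates the computation being asked for: form z = Wx + b, then apply ReLU elementwise.

```python
def relu_layer(W, b, x):
    """Compute ReLU(Wx + b) for a weight matrix W (list of rows), bias b and input x."""
    z = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return [max(0.0, zi) for zi in z]


# Hypothetical values, chosen only for illustration.
W = [[1.0, -2.0],
     [3.0, 0.5]]
b = [0.5, -1.0]
x = [1.0, 2.0]

print(relu_layer(W, b, x))
```

Here z = [-2.5, 3.0], so the negative first component is clipped to 0 and the output is [0.0, 3.0].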

Q7. Which of the following are correct for CNNs, the popular deep learning architectures for image recognition?
a. Convolutional neural networks consist of hand-designed convolutional filters along with fully connected neural networks layers.
b. The output of CNNs is invariant to minor translational shifts in input.
c. The first major success of deep learning in computer vision was based upon the performance of CNNs on the ImageNet challenge.
d. The layer before the softmax/sigmoid in CNNs for image recognition is usually a fully connected neural network layer.

Q8. Which of the following is the reason for employing replay buffers in the training of deep Q networks?
a. Sparsity in rewards during the episode.
b. Violation of the i.i.d. assumption between consecutive samples of an episode.
c. Extremely large state space.
d. Extremely large action space.

Answer: b. Violation of the i.i.d. assumption between consecutive samples of an episode.
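Consecutive transitions within an episode are strongly correlated, and a replay buffer breaks that correlation by storing transitions and drawing training batches from them uniformly at random. The following is a minimal sketch of the idea, not the implementation from any particular DQN library; the class name, the transition layout and the capacity are all assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sampling without replacement from the whole buffer decorrelates the batch.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Toy usage: integer "states" stand in for real observations.
buffer = ReplayBuffer(capacity=5)
for t in range(8):
    buffer.add(state=t, action=0, reward=0.0, next_state=t + 1, done=False)

batch = buffer.sample(batch_size=3)
print(len(buffer), batch)
```

Each sampled batch mixes transitions from different time steps, which is closer to the i.i.d. setting that stochastic gradient descent assumes.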

Q9. If g(z) is the sigmoid function, which of the following is the correct expression for its derivative with respect to z?
a. g(z)
b. 1-g(z)
c. g(z) * (1 - g(z))
d. g(z) / (1 - g(z))

Answer: c. g(z) * (1 - g(z))
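The identity can be sanity-checked numerically. The sketch below compares g(z) * (1 - g(z)) with a central finite-difference estimate of the derivative at a few points; the step size `h` and the test points are arbitrary choices.

```python
import math


def g(z):
    """Sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))


def g_prime(z):
    """Closed-form derivative from option c."""
    return g(z) * (1.0 - g(z))


def central_difference(f, z, h=1e-5):
    """Numerical derivative estimate: (f(z + h) - f(z - h)) / (2h)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


for z in (-2.0, 0.0, 1.5):
    print(z, g_prime(z), central_difference(g, z))
```

The two columns agree to many decimal places at each point, which is consistent with option c.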

Q10. Consider a neural network that performs the following operations:
z = Wx + b,
y = tanh(z),
J = Σ_i y_i^2

number of training examples. For this problem, assume that there is only one training example.
Which of the following represents the derivative ∂J/∂b?

a. y^2
b. 2*(1-tanh^2(z))
c. 2tanh(z)*(1-tanh^2(z))
d. tanh(z)*(1-tanh^2(z))
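A short chain-rule derivation makes the options easier to compare. It assumes a single training example, that tanh is applied elementwise, and that J is the sum of the squared components of y as written in the question; the index i over components is notation introduced here.

```latex
\begin{align*}
z_i &= (Wx)_i + b_i, \qquad y_i = \tanh(z_i), \qquad J = \sum_i y_i^2 \\
\frac{\partial J}{\partial b_i}
  &= \frac{\partial J}{\partial y_i}\,
     \frac{\partial y_i}{\partial z_i}\,
     \frac{\partial z_i}{\partial b_i} \\
  &= 2 y_i \cdot \bigl(1 - \tanh^2(z_i)\bigr) \cdot 1 \\
  &= 2 \tanh(z_i)\,\bigl(1 - \tanh^2(z_i)\bigr)
\end{align*}
```

Under these assumptions the result has the form of option c.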