# Introduction to Machine Learning | Week 6

Session: JAN-APR 2024

Course name: Introduction to Machine Learning

#### Q1. From the given dataset, choose the optimal decision tree learned by a greedy approach:
a) [tree not shown]
b) [tree not shown]
c) [tree not shown]
d) None of the above.

Q2. Which of the following properties are characteristic of decision trees?
High bias
High variance
Lack of smoothness of prediction surfaces
Unbounded parameter set

These are Introduction to Machine Learning Week 6 Assignment 6 Answers

Q3. Entropy for a 50−50 split between two classes is:
0
0.5
1
None of the above
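For Q3, the entropy of a 50-50 split can be checked directly from the definition; a minimal sketch (the function name `binary_entropy` is ours):

```python
import math

def binary_entropy(p):
    """Entropy (in bits) of a two-class distribution with class probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

print(binary_entropy(0.5))  # a 50-50 split gives entropy 1.0, the maximum for two classes
```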

Q4. Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, on the left branch, there are 3 training data points with the following feature values: 5, 7, 9.6 and for the right branch, there are four training data points with the following feature values: 8.7, 9.8, 10.5, 11. What were the original responses for data points along the two branches (left & right respectively) and what is the new response after collapsing the node?
10.8,13.33,14.48
10.8,13.33,12.06
7.2,10,8.8
7.2,10,8.6
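The arithmetic behind Q4 is just branch-wise averaging of the training responses; a quick sketch using the numbers from the question:

```python
# Training responses along each branch of the node selected for collapsing
left = [5, 7, 9.6]
right = [8.7, 9.8, 10.5, 11]

response_left = sum(left) / len(left)                        # mean of the left branch
response_right = sum(right) / len(right)                     # mean of the right branch
response_new = sum(left + right) / (len(left) + len(right))  # mean after collapsing

print(round(response_left, 2), round(response_right, 2), round(response_new, 2))
# 7.2 10.0 8.8
```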

Q5. Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees?
Yes
No

Q6. Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed?
The collapsed node helped overcome the effect of one or more noise affected data points in the training set
The validation set had one or more noise affected data points in the region corresponding to the collapsed node
The validation set did not have any data points along at least one of the collapsed branches
The validation set did not contain data points which were adversely affected by the collapsed node.

Q7. Consider the following data set:
Considering ‘profitable’ as the binary-valued attribute we are trying to predict, which of the attributes would you select as the root of a decision tree with multi-way splits using the cross-entropy impurity measure?

price
maintenance
capacity
airbag

Q8. For the same data set, suppose we decide to construct a decision tree using binary splits and the Gini index impurity measure. Which among the following feature and split point combinations would be the best to use as the root node assuming that we consider each of the input features to be unordered?
price – {low, med}|{high}
maintenance – {high}|{med, low}
maintenance – {high, med}|{low}
capacity – {2}|{4, 5}

Session: JULY-DEC 2023

Course Name: Introduction to Machine Learning

Q1. Which of the following is/are major advantages of decision trees over other supervised learning techniques? (Note that more than one choice may be correct)
Theoretical guarantees of performance
Higher performance
Interpretability of classifier
More powerful in its ability to represent complex functions

Q2. Increasing the pruning strength in a decision tree by reducing the maximum depth:
Will always result in improved validation accuracy.
Might lead to underfitting if set too aggressively.
Will have no impact on the tree’s performance.
Will eliminate the need for validation data.

Q3. Consider the following statements:
Statement 1: Decision Trees are linear non-parametric models.
Statement 2: A decision tree may be used to explain the complex function learned by a neural network.

Both the statements are True.
Statement 1 is True, but Statement 2 is False.
Statement 1 is False, but Statement 2 is True.
Both the statements are False.

Answer: Statement 1 is False, but Statement 2 is True.

Q4. Consider the following dataset:
What is the initial entropy of Malignant?

0.543
0.9798
0.8732
1

Q5. For the same dataset, what is the info gain of Vaccination?
0.4763
0.2102
0.1134
0.9355

Q6. Which of the following machine learning models can solve the XOR problem without any transformations on the input space?
Linear Perceptron
Neural Networks
Decision Trees
Logistic Regression
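For Q6, a decision tree handles XOR with two levels of axis-parallel splits and no transformation of the input space; a hand-built sketch (the tree itself, not a learned one):

```python
def xor_tree(x1, x2):
    # Root split on x1, then a split on x2 in each branch --
    # exactly the kind of axis-parallel partitioning a decision tree learns.
    if x1 == 0:
        return 1 if x2 == 1 else 0
    else:
        return 0 if x2 == 1 else 1

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), xor_tree(x1, x2))  # reproduces XOR: 0, 1, 1, 0
```

A linear perceptron or logistic regression cannot separate these four points with a single hyperplane, which is why they fail without transforming the inputs.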

Q7. Statement: Decision Tree is an unsupervised learning algorithm.
Reason: The splitting criteria use only the features of the data to calculate their respective measures.

Statement is True. Reason is True.
Statement is True. Reason is False.
Statement is False. Reason is True.
Statement is False. Reason is False.

Answer: Statement is False. Reason is False.

Q8. ____ is a measurement of likelihood of an incorrect classification of a new instance for a random variable, if the new instance is randomly classified as per the distribution of class labels from the data set.
Gini impurity.
Entropy.
Information gain.
None of the above.
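The quantity described in Q8 is exactly what Gini impurity measures; it can be computed from the empirical class distribution, as in this minimal sketch:

```python
from collections import Counter

def gini_impurity(labels):
    """Probability of misclassifying a random instance when it is labelled
    at random according to the empirical class distribution."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini_impurity(["pos", "pos", "neg", "neg"]))  # 0.5 for a 50-50 split
```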

Q9. What is a common indicator of overfitting in a decision tree?
The training accuracy is high while the validation accuracy is low.
The tree is shallow.
The tree has only a few leaf nodes.
The tree’s depth matches the number of attributes in the dataset.
The tree’s predictions are consistently biased.

Answer: The training accuracy is high while the validation accuracy is low.

Q10. Consider a dataset with only one attribute (categorical). Suppose there are 10 unordered values in this attribute. How many possible combinations need to be considered to find the best split-point for building the decision tree classifier? (considering only binary splits)
10
511
1023
512
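For Q10, each binary split of an unordered attribute corresponds to a nonempty proper subset of its values, counted once because a subset and its complement define the same split:

```python
def num_binary_splits(q):
    # 2**q subsets, minus the empty set and the full set, halved because
    # a subset and its complement define the same binary split
    return (2 ** q - 2) // 2  # equals 2**(q - 1) - 1

print(num_binary_splits(10))  # 511
```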

Session: JAN-APR 2023

Course Name: Introduction to Machine Learning

#### Q1. When building models using decision trees, we essentially split the entire input space using
a. axis-parallel hyper-rectangles
b. polynomial curves of order greater than two
c. polynomial curves of the same order as the length of the decision tree
d. none of the above

Q2. In building a decision tree model, to control the size of the tree, we need to control the number of regions. One approach to do this would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true?
a. it would, in general, help restrict the size of the trees
b. it has the potential to affect the performance of the resultant regression/classification model
c. it is computationally infeasible

Q3. Suppose we use the decision tree model for solving a multi-class classification problem. As we continue building the tree, w.r.t. the generalisation error of the model,
a. the error due to bias increases
b. the error due to bias decreases
c. the error due to variance increases
d. the error due to variance decreases

Q4. Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, on the left branch, there are 3 training data points with the following outputs: 5, 7, 9.6 and for the right branch, there are four training data points with the following outputs: 8.7, 9.8, 10.5, 11. The average value of the outputs of data points denotes the response of a branch.

The original responses for data points along the two branches (left & right respectively) were response_left and response_right, and the new response after collapsing the node is response_new. What are the values for response_left, response_right and response_new (numbers in the options are given in the same order)?
a. 21.6, 40, 61.6
b. 7.2, 10, 8.8
c. 3, 4, 7
d. depends on the tree height.

Q5. Consider the following dataset:

Which among the following split-points for feature1 would give the best split according to the information gain measure?
a. 14.6
b. 16.05
c. 16.85
d. 17.35

Q6. For the same dataset, which among the following split-points for feature2 would give the best split according to the Gini index measure?
a. 172.6
b. 176.35
c. 178.45
d. 185.4

Q7. In which of the following situations is it appropriate to introduce a new category ‘Missing’ for missing values? (multiple options may be correct)
a. When values are missing because the 108 emergency operator is sometimes attending a very urgent distress call.
b. When values are missing because the attendant spilled coffee on the papers from which the data was extracted.
c. When values are missing because the warehouse storing the paper records went up in flames and burnt parts of it.
d. When values are missing because the nurse/doctor finds the patient’s situation too urgent.

Session: JUL-DEC 2022

Course Name: INTRODUCTION TO MACHINE LEARNING

Q1. Which of the following properties are characteristic of decision trees?
a. Low bias
b. High variance
c. Lack of smoothness of prediction surfaces
d. Unbounded parameter set

Q2. Consider the following dataset:
What is the initial entropy of Malignant?

a. 0.543
b. 0.9798
c. 0.8732
d. 1

Q3. For the same dataset, what is the info gain of Vaccination?
a. 0.4763
b. 0.2102
c. 0.1134
d. 0.9355

Q4. Consider the following statements:
Statement 1: Decision Trees are linear non-parametric models.
Statement 2: A decision tree may be used to explain the complex function learned by a neural network.
a. Both the statements are True.
b. Statement 1 is True, but Statement 2 is False.
c. Statement 1 is False, but Statement 2 is True.
d. Both the statements are False.

Answer: c. Statement 1 is False, but Statement 2 is True.

Q5. Which of the following machine learning models can solve the XOR problem without any transformations on the input space?
a. Linear Perceptron
b. Neural Networks
c. Decision Trees
d. Logistic Regression

Q6. Which of the following is/are major advantages of decision trees over other supervised learning techniques? (Note that more than one choice may be correct)
a. Theoretical guarantees of performance
b. Higher performance
c. Interpretability of classifier
d. More powerful in its ability to represent complex functions

Q7. Consider a dataset with only one attribute (categorical). Suppose there are q unordered values in this attribute. How many possible combinations need to be considered to find the best split-point for building the decision tree classifier?
a. q
b. q^2
c. 2^(q-1)
d. 2^(q-1) − 1