# Introduction to Machine Learning NPTEL Week 8 Answers

This page collects answers to the Week 8 assignment of the NPTEL Introduction to Machine Learning course for the July-Dec 2024, Jan-Apr 2024, and July-Dec 2023 sessions.

**Introduction to Machine Learning NPTEL Week 8 Answers (July-Dec 2024)**

**In Bagging technique, the reduction of variance is maximum if:**

A) The correlation between the classifiers is minimum

B) Does not depend on the correlation between the classifiers

C) Similar features are used in all classifiers

D) The number of classifiers in the ensemble is minimized

**Answer:** A) The correlation between the classifiers is minimum
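One way to see why low correlation helps: for n identically distributed predictors with variance σ² and pairwise correlation ρ, the variance of their average is ρσ² + (1 − ρ)σ²/n. The sketch below only evaluates this standard formula; the function name and the sample values are illustrative assumptions, not part of the assignment.

```python
def ensemble_variance(sigma2: float, rho: float, n: int) -> float:
    """Variance of the average of n identically distributed predictors
    with common variance sigma2 and pairwise correlation rho."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / n


# Illustrative values: 10 predictors, each with unit variance.
for rho in (0.0, 0.5, 1.0):
    print(rho, ensemble_variance(1.0, rho, 10))
```

The first term does not shrink as n grows, so the averaged variance is smallest when ρ is smallest.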

**If using squared error loss in gradient boosting for a regression problem, what does the gradient correspond to?**

A) The absolute error

B) The log-likelihood

C) The residual error

D) The exponential loss

**Answer:** C) The residual error
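A quick numerical check of this answer, assuming the common convention L(y, f) = ½(y − f)² (without the ½ the gradient is twice the residual). The function names and the sample numbers are hypothetical.

```python
def squared_error_loss(y: float, f: float) -> float:
    """Half squared error, L(y, f) = 0.5 * (y - f)**2."""
    return 0.5 * (y - f) ** 2


def negative_gradient(y: float, f: float, eps: float = 1e-6) -> float:
    """Central-difference estimate of -dL/df at the current prediction f."""
    return -(squared_error_loss(y, f + eps) - squared_error_loss(y, f - eps)) / (2 * eps)


y_true, y_pred = 3.0, 2.2  # hypothetical target and current prediction
residual = y_true - y_pred
print(residual, negative_gradient(y_true, y_pred))
```

Both printed values agree up to floating-point error, which is why each boosting stage with squared error loss fits the residuals of the current model.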

**In a random forest, if T (number of features considered at each split) is set equal to P (total number of features), how does this compare to standard bagging with decision trees?**

A) It’s exactly the same as standard bagging

B) It will always perform better than standard bagging

C) It will always perform worse than standard bagging

D) Cannot be determined

**Answer:** A) It’s exactly the same as standard bagging
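The reasoning can be illustrated with a small sketch of the per-split feature sampling step. It assumes features are drawn without replacement at each split; the feature names and the helper function are hypothetical.

```python
import random


def candidate_features(all_features: list[str], t: int, rng: random.Random) -> set[str]:
    """Features considered at one split: t drawn without replacement."""
    return set(rng.sample(all_features, t))


features = ["f1", "f2", "f3", "f4"]  # hypothetical feature names
rng = random.Random(0)
# With t == P every split sees every feature, as in bagged decision trees.
full = candidate_features(features, len(features), rng)
subset = candidate_features(features, 2, rng)
print(full == set(features), len(subset))
```

When T equals P the sampling step has no effect, so the only remaining source of randomness is the bootstrap sample, which is exactly bagging.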

**Multiple Correct: Consider the following graphical model. Which of the following are true about the model? (Multiple options may be correct.)**

A) d is independent of b when c is known

B) a is independent of c when e is known

C) a is independent of b when e is known

D) a is independent of b when c is known

**Answer:** A) d is independent of b when c is known; D) a is independent of b when c is known

**Consider the Bayesian network given in the previous question. Let “a”, “b”, “c”, “d” and “e” denote the random variables shown in the network. Which of the following can be inferred from the network structure?**

A) “a” causes “d”

B) “e” causes “d”

C) Both (a) and (b) are correct

D) None of the above

**Answer:** A) “a” causes “d”

**A single box is randomly selected from a set of three. Two pens are then drawn from this container. These pens happen to be blue and green colored. What is the probability that the chosen box was Box A?**

A) 37/18

B) 15/56

C) 18/37

D) 56/15

**Answer:** C) 18/37

**State True or False: The primary advantage of the tournament approach in multiclass classification is its effectiveness even when using weak classifiers.**

A) True

B) False

**Answer:** A) True

**A data scientist is using a Naive Bayes classifier to categorize emails as either “spam” or “not spam”. The features used for classification include:**

- Number of recipients (To, Cc, Bcc)
- Presence of “spam” keywords (e.g., “URGENT”, “offer”, “free”)
- Time of day the email was sent
- Length of the email in words

**Which of the following scenarios, if true, is most likely to violate the key assumptions of Naive Bayes and potentially impact its performance?**

A) The length of the email follows a non-Gaussian distribution

B) The time of day is discretized into categories (morning, afternoon, evening, night)

C) The proportion of spam emails in the training data is lower than in real-world email traffic

D) There’s a strong correlation between the presence of the word “free” and the length of the email

**Answer:** D) There’s a strong correlation between the presence of the word “free” and the length of the email
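For reference, the assumption at stake is class-conditional independence of the features. Writing the four features as x_1, …, x_4 and the class as y (notation assumed here, not taken from the assignment), Naive Bayes factorizes the likelihood as:

```latex
P(x_1, x_2, x_3, x_4 \mid y) = \prod_{i=1}^{4} P(x_i \mid y)
```

A strong within-class correlation between two features breaks this factorization, whereas the shape of a single feature's distribution or a mismatch in class proportions does not touch the independence assumption itself.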

**Consider the two statements:**

Statement 1: Bayesian Networks are inherently structured as Directed Acyclic Graphs (DAGs).

Statement 2: Each node in a Bayesian network represents a random variable, and each edge represents conditional dependence.

**Which of these are true?**

A) Both the statements are True.

B) Statement 1 is true, and statement 2 is false.

C) Statement 1 is false, and statement 2 is true.

D) Both the statements are false.

**Answer:** A) Both the statements are True.

**Introduction to Machine Learning NPTEL Week 8 Answers (JAN-APR 2024)**

**Course name: Introduction to Machine Learning**

**Q1. Consider the Bayesian network given below. Which of the following statement(s) is/are correct?**

a) B is independent of F, given D.

b) A is independent of E, given C.

c) E and F are not independent, given D.

d) A and B are not independent, given D.

**Answer: a), d)**

**Q2. Select the correct statement(s) from the ones given below.**

a) Naive Bayes models are a special case of Bayesian networks.

b) Naive Bayes models are a generalization of Bayesian networks.

c) With no independence among the variables, a Bayesian network representing a distribution over n variables would have n(n−1)/2 edges.

d) With no independence among the variables, a Bayesian network representing a distribution over n variables would have n−1 edges.

**Answer: a), c)**
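The n(n−1)/2 count follows from the chain rule: with no independence, each variable in a fixed ordering has every preceding variable as a parent. A minimal sketch that enumerates those edges (the function name is an illustrative assumption):

```python
from itertools import combinations


def fully_connected_dag_edges(n: int) -> list[tuple[int, int]]:
    """Edges i -> j for every pair i < j under a fixed variable ordering."""
    return list(combinations(range(n), 2))


for n in (2, 3, 5):
    assert len(fully_connected_dag_edges(n)) == n * (n - 1) // 2
print(len(fully_connected_dag_edges(5)))  # 10
```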

**Q3. A decision tree classifier learned from a fixed training set achieves 100% accuracy. Which of the following models trained using the same training set will also achieve 100% accuracy? (Assume P(xi|c) as Gaussians.)**

- I. Logistic Regressor.
- II. A polynomial of degree one kernel SVM.
- III. A linear discriminant function.
- IV. Naive Bayes classifier.

I

I and II

IV

III

None of the above.

**Answer: None of the above.**

**Q4. Which of the following points would Bayesians and frequentists disagree on?**

a) The use of a non-Gaussian noise model in probabilistic regression.

b) The use of probabilistic modelling for regression.

c) The use of prior distributions on the parameters in a probabilistic model.

d) The use of class priors in Gaussian Discriminant Analysis.

e) The idea of assuming a probability distribution over models.

**Answer: c), e)**

**Q5. Consider the following data for 500 instances of home, 600 instances of office and 700 instances of factory type buildings. Suppose a building has a balcony and power-backup but is not multi-storied. According to the Naive Bayes algorithm, it is of type**

Home

Office

Factory

**Answer: Factory**

**Q6. In AdaBoost, we re-weight points giving points misclassified in previous iterations more weight. Suppose we introduced a limit or cap on the weight that any point can take (for example, say we introduce a restriction that prevents any point’s weight from exceeding a value of 10). Which among the following would be an effect of such a modification? (Multiple options may be correct)**

a) We may observe the performance of the classifier reduce as the number of stages increases

b) It makes the final classifier robust to outliers

c) It may result in lower overall performance

d) It will make the problem computationally infeasible

**Answer: b), c)**
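The effect of the cap can be sketched with a simplified re-weighting step. This is not the full AdaBoost algorithm: it assumes unnormalized weights starting at 1, a fixed α per stage, and a single point that is misclassified at every stage; all names and numbers are hypothetical.

```python
import math


def update_weights(weights, misclassified, alpha, cap=None):
    """One AdaBoost-style re-weighting step on unnormalized weights.

    Misclassified points are scaled by exp(alpha); if cap is given,
    no weight is allowed to exceed it (the hypothetical modification).
    """
    updated = []
    for w, wrong in zip(weights, misclassified):
        new_w = w * math.exp(alpha) if wrong else w
        if cap is not None:
            new_w = min(new_w, cap)
        updated.append(new_w)
    return updated


# Hypothetical run: the last point is an outlier that is always misclassified.
weights = [1.0, 1.0, 1.0]
capped = list(weights)
for _ in range(5):
    weights = update_weights(weights, [False, False, True], alpha=1.0)
    capped = update_weights(capped, [False, False, True], alpha=1.0, cap=10.0)
print(weights[-1], capped[-1])
```

Without the cap the outlier's weight keeps growing and can dominate later stages. With the cap it stops at 10, which limits the influence of outliers but also limits how much emphasis genuinely hard points can receive.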

**Q7. While using Random Forests, if the input data is such that it contains a large number (> 80%) of irrelevant features (the target variable is independent of these features), which of the following statements are TRUE?**

a) Random Forests have reduced performance as the fraction of irrelevant features increases.

b) Random Forests have increased performance as the fraction of irrelevant features increases.

c) The fraction of irrelevant features doesn’t impact the performance of Random Forests.

**Answer: a) Random Forests have reduced performance as the fraction of irrelevant features increases.**

**Q8. Suppose you have a 6 class classification problem with one input variable. You decide to use logistic regression to build a predictive model. What is the minimum number of (β0,β) parameter pairs that need to be estimated?**

6

12

5

10

**Answer: 5**
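The count of 5 comes from treating one class as a reference whose logit is fixed at 0, so a K-class model needs only K − 1 (β0, β) pairs. A minimal sketch with one input variable follows; the parameter values are made up for illustration.

```python
import math


def class_probabilities(x, params):
    """Multinomial logistic regression with the last class as reference.

    params holds K - 1 (beta0, beta) pairs; the reference class has logit 0.
    """
    logits = [b0 + b * x for b0, b in params] + [0.0]
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical parameter values for a 6-class problem: only 5 pairs are needed.
params = [(0.1, 0.5), (-0.3, 0.2), (0.0, -0.4), (0.7, 0.1), (-0.2, 0.3)]
probs = class_probabilities(1.5, params)
print(len(probs), sum(probs))
```

Five pairs yield six probabilities that sum to 1, because the sixth class is determined by the other five.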

More Nptel Courses: https://progiez.com/nptel-assignment-answers

**Introduction to Machine Learning NPTEL Week 8 Answers (JULY-DEC 2023)**

**Course Name: Introduction to Machine Learning**

**Q1. The figure below shows a Bayesian Network with 9 variables, all of which are binary. Which of the following is/are always true for the above Bayesian Network?**

P(A,B|G)=P(A|G)P(B|G)

P(A,I)=P(A)P(I)

P(B,H|E,G)=P(B|E,G)P(H|E,G)

P(C|B,F)=P(C|F)

**Answer: P(A,I)=P(A)P(I)**

**Q2. Consider the following data for 20 budget phones, 30 mid-range phones, and 20 high-end phones. Consider a phone with 2 SIM card slots and NFC but no 5G compatibility. Calculate the probabilities of this phone being a budget phone, a mid-range phone, and a high-end phone using the Naive Bayes method. What is the correct ordering of the phone types from the highest to the lowest probability?**

Budget, Mid-Range, High End

Budget, High End, Mid-Range

Mid-Range, High End, Budget

High End, Mid-Range, Budget

**Answer: Mid-Range, High End, Budget**

**Q3. A dataset with two classes is plotted below. Does the data satisfy the Naive Bayes assumption?**

Yes

No

The given data is insufficient

None of these

**Answer: No**

**Q4. A company hires you to look at their classification system for whether a given customer would potentially buy their product. When you check the existing classifier on different folds of the training set, you find that it manages a low accuracy of usually around 60%. Sometimes, it’s barely above 50%. With this information in mind, and without using additional classifiers, which of the following ensemble methods would you use to increase the classification accuracy effectively?**

Committee Machine

AdaBoost

Bagging

Stacking

**Answer: AdaBoost**

**Q5. Which of the following algorithms don’t use learning rate as a hyperparameter?**

A) Random Forests

B) AdaBoost

C) KNN

D) PCA

**Answer: A, C, D**

**Q6. Consider the two statements:**

Statement 1: Bayesian Networks need not always be Directed Acyclic Graphs (DAGs).

Statement 2: Each node in a Bayesian network represents a random variable, and each edge represents conditional dependence.

**Which of these are true?**

Both the statements are True.

Statement 1 is true, and statement 2 is false.

Statement 1 is false, and statement 2 is true.

Both the statements are false.

**Answer: Statement 1 is false, and statement 2 is true.**

**Q7. A dataset with two classes is plotted below. Does the data satisfy the Naive Bayes assumption?**

Yes

No

The given data is insufficient

None of these

**Answer: Yes**

**Q8. Consider the dataset below. Suppose you have to classify a test example “The ball won the race to the boundary” and are asked to compute P(Cricket | “The ball won the race to the boundary”). What is an issue that you will face if you are using a Naive Bayes classifier, and how will you work around it? Assume you are using word frequencies to estimate all the probabilities.**

There won’t be a problem, and the probability of P(Cricket |“The ball won the race to the boundary”) will be equal to 1.

Problem: A few words that appear at test time do not appear in the dataset. Solution: Smoothing.

Problem: A few words that appear at test time appear more than once in the dataset. Solution: Remove those words from the dataset.

None of these

**Answer: Problem: A few words that appear at test time do not appear in the dataset. Solution: Smoothing.**
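A minimal sketch of add-one (Laplace) smoothing shows why it resolves the zero-probability problem. The word counts and vocabulary below are hypothetical and are not the assignment's dataset; the function name is also an assumption.

```python
from collections import Counter


def word_likelihood(word, class_counts, vocab_size, alpha=1.0):
    """P(word | class) with add-alpha (Laplace) smoothing."""
    total = sum(class_counts.values())
    return (class_counts[word] + alpha) / (total + alpha * vocab_size)


# Hypothetical word counts for one class; not the assignment's dataset.
cricket_counts = Counter({"ball": 3, "boundary": 2, "won": 1})
vocab = {"ball", "boundary", "won", "race", "the", "to"}

unsmoothed = cricket_counts["race"] / sum(cricket_counts.values())
smoothed = word_likelihood("race", cricket_counts, len(vocab))
print(unsmoothed, smoothed)
```

Without smoothing, a single unseen word such as “race” drives the whole product of word likelihoods to zero; with smoothing every word keeps a small non-zero probability.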
