Introduction to Machine Learning Nptel Week 7 Answers

Are you looking for Introduction to Machine Learning Week 7 Assignment 7 Answers? You’ve come to the right place! Access the latest and most accurate solutions for your Week 7 assignment in the Introduction to Machine Learning course.

Introduction to Machine Learning Nptel Week 7 Answers (Jan-Apr 2025)

Que. 1
Which of the following statement(s) regarding the evaluation of Machine Learning models is/are true?

a) A model with a lower training loss will perform better on a validation dataset.
b) A model with a higher training accuracy will perform better on a validation dataset.
c) The train and validation datasets can be drawn from different distributions.
d) The train and validation datasets must accurately represent the real distribution of data.

Que. 2
Suppose we have a classification dataset comprising 2 classes, A and B, with 200 and 40 samples respectively. Suppose we use stratified sampling to split the data into train and test sets. Which of the following train-test splits would be appropriate?

a) Train-{A:50samples, B:10samples}, Test-{A:150samples, B:30samples}
b) Train-{A:50samples, B:30samples}, Test-{A:150samples, B:10samples}
c) Train-{A:150samples, B:30samples}, Test-{A:50samples, B:10samples}
d) Train-{A:150samples, B:10samples}, Test-{A:50samples, B:30samples}

Que. 3
Suppose we are performing cross-validation on a multiclass classification dataset with N data points. Which of the following statement(s) is/are correct?

a) In k-fold cross-validation, we train k−1 different models and evaluate them on the same test set.
b) In k-fold cross-validation, we train k different models and evaluate them on different test sets.
c) In k-fold cross-validation, each fold should have a class-wise proportion similar to the given dataset.
d) In LOOCV (Leave-One-Out Cross Validation), we train N different models, using N−1 data points for training each model.

Que. 4
For a binary classification problem, we train classifiers and evaluate them to obtain confusion matrices in the following format:
Which of the following classifiers should be chosen to maximize the recall?

a) [413677]
b) [840260]
c) [59581]
d) [70390]

Que. 5
For the confusion matrices described in Q4, which of the following classifiers should be chosen to minimize the False Positive Rate?

a) [46684]
b) [813277]
c) [12988]
d) [104086]

Que. 6
For the confusion matrices described in Q4, which of the following classifiers should be chosen to maximize the precision?

a) [46684]
b) [813277]
c) [12988]
d) [104086]

Que. 7
For the confusion matrices described in Q4, which of the following classifiers should be chosen to maximize the F1-score?

a) [46684]
b) [83287]
c) [12988]
d) [104086]

Que. 8
Which of the following statement(s) regarding boosting is/are correct?

a) Boosting is an example of an ensemble method.
b) Boosting assigns equal weights to the predictions of all the weak classifiers.
c) Boosting may assign unequal weights to the predictions of all the weak classifiers.
d) The individual classifiers in boosting can be trained in parallel.
e) The individual classifiers in boosting cannot be trained in parallel.

Que. 9
Which of the following statement(s) about bagging is/are correct?

a) Bagging is an example of an ensemble method.
b) The individual classifiers in bagging can be trained in parallel.
c) Training sets are constructed from the original dataset by sampling with replacement.
d) Training sets are constructed from the original dataset by sampling without replacement.
e) Bagging increases the variance of an unstable classifier.

Que. 10
Which of the following statement(s) about ensemble methods is/are correct?

a) Ensemble aggregation methods like bagging aim to reduce overfitting and variance.
b) Committee machines may consist of different types of classifiers.
c) Weak learners are models that perform slightly worse than random guessing.
d) Stacking involves training multiple models and stacking their predictions into new training data.

Introduction to Machine Learning Nptel Week 7 Answers (Jan-Apr 2024)

Course name: Introduction to Machine Learning

These are Introduction to Machine Learning Week 7 Assignment 7 Answers

Q1. Which of the following statement(s) regarding the evaluation of Machine Learning models is/are true?
a) A model with a lower training loss will perform better on a test dataset.
b) The train and test datasets should represent the underlying distribution of the data.
c) To determine the variation in the performance of a learning algorithm, we generally use one training set and one test set.
d) A learning algorithm can learn different parameter values if given different samples from the same distribution.

Answer: b), d)

Q2. Suppose we have a classification dataset comprising 2 classes, A and B, with 100 and 50 samples respectively. Suppose we use stratified sampling to split the data into train and test sets. Which of the following train-test splits would be appropriate?
a) Train- {A:80samples,B:30samples}, Test- {A:20samples,B:20samples}
b) Train- {A:20samples,B:20samples}, Test- {A:80samples,B:30samples}
c) Train- {A:80samples,B:40samples}, Test- {A:20samples,B:10samples}
d) Train- {A:20samples,B:10samples}, Test- {A:80samples,B:40samples}

Answer: c) Train- {A:80samples,B:40samples}, Test- {A:20samples,B:10samples}
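A split is stratified when every class sends the same fraction of its samples to the training set. The sketch below checks the four splits from this question; the helper name is_stratified is hypothetical and not part of the course material.

```python
from fractions import Fraction

def is_stratified(train, test):
    """Return True if every class keeps the same train fraction as the whole dataset."""
    total = {c: train[c] + test[c] for c in train}
    n_train = sum(train.values())
    n_total = sum(total.values())
    # Exact rational comparison avoids floating-point rounding issues.
    return all(Fraction(train[c], total[c]) == Fraction(n_train, n_total) for c in total)

# The four splits from the question, in the order listed.
options = {
    "a": ({"A": 80, "B": 30}, {"A": 20, "B": 20}),
    "b": ({"A": 20, "B": 20}, {"A": 80, "B": 30}),
    "c": ({"A": 80, "B": 40}, {"A": 20, "B": 10}),
    "d": ({"A": 20, "B": 10}, {"A": 80, "B": 40}),
}
stratified = [k for k, (tr, te) in options.items() if is_stratified(tr, te)]
print(stratified)  # ['c', 'd']
```

Both c) and d) preserve the 2:1 class ratio, but d) leaves only 20% of the data for training, so c) is the conventional choice.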

Q3. Suppose we are performing cross-validation on a multiclass classification dataset with N data points. Which of the following statement(s) is/are correct?
a) In k-fold cross-validation, each fold should have a class-wise proportion similar to the given dataset.
b) In k-fold cross-validation, we train one model and evaluate it on the k different test sets.
c) In LOOCV, we train N different models, using (N-1) data points for training each model.
d) In LOOCV, we can use the same test data to evaluate all the trained models.

Answer: a), c)
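The fold structure behind these statements can be sketched in a few lines. This is a minimal illustration with a hypothetical helper, k_fold_indices, that splits indices sequentially without shuffling or stratification; a real pipeline would also keep class proportions similar in each fold.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for each of the k folds over n data points."""
    # Distribute any remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 5))   # 5 models, each with its own test fold
loocv = list(k_fold_indices(10, 10))  # LOOCV is k-fold with k = N

print(len(folds), len(loocv))  # 5 10
print(len(loocv[0][0]))        # 9 training points per LOOCV model
```

Each fold trains a separate model and evaluates it on a different held-out set, which is why LOOCV needs N models trained on N-1 points each.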

Q4. Suppose we have a binary classification problem wherein we need to achieve a high recall. On training four classifiers and evaluating them, we obtain the following confusion matrices. Each matrix has the format indicated below:
Which of these classifiers should we prefer?
a) [4 3 6 87]
b) [8 11 2 79]
c) [5 0 5 90]
d) [2 4 8 86]

Answer: b) [8 11 2 79]
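The matrix layout figure is missing from this page. The sketch below assumes each matrix is listed as [TP FP FN TN], which is an assumption, although it is consistent with the stated answer. Recall is TP / (TP + FN).

```python
def recall(tp, fp, fn, tn):
    """Recall (true positive rate): fraction of actual positives that were found."""
    return tp / (tp + fn)

# Assumed layout per matrix: (TP, FP, FN, TN), in the order the options are listed.
matrices = {
    "a": (4, 3, 6, 87),
    "b": (8, 11, 2, 79),
    "c": (5, 0, 5, 90),
    "d": (2, 4, 8, 86),
}
recalls = {k: recall(*m) for k, m in matrices.items()}
best = max(recalls, key=recalls.get)
print(best, recalls[best])  # b 0.8
```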

Q5. Suppose we have a binary classification problem wherein we need to achieve a low False Positive Rate (FPR). On training four classifiers and evaluating them, we obtain the following confusion matrices. Each matrix has the format indicated below:
Which of these classifiers should we prefer?
a) [4 6 6 84]
b) [8 13 2 77]
c) [5 2 5 88]
d) [10 4 0 86]

Answer: c) [5 2 5 88]
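As with the recall question, the layout figure is missing here, so this sketch assumes each matrix reads [TP FP FN TN]. Under that assumption the False Positive Rate is FP / (FP + TN).

```python
def false_positive_rate(tp, fp, fn, tn):
    """FPR: fraction of actual negatives that were wrongly predicted positive."""
    return fp / (fp + tn)

# Assumed layout per matrix: (TP, FP, FN, TN), in the order the options are listed.
matrices = {
    "a": (4, 6, 6, 84),
    "b": (8, 13, 2, 77),
    "c": (5, 2, 5, 88),
    "d": (10, 4, 0, 86),
}
fprs = {k: false_positive_rate(*m) for k, m in matrices.items()}
best = min(fprs, key=fprs.get)
print(best)  # c
```

Note that d) has perfect recall under this layout, but its FPR (4/90) is still higher than that of c) (2/90).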

Q6. We have a logistic regression model that computes the probability p(x) that a given input x belongs to the positive class. For a threshold θ∈(0,1), the class labels f(x)∈{negative,positive} are predicted as given below.
f(x) = negative, if p(x) < θ
f(x) = positive, if p(x) ≥ θ
For θ=0.5, we have TPR=0.8 and FPR=0.3. Then which of the following statement(s) is/are correct?
a) For θ=0.4, the FPR could be lower than 0.25.
b) For θ=0.4, the FPR could be higher than 0.45.
c) For θ=0.6, the TPR must be higher than 0.85.
d) For θ=0.6, the TPR could be higher than 0.85.
e) For θ=0.4, the TPR must be lower than 0.75.
f) For θ=0.4, the TPR could be lower than 0.75.

Answer: b) For θ=0.4, the FPR could be higher than 0.45.
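Lowering the threshold can only move more points into the positive class, so neither TPR nor FPR can decrease; raising it has the opposite effect. The sketch below illustrates this on made-up scores (the data and the helper name rates are hypothetical, chosen only for illustration).

```python
def rates(scores, labels, theta):
    """Return (TPR, FPR) when predicting positive iff score >= theta."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= theta)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= theta)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

# Hypothetical predicted probabilities: first five are positives, last five negatives.
scores = [0.95, 0.90, 0.70, 0.55, 0.45, 0.30, 0.65, 0.58, 0.42, 0.20]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

tpr_04, fpr_04 = rates(scores, labels, 0.4)
tpr_05, fpr_05 = rates(scores, labels, 0.5)
tpr_06, fpr_06 = rates(scores, labels, 0.6)

# Both rates are non-increasing as the threshold goes up.
print(tpr_04 >= tpr_05 >= tpr_06, fpr_04 >= fpr_05 >= fpr_06)  # True True
```

This is why, at θ=0.4, the FPR cannot drop below its value at θ=0.5 but could rise above 0.45, and why, at θ=0.6, the TPR cannot exceed its value at θ=0.5.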

Q7. Consider the following statements.
Statement P: Boosting takes multiple weak classifiers and combines them into a strong classifier.
Statement Q: Boosting assigns equal weights to the predictions of all the weak classifiers, resulting in a high overall performance.
a) P is True. Q is True. Q is the correct explanation for P.
b) P is True. Q is True. Q is not the correct explanation for P.
c) P is True. Q is False.
d) Both P and Q are False.

Answer: c) P is True. Q is False.

Q8. Which of the following statement(s) about ensemble methods is/are correct?
a) The individual classifiers in bagging cannot be trained in parallel.
b) The individual classifiers in boosting cannot be trained in parallel.
c) A committee machine can consist of different kinds of classifiers like SVM, decision trees and logistic regression.
d) Bagging further increases the variance of an unstable classifier.

Answer: b), c)

More Nptel Courses: https://progiez.com/nptel-assignment-answers

Session: JULY-DEC 2023

Course Name: Introduction to Machine Learning

Q1. What is bootstrapping in the context of machine learning?
A technique to improve model training speed.
A method to reduce the size of the dataset.
Creating multiple datasets by randomly sampling with replacement.
A preprocessing step to normalize data.

Answer: Creating multiple datasets by randomly sampling with replacement.

Q2. Which of the following is NOT a benefit of cross-validation?
Reduces the risk of overfitting.
Provides a more accurate estimate of model performance.
Allows for better understanding of model bias.
Increases the size of the training dataset.

Answer: Increases the size of the training dataset.

Q3. Bagging is an ensemble method that:
Focuses on boosting the performance of a single weak learner.
Trains multiple models sequentially, each learning from the mistakes of the previous one.
Combines predictions of multiple models to improve overall accuracy.
Utilizes a committee of diverse models for prediction.

Answer: Combines predictions of multiple models to improve overall accuracy.

Q4. Which evaluation measure is more suitable for imbalanced classification problems?
Accuracy
Precision
F1-score
Mean Squared Error

Answer: F1-score

Q5. What does the ROC curve represent?
The trade-off between precision and recall.
The relationship between accuracy and F1-score.
The performance of a model across various thresholds.
The distribution of classes in a dataset.

Answer: The performance of a model across various thresholds.

Q6. Which ensemble method involves training multiple models in such a way that each model corrects the errors of the previous model?
Bagging
Stacking
Boosting
Committee Machines

Answer: Boosting

Q7. In a ROC curve, what does the diagonal line represent?
The perfect classifier
Random guessing
Trade-off between sensitivity and specificity
The ideal threshold for classification

Answer: Random guessing

Q8. In k-fold cross-validation, how is the dataset divided for training and testing?
The dataset is randomly shuffled and divided into k equal parts. One part is used for testing and the remaining k-1 parts are used for training.
The dataset is split into two equal parts: one for training and the other for testing.
The dataset is divided into k equal parts. One part is used for testing and the remaining k-1 parts are used for training in each iteration.
The dataset is divided into k unequal parts based on data distribution.

Answer: The dataset is divided into k equal parts. One part is used for testing and the remaining k-1 parts are used for training in each iteration.

Q9. What is the primary advantage of ensemble methods over individual models?
Simplicity of implementation
Lower computational complexity
Increased Robustness
Faster training time

Answer: Increased Robustness

Session: JAN-APR 2023

Course Name: Introduction to Machine Learning

Q1. For the given confusion matrix, compute the recall

a. 0.73
b. 0.7
c. 0.6
d. 0.67
e. 0.78
f. None of the above

Answer: d. 0.67

Q2. You have 2 multi-class classifiers A and B. A has accuracy = 0% and B has accuracy = 50%. Which classifier is more useful?
a. A
b. B
c. Both are equally good
d. Depends on the number of classes

Answer: d. Depends on the number of classes

Q3. For large datasets, we should always choose a large k for k-fold cross-validation to get better performance on the test set.
a. True
b. False

Answer: b. False

Q4. We have a dataset with 1000 samples and 5 classes for classification. What would be the training size for a 20 fold cross validation?
a. 50
b. 200
c. 800
d. 950

Answer: d. 950
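The arithmetic behind this answer: in k-fold cross-validation one fold of size N/k is held out and the remaining k-1 folds are used for training. A minimal sketch, assuming N is divisible by k (the function name train_size is hypothetical):

```python
def train_size(n_samples, k):
    """Training-set size per fold in k-fold CV, assuming n_samples is divisible by k."""
    test_size = n_samples // k   # one fold is held out
    return n_samples - test_size

print(train_size(1000, 20))  # 950
```

The number of classes does not enter the calculation; it only matters when the folds are stratified.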

Q5. Which of the following are true?
TP – True Positive, TN – True Negative, FP – False Positive, FN – False Negative
a. Precision = TP/(TP+FP)
b. Recall = TP/(TP+FN)
c. Accuracy = 2(TP+TN)/(TP+TN+FP+FN)
d. Recall = FP/(TP+FP)

Answer: a, b
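The standard definitions can be written directly as functions. The counts used below are hypothetical and serve only to exercise the formulas; note that accuracy has no factor of 2 in its numerator.

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical counts for illustration only.
tp, fp, fn, tn = 6, 2, 3, 9
print(precision(tp, fp))          # 0.75
print(accuracy(tp, tn, fp, fn))   # 0.75
print(round(recall(tp, fn), 3))   # 0.667
print(round(f1_score(tp, fp, fn), 3))
```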

Q6. In the ROC plot, what are the quantities along x and y axes respectively?
a. Precision, Recall
b. Recall, Precision
c. True Positive Rate, False Positive Rate
d. False Positive Rate, True Positive Rate
e. Specificity, Sensitivity
f. True Positive, True Negative
g. True Negative, True Positive

Answer: d. False Positive Rate, True Positive Rate
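An ROC curve is traced by sweeping the decision threshold and recording one (FPR, TPR) point per threshold, with FPR on the x-axis and TPR on the y-axis. The sketch below builds the points for a small made-up score list and integrates them with the trapezoidal rule; the data and helper names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, starting at (0, 0), for thresholds from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for theta in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= theta)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= theta)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve via the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical scores and true labels for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
print(points[-1])      # (1.0, 1.0)
print(round(area, 3))  # 0.889
```

A classifier that guesses at random traces the diagonal from (0, 0) to (1, 1), giving an area of 0.5.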

Q7. How does bagging help in improving the classification performance?
a. If the parameters of the resultant classifiers are fully uncorrelated (independent), then bagging is inefficient.
b. It helps reduce variance
c. If the parameters of the resultant classifiers are fully correlated, then bagging is inefficient.
d. It helps reduce bias

Answer: b, c

Q8. Which method among bagging and stacking should be chosen in case of limited training data, and what is the appropriate reason for your preference?
a. Bagging, because we can combine as many classifiers as we want by training each on a different sample of the training data
b. Bagging, because we use the same classification algorithms on all samples of the training data
c. Stacking, because we can use different classification algorithms on the training data
d. Stacking, because each classifier is trained on all of the available data

Answer: d. Stacking, because each classifier is trained on all of the available data

Q9. Which of the following statements are false when comparing Committee Machines and Stacking?
a. Committee Machines are, in general, special cases of 2-layer stacking where the second-layer classifier provides uniform weightage.
b. Both Committee Machines and Stacking have similar mechanisms, but Stacking uses different classifiers while Committee Machines use similar classifiers.
c. Committee Machines are more powerful than Stacking
d. Committee Machines are less powerful than Stacking

Answer: b, c

More Nptel courses: https://progiez.com/nptel

Session: JUL-DEC 2022

Course Name: INTRODUCTION TO MACHINE LEARNING

Q1. You have 2 binary classifiers A and B. A has accuracy=0% and B has accuracy=50%. Which classifier is more useful?
a. A
b. B
c. Both are good
d. Cannot say

Answer: a. A

Q2. You have 2 multi-class classifiers A and B. A has accuracy=0% and B has accuracy=50%. Which classifier is more useful?
a. A
b. B
c. Both are good
d. Cannot say

Answer: d. Cannot say

Q3. Using the bootstrap approach for sampling, the new dataset will have _________ of the original samples on expectation.
a. 50.0%
b. 56.8%
c. 63.2%
d. 73.6%

Answer: c. 63.2%
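When n samples are drawn with replacement from a dataset of size n, a given original sample is missed by every draw with probability (1 - 1/n)^n, so the expected fraction of distinct original samples in the bootstrap set is 1 - (1 - 1/n)^n, which approaches 1 - 1/e ≈ 63.2% as n grows. A minimal numerical check (the function name is hypothetical):

```python
import math

def expected_unique_fraction(n):
    """Expected fraction of distinct original samples in a bootstrap sample of size n."""
    return 1 - (1 - 1 / n) ** n

limit = 1 - math.exp(-1)  # value as n grows without bound

for n in (10, 100, 1000):
    print(n, round(expected_unique_fraction(n), 4))
print(round(limit, 4))  # 0.6321
```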

Q4. You have a special case where your data has 10 classes and is sorted according to target labels. You attempt 5-fold cross validation by selecting the folds sequentially. What can you say about your resulting model?
a. It will have 100% accuracy.
b. It will have 0% accuracy.
c. Accuracy will depend on how well the model performs.
d. Accuracy will depend on the compute power available for training.

Answer: b. It will have 0% accuracy.
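With labels sorted by class and folds taken sequentially, each test fold contains only classes that never appear in the corresponding training set, so a standard classifier cannot predict any test label correctly. The sketch below demonstrates the disjointness on a made-up dataset; the choice of 20 samples per class is an assumption for illustration.

```python
# 10 classes, sorted by label, 20 samples per class (hypothetical size).
labels = [c for c in range(10) for _ in range(20)]

k = 5
fold_size = len(labels) // k
overlaps = []
for i in range(k):
    test = labels[i * fold_size:(i + 1) * fold_size]
    train = labels[:i * fold_size] + labels[(i + 1) * fold_size:]
    # Classes present in both the training set and the held-out fold.
    overlaps.append(set(test) & set(train))

print(overlaps)  # [set(), set(), set(), set(), set()]
```

Shuffling the data, or using stratified folds, removes the problem.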

Q5. Given the following information
What is the precision and recall?
a. 0.5, 0.4375
b. 0.7, 0.636
c. 0.6, 0.636
d. 0.7, 0.4375
e. None of the above

Answer: e. None of the above

Q6. AUC for your newly trained model is 0.5. Is your model prediction completely random?
a. Yes
b. No
c. ROC curve is needed to derive this conclusion
d. Cannot be determined even with ROC

Answer: d. Cannot be determined even with ROC

Q7. What is the effect of using bagging on weak classifiers for variance?
a. Increases variance
b. Reduces variance
c. Does not change

Answer: b. Reduces variance

Q8. You are building a model to detect cancer. Which metric will you prefer for evaluating your model?
a. Accuracy
b. Sensitivity
c. Specificity
d. MSE

Answer: b. Sensitivity

Q9. You are building a model to detect a mild medical condition for which further testing costs are extremely expensive. Which metric will you prefer for evaluating your model?
a. Accuracy
b. Sensitivity
c. Specificity
d. MSE

Answer: c. Specificity

Q10. A: Boosting takes many weak learners and combines them into a strong learner.
B: Boosting determines the proportion of importance each weak learner should be assigned and weighs its prediction by it and combines them to make the final prediction.
a. A is True. B is True. B is the correct explanation for A.
b. A is True. B is True. B is not the correct explanation for A.
c. A is True. B is False.
d. Both A and B are False.

Answer: a. A is True. B is True. B is the correct explanation for A.

More NPTEL Solutions: https://progiez.com/nptel