# Introduction to Machine Learning | Week 1

Session: JAN-APR 2024

Course name: Introduction to Machine Learning

#### Q1. Which of the following is/are unsupervised learning problem(s)?
Grouping documents into different categories based on their topics
Forecasting the hourly temperature in a city based on historical temperature patterns
Identifying close-knit communities of people in a social network
Training an autonomous agent to drive a vehicle
Identifying different species of animals from images

Grouping documents into different categories based on their topics
Identifying close-knit communities of people in a social network

Q2. Which of the following statement(s) about Reinforcement Learning (RL) is/are true?
While learning a policy, the goal is to maximize the long-term reward.
During training, the agent is explicitly provided the most optimal action to be taken in each state.
The state of the environment changes based on the action taken by the agent.
RL is used for building agents to play chess.
RL is used for predicting the prices of apartments from their features.

While learning a policy, the goal is to maximize the long-term reward.
The state of the environment changes based on the action taken by the agent.
RL is used for building agents to play chess.

Q3. Which of the following is/are classification task(s)?
Predicting whether an email is spam or not spam
Predicting the number of COVID cases over a given period
Predicting the score of a cricket team
Identifying the language of a text document

Predicting whether an email is spam or not spam
Identifying the language of a text document

These are Introduction to Machine Learning Week 1 Assignment 1 Answers

Q4. Which of the following is/are regression task(s)?
Predicting whether or not a customer will repay a loan based on their credit history
Forecasting the amount of rainfall in a given place
Identifying the types of crops from aerial images of farms
Predicting the future price of a stock

Forecasting the amount of rainfall in a given place
Predicting the future price of a stock

Q5. Consider the following dataset. Fit a linear regression model of the form $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ using the mean-squared error loss. Using this model, the predicted value of y at the point (x1,x2)=(0.5,−1.0) is
−0.651
−0.737
0.245
−0.872
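
The dataset for this question did not survive on this page, so the following is only a minimal sketch of the procedure, assuming a small made-up dataset (the arrays `X` and `y` and the helper names are hypothetical, not the assignment's data). It fits the model with an intercept and two inputs by ordinary least squares, which minimizes the mean-squared error, and evaluates the fit at (0.5, −1.0).

```python
import numpy as np

# Hypothetical data, NOT the assignment's dataset.
# Targets were generated exactly from y = 1 + 2*x1 - 1*x2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]])
y = np.array([1.0, 3.0, 0.0, 2.0, 6.0])

def fit_linear_regression(X, y):
    """Return [b0, b1, b2] minimizing the mean-squared error."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, x1, x2):
    """Evaluate b0 + b1*x1 + b2*x2."""
    return beta[0] + beta[1] * x1 + beta[2] * x2

beta = fit_linear_regression(X, y)
prediction = predict(beta, 0.5, -1.0)
```

With the assignment's actual table substituted for `X` and `y`, `prediction` would be the value to compare against the options.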

Q6. Consider the following dataset. Using a k-nearest neighbour (k-NN) regression model with k=3, predict the value of y at (x1,x2)=(0.5,−1.0). Use the Euclidean distance to find the nearest neighbours.
−1.762
−2.061
−1.930
−1.529
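
As a rough illustration of k-NN regression with Euclidean distance, here is a sketch on made-up points (the list `train` and the function name `knn_regress` are hypothetical, since the assignment's dataset is not reproduced on this page). The prediction is the average target of the k closest training points.

```python
import math

# Hypothetical training points ((x1, x2), y), NOT the assignment's dataset.
train = [((0.0, -1.0), -2.0), ((1.0, -1.0), -1.0), ((0.5, 0.0), 0.0),
         ((3.0, 3.0), 10.0), ((-2.0, 2.0), 5.0)]

def knn_regress(train, query, k=3):
    """Average the targets of the k training points nearest to query."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / k

prediction = knn_regress(train, (0.5, -1.0), k=3)
```

Note that the training points themselves are needed at prediction time, unlike a fitted linear model, which only needs its coefficients.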

Q7. Consider the following statements regarding linear regression and k-NN regression models. Select the true statements.
A linear regressor requires the training data points during inference.
A k-NN regressor requires the training data points during inference.
A k-NN regressor with a higher value of k is less prone to overfitting.
A linear regressor partitions the input space into multiple regions such that the prediction over a given region is constant.

A k-NN regressor requires the training data points during inference.
A k-NN regressor with a higher value of k is less prone to overfitting.

Q8. Consider a binary classification problem where we are given certain measurements from a blood test and need to predict whether the patient does not have a particular disease (class 0) or has the disease (class 1). In this problem, false negatives (incorrectly predicting that the patient is healthy) have more serious consequences as compared to false positives (incorrectly predicting that the patient has the disease). Which of the following is an appropriate cost matrix for this classification problem? The row denotes the true class and the column denotes the predicted class.
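[[0, 0], [100, 0]]
[[0, 1], [100, 0]]
[[0, 1], [1, 0]]
[[0, 100], [1, 0]]
[[0, 100], [0, 0]]

Answer: [[0, 1], [100, 0]] (the second option: a false negative costs 100, a false positive costs 1)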
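
To see why this matrix fits the problem, here is a small sketch that reads the selected answer as a 2×2 matrix (rows are the true class, columns the predicted class) and totals the cost of a set of predictions. The label lists and the helper `total_cost` are made up for illustration.

```python
# Cost matrix: rows = true class, columns = predicted class.
# cost[0][1] is a false positive, cost[1][0] is a false negative.
cost = [[0, 1],
        [100, 0]]

def total_cost(y_true, y_pred, cost):
    """Sum the cost-matrix entry for every (true, predicted) pair."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))

# Hypothetical labels, for illustration only.
y_true = [0, 0, 1, 1, 1]
one_false_positive = [1, 0, 1, 1, 1]
one_false_negative = [0, 0, 0, 1, 1]

fp_cost = total_cost(y_true, one_false_positive, cost)
fn_cost = total_cost(y_true, one_false_negative, cost)
```

A single missed patient is penalized far more heavily than a single false alarm, which is the asymmetry the question asks for.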

Q9. Consider the following dataset with three classes: 0, 1 and 2. x1 and x2 are the independent variables whereas y is the class label. Using a k-NN classifier with k = 3, predict the class label at the point (x1,x2)=(0.7,−0.8). Use the Euclidean distance to find the nearest neighbours.
0
1
2
Cannot be predicted
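
The classification variant of k-NN replaces the average with a majority vote. The sketch below uses made-up labelled points (`train` and `knn_classify` are hypothetical names; the assignment's dataset is not shown on this page).

```python
import math
from collections import Counter

# Hypothetical labelled points ((x1, x2), class), NOT the assignment's dataset.
train = [((1.0, -1.0), 2), ((0.5, -0.5), 2), ((0.0, 0.0), 0),
         ((-1.0, 1.0), 1), ((2.0, 2.0), 1)]

def knn_classify(train, query, k=3):
    """Return the most common label among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # If all k neighbours had different labels the vote would tie; this
    # sketch does not define a tie-breaking rule for that case.
    return votes.most_common(1)[0][0]

label = knn_classify(train, (0.7, -0.8), k=3)
```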

Q10. Suppose that we train two kinds of regression models corresponding to the following equations.
(i) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
(ii) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
Which of the following statement(s) is/are correct?

On a given training dataset, the mean-squared error of (i) is always greater than or equal to that of (ii).
(i) is likely to have a higher variance than (ii).
(ii) is likely to have a higher variance than (i).
If (ii) overfits the data, then (i) will definitely overfit.
If (ii) underfits the data, then (i) will definitely underfit.

On a given training dataset, the mean-squared error of (i) is always greater than or equal to that of (ii).
(ii) is likely to have a higher variance than (i).
If (ii) underfits the data, then (i) will definitely underfit.
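
The first statement follows because model (i) is model (ii) with the interaction coefficient fixed at zero, so the least-squares fit of (ii) can never do worse on the training data. A quick numerical check on synthetic data (the data-generating values below are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)  # synthetic data, for illustration only
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1.0 + 2.0 * x1 - x2 + 0.5 * x1 * x2 + rng.normal(scale=0.1, size=50)

def training_mse(features, y):
    """Fit least squares with an intercept and return the training MSE."""
    A = np.column_stack([np.ones(len(y))] + features)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ beta - y) ** 2))

mse_i = training_mse([x1, x2], y)            # model (i)
mse_ii = training_mse([x1, x2, x1 * x2], y)  # model (ii) adds x1*x2
```

The comparison only concerns training error; the extra term is also what gives (ii) its higher variance.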

Session: JULY-DEC 2023

Course Name: Introduction to Machine Learning

Q1. Which of the following is a supervised learning problem?
Grouping related documents from an unannotated corpus.
Predicting credit approval based on historical data.
Predicting whether a new image contains a cat or a dog based on historical data of other cat and dog images, where each image is labelled as cat or dog.
Fingerprint recognition of a particular person used in biometric attendance from the fingerprint data of various other people and that particular person.

Q2. Which of the following are classification problems?
Predict the runs a cricketer will score in a particular match.
Predict which team will win a tournament.
Predict whether it will rain today.

Q3. Which of the following is a regression task?
Predicting the monthly sales of a cloth store in rupees.
Predicting if a user would like to listen to a newly released song or not based on historical data.
Predicting the confirmation probability (in fraction) of your train ticket whose current status is waiting list based on historical data.
Predicting if a patient has diabetes or not based on historical medical records.
Predicting if a customer is satisfied or unsatisfied with a product purchased from an e-commerce website using the reviews he/she wrote for the purchased product.

Q4. Which of the following is an unsupervised learning task?
Group audio files based on language of the speakers.
Group applicants to a university based on their nationality.
Predict a student’s performance in the final exams.
Predict the trajectory of a meteorite.

Q5. Which of the following is a categorical feature?
Number of rooms in a hostel.
Gender of a person
Ethnicity of a person
Area (in sq. centimeter) of your laptop screen.
The color of the curtains in your room.
Number of legs of an animal.
Minimum RAM requirement (in GB) of a system to play a game like FIFA, DOTA.

Q6. Which of the following is a reinforcement learning task?
Learning to drive a cycle
Learning to predict stock prices
Learning to play chess
Learning to predict spam labels for e-mails

Q7. Let X and Y be uniformly distributed random variables over the intervals [0,4] and [0,6], respectively. If X and Y are independent, compute the probability P(max(X,Y)>3).
1/6
5/6
2/3
1/2
2/6
5/8
None of the above
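
One way to work this out is through the complement: max(X,Y) ≤ 3 exactly when both X ≤ 3 and Y ≤ 3, and independence lets the two probabilities multiply. The sketch below does the arithmetic with exact fractions (the function name is hypothetical, and it assumes a non-negative threshold).

```python
from fractions import Fraction

def prob_max_exceeds(t, upper_x, upper_y):
    """P(max(X, Y) > t) for independent X ~ U[0, upper_x], Y ~ U[0, upper_y].

    Assumes t >= 0.
    """
    p_x_le_t = Fraction(min(t, upper_x), upper_x)  # P(X <= t)
    p_y_le_t = Fraction(min(t, upper_y), upper_y)  # P(Y <= t)
    return 1 - p_x_le_t * p_y_le_t

p = prob_max_exceeds(3, 4, 6)  # 1 - (3/4)(1/2)
```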

Q8. Find the mean of 0-1 loss for the given predictions:
1
0
1.5
0.5
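
The predictions for this question are not reproduced on this page. As a sketch of the computation only, the 0-1 loss is 0 for a correct prediction and 1 for a wrong one, and its mean is the fraction of mistakes (the labels below are made up).

```python
def mean_zero_one_loss(y_true, y_pred):
    """Fraction of predictions that differ from the true label."""
    losses = [0 if t == p else 1 for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)

# Hypothetical labels, NOT the assignment's table.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 1, 0]
loss = mean_zero_one_loss(y_true, y_pred)  # one mistake out of five
```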

Q9. Which of the following statements are true? Check all that apply.
A model with more parameters is more prone to overfitting and typically has higher variance.
If a learning algorithm is suffering from high bias, only adding more training examples may not improve the test error significantly.
When debugging learning algorithms, it is useful to plot a learning curve to understand if there is a high bias or high variance problem.
If a neural network has much lower training error than test error, then adding more layers will help bring the test error down because we can fit the test set better.

Q10. Bias and variance are given by:
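$E[\hat{f}(x)] - f(x),\; E[(E[\hat{f}(x)] - \hat{f}(x))^2]$
$E[\hat{f}(x)] - f(x),\; E[(E[\hat{f}(x)] - \hat{f}(x))]^2$
$(E[\hat{f}(x)] - f(x))^2,\; E[(E[\hat{f}(x)] - \hat{f}(x))^2]$
$(E[\hat{f}(x)] - f(x))^2,\; E[(E[\hat{f}(x)] - \hat{f}(x))]^2$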
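
For reference, the standard definitions at a fixed input x, with the expectation taken over training sets, can be written as below. Under these definitions the expected squared error of the estimate against the noise-free f(x) splits into squared bias plus variance.

```latex
\begin{aligned}
\operatorname{Bias}\big[\hat{f}(x)\big] &= E[\hat{f}(x)] - f(x) \\
\operatorname{Var}\big[\hat{f}(x)\big] &= E\big[(E[\hat{f}(x)] - \hat{f}(x))^2\big] \\
E\big[(\hat{f}(x) - f(x))^2\big] &= \operatorname{Bias}\big[\hat{f}(x)\big]^2 + \operatorname{Var}\big[\hat{f}(x)\big]
\end{aligned}
```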

Session: JAN-APR 2023

Course Name: Introduction to Machine Learning

#### Q1) Which of the following is a supervised learning problem?
a. Grouping related documents from an unannotated corpus.
b. Predicting credit approval based on historical data
c. Predicting rainfall based on historical data
d. Predicting if a customer is going to return or keep a particular product he/she purchased from e-commerce website based on the historical data about the customer purchases and the particular product.
e. Fingerprint recognition of a particular person used in biometric attendance from the fingerprint data of various other people and that particular person

Q2) Which of the following is not a classification problem?
a. Predicting the temperature (in Celsius) of a room from other environmental features (such as atmospheric pressure, humidity etc).
b. Predicting if a cricket player is a batsman or bowler given his playing records.
c. Predicting the price of a house (in INR) based on data consisting of the prices of other houses (in INR) and their features such as area, number of rooms, location etc.
d. Filtering of spam messages
e. Predicting the weather for tomorrow as “hot”, “cold”, or “rainy” based on historical data on wind speed, humidity, temperature, and precipitation.

Q3) Which of the following is a regression task? (multiple options may be correct)
a. Predicting the monthly sales of a cloth store in rupees.
b. Predicting if a user would like to listen to a newly released song or not based on historical data.
c. Predicting the confirmation probability (in fraction) of your train ticket whose current status is waiting list based on historical data.
d. Predicting if a patient has diabetes or not based on historical medical records.
e. Predicting if a customer is satisfied or unsatisfied with a product purchased from an e-commerce website using the reviews he/she wrote for the purchased product.

Q4) Which of the following is an unsupervised task?
a. Predicting if a new edible item is sweet or spicy based on the information of the ingredients, their quantities, and labels (sweet or spicy) for many other similar dishes.
b. Grouping related documents from an unannotated corpus.
c. Grouping of hand-written digits from their image.
d. Predicting the time (in days) a PhD student will take to complete his/her thesis to earn a degree based on the historical data such as qualifications, department, institute, research area, and time taken by other scholars to earn the degree.
e. all of the above

Q5) Which of the following is a categorical feature?
a. Number of rooms in a hostel.
b. Minimum RAM requirement (in GB) of a system to play a game like FIFA, DOTA.
c. Your weekly expenditure in rupees.
d. Ethnicity of a person
e. Area (in sq. centimeter) of your laptop screen.
f. The color of the curtains in your room.

Q6) Let X and Y be uniformly distributed random variables over the intervals [0, 4] and [0, 6], respectively. If X and Y are independent, compute the probability P(max(X,Y)>3).
a. 1/6
b. 5/6
c. 2/3
d. 1/2
e. 2/6
f. 5/8
g. None of the above

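Q7) Let the trace and determinant of a matrix $A = \begin{bmatrix} a & c \\ b & d \end{bmatrix}$ be 6 and 16 respectively. The eigenvalues of A are
a. $\frac{3+i\sqrt{7}}{2}, \frac{3-i\sqrt{7}}{2}$, where $i=\sqrt{-1}$
b. 1, 3
c. $\frac{3+i\sqrt{7}}{4}, \frac{3-i\sqrt{7}}{4}$, where $i=\sqrt{-1}$
d. 1/2, 3/2
e. $3+i\sqrt{7}, 3-i\sqrt{7}$, where $i=\sqrt{-1}$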
f. 2,8
g. None of the above
h. Can be computed only if A is a symmetric matrix.
i. Can be computed only if A is a symmetric matrix.
j. Can not be computed as the entries of the matrix A are not given.
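
For a 2×2 matrix the eigenvalues depend only on the trace and determinant, because the characteristic polynomial is λ² − (trace)λ + (determinant). A short sketch that solves this quadratic, allowing complex roots (the function name is hypothetical):

```python
import cmath

def eigenvalues_2x2(trace, det):
    """Roots of lambda^2 - trace*lambda + det = 0."""
    root = cmath.sqrt(trace * trace - 4 * det)  # complex if discriminant < 0
    return (trace + root) / 2, (trace - root) / 2

lam1, lam2 = eigenvalues_2x2(6, 16)  # discriminant is 36 - 64 = -28
```

Here the discriminant is negative, so the two eigenvalues form a complex-conjugate pair with real part 3 and imaginary part ±√7.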

Q8) What happens when your model complexity increases? (multiple options may be correct)
a. Model Bias decreases
b. Model Bias increases
c. Variance of the model decreases
d. Variance of the model increases

Q9) A new phone, E-Corp X1 has been announced and it is what you’ve been waiting for, all along. You decide to read the reviews before buying it. From past experiences, you’ve figured out that good reviews mean that the product is good 90% of the time and bad reviews mean that it is bad 70% of the time. Upon glancing through the reviews section, you find out that the X1 has been reviewed 1269 times and only 172 of them were bad reviews. What is the probability that, if you order the X1, it is a bad phone?
a. 0.136
b. 0.160
c. 0.360
d. 0.840
e. 0.773
f. 0.573
g. 0.181
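
One reading of this question applies the law of total probability: treat the share of bad reviews as the chance that a randomly chosen review is bad, then weight the chance of a bad phone under each kind of review. The sketch below follows that assumption; the function and parameter names are made up for illustration.

```python
def prob_bad_product(n_reviews, n_bad, p_good_given_good_review,
                     p_bad_given_bad_review):
    """P(bad) = P(bad | good review) P(good review) + P(bad | bad review) P(bad review)."""
    p_bad_review = n_bad / n_reviews
    p_good_review = 1 - p_bad_review
    return ((1 - p_good_given_good_review) * p_good_review
            + p_bad_given_bad_review * p_bad_review)

p_bad = prob_bad_product(1269, 172, 0.9, 0.7)
```

Under this reading the result rounds to 0.181.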

Q10) Which of the following are false about bias and variance of overfitted and underfitted models? (multiple options may be correct)
a. Underfitted models have high bias.
b. Underfitted models have low bias.
c. Overfitted models have low variance.
d. Overfitted models have high variance.

Session: JUL-DEC 2022

1. Which of the following are supervised learning problems? (multiple may be correct)

a. Learning to drive using a reward signal.
b. Predicting disease from blood sample.
c. Grouping students in the same class based on similar features.
d. Face recognition to unlock your phone.

2. Which of the following are classification problems? (multiple may be correct)

a. Predict the runs a cricketer will score in a particular match.
b. Predict which team will win a tournament.
c. Predict whether it will rain today.

3. Which of the following is a regression task? (multiple options may be correct)

a. Predict the price of a house 10 years after it is constructed.
b. Predict if a house will be standing 50 years after it is constructed.
c. Predict the weight of food wasted in a restaurant during next month.
d. Predict the sales of a new Apple product.

4. Which of the following is an unsupervised learning task? (multiple options may be correct)

a. Group audio files based on language of the speakers.
b. Group applicants to a university based on their nationality.
c. Predict a student’s performance in the final exams.
d. Predict the trajectory of a meteorite.

5. Given below is your dataset. You are using KNN regression with K=3. What is the prediction for a new input value (3, 2)?

6. Which of the following is a reinforcement learning task? (multiple options may be correct)

7. Find the mean of squared error for the given predictions:
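
The table of predictions for this question is missing from this page. As a sketch of the calculation only, the mean squared error averages the squared differences between targets and predictions (the values below are made up).

```python
def mean_squared_error(y_true, y_pred):
    """Average of (target - prediction)^2 over all examples."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical values, NOT the assignment's table.
y_true = [2.0, 4.0, 6.0]
y_pred = [1.0, 4.0, 8.0]
mse = mean_squared_error(y_true, y_pred)  # (1 + 0 + 4) / 3
```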

8. Find the mean of 0-1 loss for the given predictions:

9. Bias and variance are given by:

10. Which of the following are true about bias and variance? (multiple options may be correct)