# Introduction to Machine Learning | Week 11

Session: JAN-APR 2024

Course name: Introduction to Machine Learning

#### Q1. Which of the following is/are estimated by the Expectation Maximization (EM) algorithm for a Gaussian Mixture Model (GMM)?
K (number of components)
π_k (mixing coefficient of each component)
μ_k (mean vector of each component)
Σ_k (covariance matrix of each component)
None of the above

Q2. Which of the following is/are true about the responsibility terms in GMMs? Assume the standard notation used in the lectures.
∑_k γ(z_nk) = 1, ∀n
∑_n γ(z_nk) = 1, ∀k
γ(z_nk) ∈ {0, 1}, ∀n, k
0 < γ(z_nk) < 1, ∀n, k
π_j > π_k ⟹ γ(z_nj) > γ(z_nk), ∀n

These are Introduction to Machine Learning Week 11 Assignment 11 Answers

Q3. What is the update equation for μ_k in the EM algorithm for GMM?
a) μ_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m)} x_n / ∑_{n=1}^N γ(z_nk)|_{v^(m−1)}
b) μ_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m−1)} x_n / ∑_{n=1}^N γ(z_nk)|_{v^(m−1)}
c) μ_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m−1)} x_n / N
d) μ_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m)} x_n / N

Q4. Select the correct statement(s) about the EM algorithm for GMMs.
In the m-th iteration, the γ(z_nk) values are computed using the parameter estimates computed in the same iteration.
In the m-th iteration, the γ(z_nk) values are computed using the parameter estimates computed in the (m−1)-th iteration.
The Σ_k parameter estimates are computed during the E step.
The π_k parameter estimates are computed during the M step.
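As a concrete reference for the update order asked about in Q3 and Q4, here is a minimal numpy/scipy sketch of one EM iteration for a GMM (not taken from the lectures; variable names are illustrative): the responsibilities are computed in the E step from the previous iteration's parameters, and μ_k and π_k are then re-estimated in the M step from those fresh responsibilities.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_iteration(X, pi, mu, sigma):
    """One EM iteration for a GMM. pi, mu, sigma are the parameter
    estimates from the previous iteration (m-1)."""
    N, K = X.shape[0], len(pi)

    # E step: responsibilities gamma_nk from the (m-1)-th parameters
    gamma = np.array([pi[k] * multivariate_normal.pdf(X, mu[k], sigma[k])
                      for k in range(K)]).T          # shape (N, K)
    gamma /= gamma.sum(axis=1, keepdims=True)        # sum_k gamma_nk = 1

    # M step: re-estimate parameters using the new responsibilities
    Nk = gamma.sum(axis=0)                           # effective counts
    mu_new = (gamma.T @ X) / Nk[:, None]             # weighted means
    pi_new = Nk / N                                  # mixing coefficients
    return gamma, pi_new, mu_new
```

The covariance update follows the same pattern (a responsibility-weighted average), omitted here for brevity.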

Q5. For questions 5 to 7, use the following data consisting of 8 points (x_i, y_i).
Fit a GMM with 2 components to this data. What are the mixing coefficients of the learned components? (Note: use the sklearn implementation of GMM with random_state = 0; do not change the other default parameters.)

(0.791, 0.209)
(0.538, 0.462)
(0.714, 0.286)
(0.625, 0.375)
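A minimal sketch of the fitting step with sklearn. The assignment's data table is not reproduced in this post, so the points below are stand-ins (seven are taken from the pairs listed in Q7, the eighth is assumed); substitute the actual (x_i, y_i) values before reading off the answer.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in data -- replace with the assignment's 8 points.
X = np.array([[1.0, 1.5], [0.9, 1.6], [1.8, 1.2], [8.2, 7.3],
              [7.8, 9.5], [8.8, 7.5], [7.6, 8.0], [5.0, 5.5]])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.weights_)   # mixing coefficients pi_k; they sum to 1
```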

Q6. Using the model trained in question 5, compute the log-likelihood of the following points. Which of these points has the highest likelihood of being sampled from the model?
(2.0, 0.5)
(-1.0, -0.5)
(7.5, 8.0)
(5.0, 5.5)
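The per-point log-likelihood in Q6 comes from `score_samples` on the fitted model; the candidate with the largest value is the most likely to be sampled. Again, the training points below are stand-ins for the assignment's (unreproduced) data table.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in training data -- replace with the assignment's 8 points.
X = np.array([[1.0, 1.5], [0.9, 1.6], [1.8, 1.2], [8.2, 7.3],
              [7.8, 9.5], [8.8, 7.5], [7.6, 8.0], [5.0, 5.5]])
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Log-likelihood of each candidate point under the fitted mixture.
pts = np.array([[2.0, 0.5], [-1.0, -0.5], [7.5, 8.0], [5.0, 5.5]])
log_lik = gmm.score_samples(pts)
print(pts[np.argmax(log_lik)])   # most likely candidate
```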

These are Introduction to Machine Learning Week 11 Assignment 11 Answers

Q7. Let Model A be the GMM with 2 components that was trained in question 5. Using the same data from question 5, estimate a GMM with 3 components (Model B). (Note: use the sklearn implementation of GMM with random_state = 0 and all the other default parameters.)
Select the pair(s) of points that have the same label in Model A but different labels in Model B.

(1.0, 1.5) and (0.9, 1.6)
(1.8, 1.2) and (0.9, 1.6)
(7.8, 9.5) and (8.8, 7.5)
(7.8, 9.5) and (7.6, 8.0)
(8.2, 7.3) and (7.6, 8.0)
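The comparison in Q7 can be checked mechanically: a pair qualifies if its two points share a label under Model A but receive different labels under Model B. The data below is again a stand-in for the assignment's table.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in data -- replace with the assignment's 8 points.
X = np.array([[1.0, 1.5], [0.9, 1.6], [1.8, 1.2], [8.2, 7.3],
              [7.8, 9.5], [8.8, 7.5], [7.6, 8.0], [5.0, 5.5]])

model_a = GaussianMixture(n_components=2, random_state=0).fit(X)
model_b = GaussianMixture(n_components=3, random_state=0).fit(X)

pairs = [((1.0, 1.5), (0.9, 1.6)), ((1.8, 1.2), (0.9, 1.6)),
         ((7.8, 9.5), (8.8, 7.5)), ((7.8, 9.5), (7.6, 8.0)),
         ((8.2, 7.3), (7.6, 8.0))]
for p, q in pairs:
    pq = np.array([p, q])
    la, lb = model_a.predict(pq), model_b.predict(pq)
    # True when the pair satisfies the Q7 condition
    print(p, q, la[0] == la[1] and lb[0] != lb[1])
```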

Q8. Consider the following two statements.
Statement A: In a GMM with two or more components, the likelihood can attain arbitrarily high values.
Statement B: The likelihood increases monotonically with each iteration of EM.
Both the statements are correct and Statement B is the correct explanation for Statement A.
Both the statements are correct, but Statement B is not the correct explanation for Statement A.
Statement A is correct and Statement B is incorrect.
Statement A is incorrect and Statement B is correct.
Both the statements are incorrect.

Answer: Both the statements are correct, but Statement B is not the correct explanation for Statement A.


Session: JULY-DEC 2023

Course Name: Introduction to Machine Learning

Q1. What is the update equation for π_k in the EM algorithm for GMM?
a. π_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m−1)} / (N−1)
b. π_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m)} / N
c. π_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m−1)} / N
d. π_k^(m) = ∑_{n=1}^N γ(z_nk)|_{v^(m)} / (N−1)

Q2. Consider the two statements:
Statement 1: The EM algorithm can only be used for parameter estimation of mixture models.
Statement 2: The Gaussian Mixture Models used for clustering always outperform k-means and single-link clustering.
Which of these are true?

Both the statements are true
Statement 1 is true, and Statement 2 is false
Statement 1 is false, and Statement 2 is true
Both the statements are false

Answer: Both the statements are false

Q3. KNN is a special case of GMM with the following properties: (Select all that apply)
γ_i = (2πϵ)^{−1/2} e^{−1/(2ϵ)}
Covariance = ϵI
μ_i = μ_j, ∀i, j
π_k = 1/k

Q4. What does soft clustering mean in GMMs?
There may be samples that are outside of any cluster boundary.
The updates during maximum likelihood are taken in small steps, to guarantee convergence.
It restricts the underlying distribution to be Gaussian.
Samples are assigned probabilities of belonging to a cluster.

Answer: Samples are assigned probabilities of belonging to a cluster.
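In sklearn this soft assignment is exposed via `predict_proba`: each sample gets a probability of belonging to every cluster rather than a single hard label. A small sketch on toy data (the dataset here is assumed, purely for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy 2-D data with two loose groups and two in-between points.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9],
              [2.5, 2.5], [0.1, 0.2], [4.9, 5.1], [2.4, 2.6]])
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Soft clustering: one probability per (sample, cluster) pair;
# each row sums to 1 across the clusters.
probs = gmm.predict_proba(X)
print(probs.sum(axis=1))
```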

Q5. In Gaussian Mixture Models, π_i are the mixing coefficients. Select the correct conditions that the mixing coefficients need to satisfy for a valid GMM model.
0 ≤ π_i ≤ 1, ∀i
−1 ≤ π_i ≤ 1, ∀i
∑_i π_i = 1
∑_i π_i need not be bounded

Q6. Which statement(s) are true about the expectation-maximization (EM) algorithm?
It requires some assumption about the underlying probability distribution.
Compared to a gradient descent algorithm that optimizes the same objective function as EM, EM may only find a local optimum, whereas gradient descent will always find the global optimum.
The EM algorithm minimizes a lower bound of the marginal likelihood P(D; θ).
The algorithm assumes that some of the data generated by the probability distribution are not observed.

Q7. Consider the two statements:
Statement 1: The EM algorithm can get stuck at saddle points.
Statement 2: EM is guaranteed to converge to a point with zero gradient.
Which of these are true?

Both the statements are true
Statement 1 is true, and Statement 2 is false
Statement 1 is false, and Statement 2 is true
Both the statements are false

Answer: Both the statements are true

Session: JAN-APR 2023

Course Name: Introduction to Machine Learning

#### Q1. Given n samples x_1, x_2, …, x_n drawn independently from an Exponential distribution with unknown parameter λ, find the MLE of λ.
a. λ_MLE = ∑_{i=1}^n x_i
b. λ_MLE = n ∑_{i=1}^n x_i
c. λ_MLE = n / ∑_{i=1}^n x_i
d. λ_MLE = ∑_{i=1}^n x_i / n
e. λ_MLE = (n−1) / ∑_{i=1}^n x_i
f. λ_MLE = ∑_{i=1}^n x_i / (n−1)
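For reference, the standard derivation (not part of the assignment text): with density f(x; λ) = λe^{−λx}, the log-likelihood and its stationary point are

```latex
\log L(\lambda) = \sum_{i=1}^{n} \log\!\left(\lambda e^{-\lambda x_i}\right)
                = n\log\lambda - \lambda\sum_{i=1}^{n} x_i,
\qquad
\frac{d}{d\lambda}\log L(\lambda) = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
\;\Rightarrow\;
\hat{\lambda}_{\mathrm{MLE}} = \frac{n}{\sum_{i=1}^{n} x_i}.
```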

Q2. Given n samples x_1, x_2, …, x_n drawn independently from a Geometric distribution with unknown parameter p, given by the pmf Pr(X = k) = (1 − p)^{k−1} p for k = 1, 2, 3, …, find the MLE of p.
a. p_MLE = ∑_{i=1}^n x_i
b. p_MLE = n ∑_{i=1}^n x_i
c. p_MLE = n / ∑_{i=1}^n x_i
d. p_MLE = ∑_{i=1}^n x_i / n
e. p_MLE = (n−1) / ∑_{i=1}^n x_i
f. p_MLE = ∑_{i=1}^n x_i / (n−1)

Q3. Suppose we are trying to model a p-dimensional Gaussian distribution. What is the number of independent parameters that need to be estimated in the mean and in the covariance matrix, respectively?
a. 1,1
b. p−1,1
c. p,p
d. p,p(p+1)
e. p,p(p+1)/2
f. p,(p+3)/2
g. p−1,p(p+1)
h. p−1,p(p+1)/2+1
i. p−1,(p+3)/2
j. p,p(p+1)−1
k. p,p(p+1)/2−1
l. p,(p+3)/2−1
m. p,p2
n. p,p2/2
o. None of these

Q4. Given n samples x_1, x_2, …, x_n drawn independently from a Poisson distribution with unknown parameter λ, find the MLE of λ.
a. λ_MLE = ∑_{i=1}^n x_i
b. λ_MLE = n ∑_{i=1}^n x_i
c. λ_MLE = n / ∑_{i=1}^n x_i
d. λ_MLE = ∑_{i=1}^n x_i / n
e. λ_MLE = (n−1) / ∑_{i=1}^n x_i
f. λ_MLE = ∑_{i=1}^n x_i / (n−1)
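A quick numerical sanity check (not from the source): for Poisson data, the log-likelihood ∑(x_n log λ − λ − log x_n!) is maximized at the sample mean, matching λ_MLE = ∑ x_i / n. The sketch below scans a grid of λ values on simulated data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(lam=3.0, size=1000)   # simulated Poisson(3) sample

lams = np.linspace(0.5, 10.0, 20000)
# Drop the constant -sum(log x_n!) term; it does not affect the argmax.
loglik = np.sum(x) * np.log(lams) - len(x) * lams
lam_hat = lams[np.argmax(loglik)]
print(lam_hat, x.mean())   # the two values should be nearly identical
```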

Q5. In Gaussian Mixture Models, π_i are the mixing coefficients. Select the correct conditions that the mixing coefficients need to satisfy for a valid GMM model.
a. −1 ≤ π_i ≤ 1, ∀i
b. 0 ≤ π_i ≤ 1, ∀i
c. ∑_i π_i = 1
d. ∑_i π_i need not be bounded

Q6. Expectation-Maximization, or the EM algorithm, consists of two steps: the E-step and the M-step. Using the following notation, select the correct set of equations used at each step of the algorithm.
Notation:
X: known/given variables/data
Z: hidden/unknown variables
θ: total set of parameters to be learned
θ_k: values of all the parameters after stage k
Q(·, ·): the Q-function as described in the lectures

a. E-step: E_{Z|X,θ}[log Pr(X, Z | θ_m)]
b. E-step: E_{Z|X,θ_{m−1}}[log Pr(X, Z | θ)]
c. M-step: argmax_θ ∑_Z Pr(Z | X, θ_{m−2}) · log Pr(X, Z | θ)
d. M-step: argmax_θ Q(θ, θ_{m−1})
e. M-step: argmax_θ Q(θ, θ_{m−2})
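For comparison with the options above, the standard textbook formulation of EM (a reference sketch, not quoted from the lectures) defines the Q-function in the E-step using the previous iteration's parameters and maximizes it in the M-step:

```latex
\text{E-step:}\quad Q(\theta, \theta^{(m-1)})
  = \mathbb{E}_{Z \mid X,\, \theta^{(m-1)}}\!\left[\log \Pr(X, Z \mid \theta)\right],
\qquad
\text{M-step:}\quad \theta^{(m)} = \arg\max_{\theta}\, Q(\theta, \theta^{(m-1)}).
```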