### Data Analytics With Python Week 11 Answers

**Q1.** ________ is used for calculating distance measures in clustering using Python.

a. distance_matrix

b. spatial_matrix

c. scipy_matrix

d. distance.matrix

**Answer: A**
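
As a quick illustration, assuming the option refers to `distance_matrix` from `scipy.spatial`, the sketch below computes pairwise Euclidean distances for three made-up points (the data is hypothetical, not from the course):

```python
import numpy as np
from scipy.spatial import distance_matrix

# Three hypothetical 2-D points chosen so the distances are easy to check.
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# The default p=2 gives the Euclidean (Minkowski order 2) distance.
dist = distance_matrix(points, points)
print(dist)
```

The result is a symmetric 3 x 3 array with zeros on the diagonal; entry `[i, j]` is the distance between point `i` and point `j`.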

**Q2.** The formula for dissimilarity computation between two objects for categorical variables is:

Here p is the total number of categorical variables and m is the number of matches (variables on which objects i and j take the same value).

- D(i,j) = (p - m) / p
- D(i,j) = (p - m) / m
- D(i,j) = (m - p) / p
- D(i,j) = (m - p) / m

**Answer: A**
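
A minimal sketch of the simple matching formula D(i,j) = (p - m) / p, assuming each object is a tuple of categorical values. The function name `categorical_dissimilarity` and the sample objects are hypothetical:

```python
def categorical_dissimilarity(obj_i, obj_j):
    """Return (p - m) / p for two equally long tuples of categorical values."""
    if len(obj_i) != len(obj_j) or len(obj_i) == 0:
        raise ValueError("objects must be non-empty and of equal length")
    p = len(obj_i)                                  # total number of variables
    m = sum(a == b for a, b in zip(obj_i, obj_j))   # number of matches
    return (p - m) / p

# Two hypothetical objects described by colour, shape and size.
a = ("red", "round", "small")
b = ("red", "square", "small")
d = categorical_dissimilarity(a, b)  # two of three variables match
print(d)
```

Identical objects give 0 and objects that differ on every variable give 1.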

**Q3.** For a data set with 7 objects and an interval-scaled variable ‘f’, we have the following measurements: f = (1, 2, 3, 4, 5, 8, 50), which contain one outlying value. Select the correct option.

- Std deviation (std_f) and mean absolute deviation (s_f) are equally affected
- Mean absolute deviation (s_f) is more affected by the outlier
- Std deviation (std_f) is more affected by the outlier
- None of these

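**Answer: C**

The standard deviation squares each deviation from the mean, so the value 50 inflates it more than it inflates the mean absolute deviation.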
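
To check which statistic reacts more strongly to the outlier, one option is to compute both with and without the value 50 and compare the growth factors. This sketch assumes the population standard deviation (NumPy's default, `ddof=0`) and defines the mean absolute deviation as the average of |x - mean|; the helper name is hypothetical:

```python
import numpy as np

def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    values = np.asarray(values, dtype=float)
    return np.mean(np.abs(values - values.mean()))

f = np.array([1, 2, 3, 4, 5, 8, 50], dtype=float)
f_clean = f[:-1]  # the same measurements without the outlier 50

# Factor by which each statistic grows once the outlier is included.
std_ratio = np.std(f) / np.std(f_clean)
mad_ratio = mean_absolute_deviation(f) / mean_absolute_deviation(f_clean)
print(f"std grows by {std_ratio:.2f}x, mean abs. deviation by {mad_ratio:.2f}x")
```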

**Q4.** Which of the following is true for K-means clustering?

- It comes under the partitioning method
- The number of clusters is predefined for this method
- Cluster similarity is measured with respect to the mean value of the objects in a cluster
- All of the above

**Answer: D**

**Q5.** Which of the following can act as possible termination conditions in K-Means?

1. A fixed number of iterations is reached.
2. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. The Residual Sum of Squares (RSS) falls below a threshold.

- 1, 3 and 4
- 1, 2, 3 and 4
- 2 and 3
- None of these

**Answer: B**
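
The four termination conditions can be sketched in a small NumPy implementation. This is an illustrative version, not the course's code: the function name, the toy data, and the choice to initialise centroids from the first `k` rows (for reproducibility) are all assumptions.

```python
import numpy as np

def kmeans(X, k, max_iter=100, rss_threshold=0.0):
    """Minimal K-means returning (centroids, labels, stop_reason)."""
    centroids = X[:k].copy()   # simplification: first k rows as initial centroids
    labels = None
    for _ in range(max_iter):                          # condition 1: iteration cap
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            return centroids, labels, "assignments unchanged"   # condition 2
        labels = new_labels
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            return new_centroids, labels, "centroids unchanged"  # condition 3
        centroids = new_centroids
        rss = np.sum((X - centroids[labels]) ** 2)
        if rss < rss_threshold:
            return centroids, labels, "RSS below threshold"      # condition 4
    return centroids, labels, "max iterations reached"

# Hypothetical data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels, reason = kmeans(X, k=2)
print(labels, reason)
```

With the default `rss_threshold=0.0` the RSS check never fires; raising it lets the loop stop early at the cost of a coarser solution.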

**Q6.** In the figure below, if you draw a horizontal line at y = 2 on the y-axis, how many clusters are formed?

**Answer: B**
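
Drawing a horizontal line across a dendrogram corresponds to cutting the hierarchy at that height. The sketch below shows the idea with SciPy's `fcluster` on hypothetical 1-D data; these are not the points behind the quiz figure, so the count it prints says nothing about the quiz answer.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical 1-D observations (not the data from the quiz figure).
X = np.array([[0.0], [1.0], [5.0], [6.0], [20.0]])

# Build the hierarchy with single linkage (nearest-neighbour merging).
Z = linkage(X, method="single")

# Cut at height 2: observations joined at or below that height share a label.
labels = fcluster(Z, t=2.0, criterion="distance")
n_clusters = len(np.unique(labels))
print(labels, n_clusters)
```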

**Q7.** Which of the following clustering methods requires a merging approach?

**Answer: C**

**Q8.** State True or False: Hierarchical clustering should primarily be used for exploration.

**Answer: A**

**Q9.** State True or False: For finding dissimilarity between two clusters in hierarchical clustering, average-link is the only metric used.

**Answer: B**
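
Average-link is one of several linkage criteria. As a small illustration using SciPy's `linkage` on hypothetical points, the height of the final merge differs between single, complete and average linkage:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical data: two pairs of points, 4 units apart along the x-axis.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])

final_merge_height = {}
for method in ("single", "complete", "average"):
    Z = linkage(X, method=method)
    # Column 2 of the last row holds the distance at which the
    # two remaining clusters are merged.
    final_merge_height[method] = Z[-1, 2]

print(final_merge_height)
```

Single linkage uses the closest pair across the two clusters, complete linkage the farthest pair, and average linkage the mean over all cross-cluster pairs, so the three heights are ordered accordingly.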

**Q10.** Hierarchical clustering can be either an agglomerative or a divisive algorithm.

**Answer: A**