# Data Science for Engineers | Week 8

**Session: JULY-DEC 2023**

**Course Name: Data Science for Engineers**

**Course Link: Click Here**

**These are NPTEL Data Science for Engineers Assignment 8 Answers**

Consider the dataset **“USArrests.csv”**. Answer questions 1 to 4 based on the information given below:

This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percentage of the population living in urban areas.

- Set the column “States” as the index of the data frame while reading the data
- Set the random number generator seed using set.seed(123)
- Normalize the data using the scale function and build the K-means model with the given conditions:
  - number of clusters = 4
  - nstart = 20

**Q1. According to the built model, the within cluster sum of squares for each cluster is _ (the order of values in each option could be different):-**

8.316061 11.952463 16.212213 19.922437

7.453059 12.158682 13.212213 21.158766

8.316061 13.952463 15.212213 19.922437

None of the above

**Answer: 8.316061 11.952463 16.212213 19.922437**

**Q2. According to the built model, the size of each cluster is _ (the order of values in each option could be different):-**

13 13 7 14

11 18 25 24

8 13 16 13

None of the above

**Answer: 8 13 16 13**
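
A quick sanity check narrows the options before any model is run: each of the 50 states is assigned to exactly one cluster, so the cluster sizes must sum to 50. The sketch below only does that arithmetic; the dictionary keys are simply the option texts, used here as labels for illustration.

```python
# Cluster sizes from a k-means fit must add up to the number of observations.
N_STATES = 50  # USArrests has one row per US state

options = {
    "13 13 7 14": [13, 13, 7, 14],
    "11 18 25 24": [11, 18, 25, 24],
    "8 13 16 13": [8, 13, 16, 13],
}

# Keep only the options whose sizes are consistent with 50 observations.
feasible = [label for label, sizes in options.items() if sum(sizes) == N_STATES]
print(feasible)  # ['8 13 16 13']
```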

**Q3. The Between Cluster Sum-of-Squares (BCSS) value of the built K-means model is _ (Choose the appropriate range)**

100 – 200

200 – 300

300 – 350

None of the above

**Answer: 100 – 200**
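
The range can be cross-checked by hand, because the total sum of squares splits into a between-cluster part and a within-cluster part. The sketch below assumes the within-cluster values from the first option of Q1 and a total sum of squares of 196 (50 rows, 4 columns standardized with the sample standard deviation). Both are assumptions to confirm against the `betweenss` and `totss` fields of the fitted R model.

```python
# BCSS = TSS - sum of the within-cluster sums of squares.
withinss = [8.316061, 11.952463, 16.212213, 19.922437]  # assumed: first option of Q1
n_rows, n_cols = 50, 4

# Assumption: each standardized column contributes (n - 1) to the total.
tss = (n_rows - 1) * n_cols
tot_withinss = sum(withinss)
bcss = tss - tot_withinss

print(tss, round(tot_withinss, 6), round(bcss, 6))
```

Under these assumptions the between-cluster value comes out near 139.6, which falls in the 100 – 200 range.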

**Q4. The Total Sum-of-Squares value of the built k-means model is _ (Choose the appropriate range)**

100 – 200

200 – 300

300 – 350

None of the above

**Answer: 100 – 200**
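
The total sum of squares does not depend on the clustering at all, only on the scaled data. R's `scale` centres each column and divides it by its sample standard deviation, so each column's squared values sum to n - 1. With 50 rows and 4 columns that gives 49 × 4 = 196. The sketch below illustrates this on random toy data standing in for the real columns; the helper `standardize` and the toy values are hypothetical, not part of the assignment.

```python
import random
import statistics

random.seed(0)  # arbitrary seed for the toy data, unrelated to set.seed(123)
n_rows, n_cols = 50, 4
# Toy data standing in for the four USArrests columns.
data = [[random.uniform(0, 300) for _ in range(n_cols)] for _ in range(n_rows)]

def standardize(rows):
    """Centre each column and divide by its sample standard deviation (n - 1)."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

scaled = standardize(data)
# After centring, the total sum of squares is the sum of all squared entries.
tss = sum(x * x for row in scaled for x in row)
print(round(tss, 6))
```

Whatever the toy values are, the result should agree with (n - 1) × p up to floating-point error.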

**Q5. Which of the following statements is INCORRECT about the KNN algorithm?**

KNN works ONLY for binary classification problems

If k=1, then the algorithm is simply called the nearest neighbour algorithm

Number of neighbours (K) will influence classification output

None of the above

**Answer: KNN works ONLY for binary classification problems**
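
To see why the first statement is the incorrect one, here is a minimal KNN sketch on a hypothetical three-class toy set. It uses majority voting over Euclidean distances; the function name `knn_predict` and the data are made up for illustration. It also shows that changing k can change the predicted class.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    # train is a list of (point, label) pairs; distance is Euclidean.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical training set with three classes; one "B" point sits near class "A".
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.8, 1.1), "A"),
    ((2.0, 2.0), "B"), ((5.0, 5.0), "B"), ((5.2, 4.9), "B"),
    ((9.0, 1.0), "C"), ((9.1, 1.2), "C"), ((8.9, 0.9), "C"),
]

print(knn_predict(train, (9.0, 1.1), k=3))  # C: a third class, so not binary only
print(knn_predict(train, (1.8, 1.8), k=1))  # B: the single nearest neighbour
print(knn_predict(train, (1.8, 1.8), k=3))  # A: two of the three nearest are "A"
```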

**Q6. K means clustering algorithm clusters the data points based on:-**

Dependent and independent variables

The eigen values

Distance between the points and a cluster centre

None of the above

**Answer: Distance between the points and a cluster centre**
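
The distance-based assignment can be sketched as a small Lloyd-style loop: assign each point to its nearest centre, recompute the centres as cluster means, and repeat until the labels stop changing. This is an illustrative sketch only. R's `kmeans` uses a different default algorithm and random starts, and the function names, starting centres, and toy points below are hypothetical.

```python
import math

def assign(points, centres):
    """Index of the nearest centre (Euclidean distance) for each point."""
    return [min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points]

def update(points, labels, k):
    """Mean of the points assigned to each cluster (assumes no cluster is empty)."""
    centres = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        centres.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return centres

def kmeans(points, centres, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    labels = assign(points, centres)
    for _ in range(max_iter):
        centres = update(points, labels, len(centres))
        new_labels = assign(points, centres)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centres

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centres = kmeans(points, centres=[(0.0, 0.0), (10.0, 10.0)])
print(labels)   # [0, 0, 0, 1, 1, 1]
print(centres)
```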

**Q7. The method / metric which is NOT useful to determine the optimal number of clusters in unsupervised clustering algorithms is**

Scatter plot

Elbow method

Dendrogram

None of the above

**Answer: Scatter plot**

**Q8. The unsupervised learning algorithm which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest centroid is**

Hierarchical clustering

K-means clustering

KNN

None of the above

**Answer: K-means clustering**

**Consider the dataset “USArrests.csv”. Answer questions 1 to 4 based on the information given below:**

This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percentage of the population living in urban areas.

- Set the column “States” as the index of the data frame while reading the data
- Set the random number generator seed using set.seed(123)
- Normalize the data using the scale function and build the K-means model with the given conditions:
  - number of clusters = 4
  - nstart = 20

**Q1. According to the built model, the within cluster sum of squares for each cluster is _ (the order of values in each option could be different):-**

a. 8.316061 11.952463 16.212213 19.922437

b. 7.453059 12.158682 13.212213 21.158766

c. 8.316061 13.952463 15.212213 19.922437

d. None of the above

**Answer: a. 8.316061 11.952463 16.212213 19.922437**

**Q2. According to the built model, the size of each cluster is ___ (the order of values in each option could be different):-**

a. 13 13 7 14

b. 11 18 25 24

c. 8 13 16 13

d. None of the above

**Answer: c. 8 13 16 13**

**Q3. The Between Cluster Sum-of-Squares (BCSS) value of the built K-means model is ___ (Choose the appropriate range)**

a. 100 – 200

b. 200 – 300

c. 300 – 350

d. None of the above

**Answer: a. 100 – 200**

**Q4. The Total Sum-of-Squares value of the built k-means model is _____ (Choose the appropriate range)**

a. 100 – 200

b. 200 – 300

c. 300 – 350

d. None of the above

**Answer: a. 100 – 200**

**Q5. Which of the following statements is INCORRECT about the KNN algorithm?**

a. KNN works ONLY for binary classification problems

b. If k=1, then the algorithm is simply called the nearest neighbour algorithm

c. Number of neighbours (K) will influence classification output

d. None of the above

**Answer: a. KNN works ONLY for binary classification problems**

**Q6. K means clustering algorithm clusters the data points based on:-**

a. dependent and independent variables

b. the eigen values

c. distance between the points and a cluster centre

d. None of the above

**Answer: c. distance between the points and a cluster centre**

**Q7. The method / metric which is NOT useful to determine the optimal number of clusters in unsupervised clustering algorithms is**

a. Scatter plot

b. Elbow method

c. Dendrogram

d. None of the above

**Answer: a. Scatter plot**

**Q8. The unsupervised learning algorithm which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest centroid is**

a. Hierarchical clustering

b. K-means clustering

c. KNN

d. None of the above

**Answer: b. K-means clustering**

More NPTEL Solutions: https://progiez.com/nptel/

This content is uploaded for study, general information, and reference purpose only.