NPTEL Data Analytics with Python Week 11 Assignment Answers 2025


1. Which library is used to calculate distance measures for clustering in Python?

  • distance_matrix
  • scipy.spatial
  • scipy_spatial
  • distance.matrix
Answer :-
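
As a rough illustration of computing pairwise distances before clustering, the sketch below uses `distance_matrix` from `scipy.spatial`. The three sample points are made up for this example and are not part of the assignment.

```python
import numpy as np
from scipy.spatial import distance_matrix

# Hypothetical 2-D observations, chosen so the distances are easy to check by hand.
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Pairwise Euclidean distances (p=2 is the default Minkowski order).
dist = distance_matrix(points, points)
print(dist)
```

The result is a symmetric matrix with zeros on the diagonal; entry (i, j) is the distance between observation i and observation j.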

2. Which formula gives the dissimilarity between two objects i and j described by categorical variables? Here p is the total number of categorical variables and m is the number of matches.

  • D(i, j) = (p - m) / p
  • D(i, j) = (p - m) / m
  • D(i, j) = (m - p) / p
  • D(i, j) = (m - p) / m
Answer :- 
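
To make the roles of p and m concrete, here is a minimal sketch of the simple matching dissimilarity, which divides the number of mismatches by the total number of variables. The function name and the two sample objects are hypothetical.

```python
def categorical_dissimilarity(obj_i, obj_j):
    """Share of categorical variables on which two objects differ."""
    if len(obj_i) != len(obj_j):
        raise ValueError("both objects must have the same number of variables")
    p = len(obj_i)                                  # total number of variables
    m = sum(a == b for a, b in zip(obj_i, obj_j))   # number of matches
    return (p - m) / p

# Hypothetical objects described by colour, size and shape.
d = categorical_dissimilarity(("red", "small", "round"), ("red", "large", "round"))
print(d)
```

Here p = 3 and m = 2, so the two objects differ on one variable out of three.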

3. A data set with 7 objects has the following measurements for an interval-scaled variable f, one of which is an outlying value:

f = (1, 2, 3, 4, 5, 8, 50)

Select the correct option.

  • The standard deviation (std_f) and the mean absolute deviation (s_f) are affected equally by the outlier.
  • The mean absolute deviation (s_f) is more affected by the outlier.
  • The standard deviation (std_f) is less affected by the outlier.
  • The standard deviation (std_f) is more affected by the outlier.
Answer :- 
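
One way to explore this question is to compute both measures with and without the outlying value and compare how much each one grows. The sketch below assumes the population form of the standard deviation (dividing by n); the helper name `spread_measures` is made up for this example.

```python
import numpy as np

def spread_measures(values):
    """Return (mean absolute deviation, population standard deviation)."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    s = np.abs(values - mean).mean()                # deviations are not squared
    std = np.sqrt(((values - mean) ** 2).mean())    # deviations are squared
    return s, std

f = [1, 2, 3, 4, 5, 8, 50]
s_f, std_f = spread_measures(f)
s_clean, std_clean = spread_measures(f[:-1])        # same data without the value 50

# Growth factor of each measure once the outlier is included.
s_growth = s_f / s_clean
std_growth = std_f / std_clean
print(s_f, std_f, s_growth, std_growth)
```

Because the standard deviation squares each deviation before averaging, a single large deviation carries more weight in it than in the mean absolute deviation.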

4. Select the correct statement about standardization:

  • Standardizing the data always gives inefficient results when forming clusters.
  • Standardizing the data is always beneficial in cluster analysis.
  • Variables whose absolute values matter may not be effective after standardization in cluster analysis.
  • Outliers cannot be detected in standardized data.
Answer :- 
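
For reference, a minimal sketch of one common form of standardization, the z-score, is shown below. The question does not say which form the course uses, so treat this as an assumption; the two-column data set is made up.

```python
import numpy as np

# Hypothetical data: the second column is on a much larger scale than the first.
X = np.array([[1.0, 1000.0],
              [2.0, 3000.0],
              [3.0, 2000.0],
              [4.0, 6000.0]])

# z-score: subtract each column's mean and divide by its standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
print(Z)
```

After this step both columns have mean 0 and standard deviation 1, so neither dominates a Euclidean distance purely because of its units. The original magnitudes are no longer visible in Z.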

5. Which of the following can act as possible termination conditions in K-Means?

1. A fixed number of iterations is reached.
2. The assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. The RSS (residual sum of squares) falls below a threshold.

  • 1, 3 and 4
  • 1, 2, 3 and 4
  • 2 and 3
  • None of these
Answer :-
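
The stopping rules listed in this question can be sketched as checks inside a basic K-Means loop. This is an illustrative implementation under stated assumptions, not the course's reference code: the function name, the default tolerances and the sample data are all hypothetical.

```python
import numpy as np

def kmeans_with_stops(X, k, max_iter=100, tol=1e-8, rss_threshold=0.0, seed=0):
    """Basic K-Means that reports which stopping rule ended the run."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(max_iter):                       # rule 1: fixed iteration budget
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for c in range(k):
            members = X[new_labels == c]
            if len(members) > 0:                    # keep the old centroid if a cluster is empty
                new_centroids[c] = members.mean(axis=0)
        rss = float(((X - new_centroids[new_labels]) ** 2).sum())
        assignments_stable = np.array_equal(new_labels, labels)
        centroids_stable = np.linalg.norm(new_centroids - centroids) <= tol
        labels, centroids = new_labels, new_centroids
        if assignments_stable:                      # rule 2: assignments did not change
            return labels, centroids, "assignments unchanged"
        if centroids_stable:                        # rule 3: centroids did not move
            return labels, centroids, "centroids unchanged"
        if rss <= rss_threshold:                    # rule 4: RSS small enough
            return labels, centroids, "RSS below threshold"
    return labels, centroids, "iteration limit reached"

# Hypothetical data: two well-separated groups of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids, reason = kmeans_with_stops(X, k=2)
print(labels, reason)
```

Which rule fires first depends on the data, the initial centroids and the thresholds, so practical implementations often combine several of them.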

6. In the figure linked below, if you draw a horizontal line at y = 2, how many clusters are formed?
https://drive.google.com/file/d/1pZaKZa6CDK-Hzn0Iar6LTWPA_zlpStW7/view?usp=sharing

  • 1
  • 2
  • 3
  • 4
Answer :- 

7. Which of the following clustering methods requires a merging approach?

  • Partitional
  • Naive Bayes
  • Hierarchical
  • None of the above
Answer :- 

8. State True or False: Hierarchical clustering should primarily be used for data exploration

  • True
  • False
Answer :- 

9. State True or False: For finding dissimilarity between two clusters in hierarchical clustering, average-link is the only metric used

  • True
  • False
Answer :- 
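
The hierarchical clustering questions above can be explored with `scipy.cluster.hierarchy`. The sketch below builds a merge tree with three different linkage methods and then cuts each tree at a height of 2, which corresponds to drawing a horizontal line across a dendrogram. The five one-dimensional points are made up and are unrelated to the linked figure.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical 1-D observations: two tight pairs and one isolated point.
points = np.array([[0.0], [1.0], [5.0], [6.0], [20.0]])

cluster_counts = {}
top_heights = {}
for method in ("single", "complete", "average"):
    merges = linkage(points, method=method)         # each row records one merge
    flat = fcluster(merges, t=2.0, criterion="distance")  # cut the tree at height 2
    cluster_counts[method] = len(set(flat))
    top_heights[method] = float(merges[-1, 2])      # height of the final merge

print(cluster_counts)
print(top_heights)
```

All three methods merge the closest items step by step, but they measure the distance between clusters differently, so the heights of the later merges differ.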

10. Two variables, V1 and V2, are used for clustering. Which of the following are true for K-Means clustering with k = 3?

1. If V1 and V2 have a correlation of 1, the cluster centroids will lie on a straight line.
2. If V1 and V2 have a correlation of 0, the cluster centroids will lie on a straight line.

  • 1 only
  • 2 only
  • 1 and 2
  • None of the above
Answer :-
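
The first statement can be checked numerically. A correlation of 1 means every observation lies on one straight line, and each centroid is a mean of such observations. The sketch below uses `kmeans2` from `scipy.cluster.vq` on made-up data where V2 = 2 * V1 + 1; the slope, intercept and sample size are assumptions chosen only for illustration. It does not address the second statement.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Hypothetical perfectly correlated variables: V2 is an exact linear function of V1.
v1 = np.linspace(0.0, 10.0, 30)
v2 = 2.0 * v1 + 1.0
data = np.column_stack([v1, v2])

# K-Means with k = 3, starting from three randomly chosen observations.
centroids, labels = kmeans2(data, 3, minit="points")

# Check whether every centroid satisfies the same linear relation as the data.
on_line = np.allclose(centroids[:, 1], 2.0 * centroids[:, 0] + 1.0)
print(centroids, on_line)
```

The initial centroids are picked at random, so the exact centroid values can vary between runs, while the collinearity check does not depend on the starting points.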