NPTEL Business Intelligence & Analytics Week 9 Assignment Answers 2025

1. Which of the following statements is NOT true about clustering algorithms?

  • K-medoids algorithm uses actual data points as cluster representatives, while K-modes algorithm employs modes to assess similarity in categorical data.
  • K-means algorithm calculates the mean of points within a cluster to determine the centroid, while K-modes algorithm utilizes modes to evaluate similarity in categorical data.
  • K-medoids algorithm is generally more robust to outliers and noise compared to K-means algorithm.
  • The K-means algorithm always produces better results than K-medoids for all types of datasets.
Answer :-

2. Which of the following statements is true about the agglomerative hierarchical clustering method?

  • Agglomerative hierarchical clustering follows a top-down approach, starting with all objects in one cluster and splitting them iteratively.
  • In agglomerative hierarchical clustering, each data point starts as an individual cluster, and clusters are merged iteratively.
  • The merging process in agglomerative clustering is random and does not depend on distance measures.
  • Agglomerative hierarchical clustering requires exactly n + 1 iterations to form the final clustering structure.
Answer :- 

3. Which hierarchical clustering method computes all pairwise dissimilarities between the observations in cluster A and the observations in cluster B, and records the smallest of these dissimilarities?

  • Single linkage
  • Average linkage
  • Complete linkage
  • Centroid linkage
Answer :- 
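
The linkage computation described in question 3 can be sketched in a few lines. This is only an illustration: the two clusters and the helper names `smallest_pairwise` and `largest_pairwise` are made up for the example, and Euclidean distance is assumed as the dissimilarity measure.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def smallest_pairwise(cluster_a, cluster_b):
    """Smallest dissimilarity over all pairs (one point from each cluster)."""
    return min(dist(p, q) for p, q in product(cluster_a, cluster_b))


def largest_pairwise(cluster_a, cluster_b):
    """Largest dissimilarity over all pairs, shown only for contrast."""
    return max(dist(p, q) for p, q in product(cluster_a, cluster_b))


# Hypothetical clusters, not taken from the assignment.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

smallest = smallest_pairwise(cluster_a, cluster_b)  # closest pair: (1,0) and (4,0)
largest = largest_pairwise(cluster_a, cluster_b)    # farthest pair: (0,0) and (6,0)
print(smallest, largest)
```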

4. A dendrogram in hierarchical clustering is a _____________ representation that shows how clusters are merged at different levels.

  • linear
  • tree-like
  • tabular
  • circular
Answer :- 

5. ___________ is an unsupervised learning algorithm that groups data points into clusters based on similarity.

  • Linear Regression
  • K-Means Clustering
  • Decision Tree
  • Logistic Regression
Answer :- 

6. A data scientist is working with a dataset where the number of fraudulent transactions is significantly lower than the number of legitimate transactions. Which technique would be most suitable to handle this class imbalance?

  • PCA
  • SMOTE
  • Decision Tree
  • t-SNE
Answer :-

7. In a 3-dimensional space represented by coordinates (x, y, z), two cluster centroids, A and B, have coordinates A(1, 5, 8) and B(7, 3, 2). Calculate the Euclidean distance between these centroids to determine their dissimilarity. Round your answer to two decimal places.

  • 8.72 units
  • 7.11 units
  • 8.54 units
  • 9.38 units
Answer :- 
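
The arithmetic in question 7 can be checked with a short script. It applies the standard Euclidean formula, the square root of the summed squared coordinate differences, to the two centroids given in the question; the variable names are illustrative only.

```python
from math import sqrt

# Centroid coordinates as given in question 7.
a = (1, 5, 8)
b = (7, 3, 2)

# Squared differences per coordinate: (1-7)^2, (5-3)^2, (8-2)^2.
squared_diffs = [(x - y) ** 2 for x, y in zip(a, b)]

distance = sqrt(sum(squared_diffs))
print(squared_diffs, round(distance, 2))
```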

8. The elbow method in K-means clustering is commonly used to:

  • Identify the convergence threshold
  • Optimize the starting centroids
  • Determine the ideal number of clusters
  • Choose the distance metric
Answer :- 
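
A rough sketch of how the elbow method in question 8 is usually applied: run K-means for several values of k, record the within-cluster sum of squares (WCSS), and look for the point where further increases in k stop helping much. Everything here is an assumption made for illustration: the 1-D toy data, the helper `kmeans_wcss_1d`, and the deterministic choice of starting centroids.

```python
def kmeans_wcss_1d(points, k, iters=20):
    """Run a basic 1-D K-means and return the within-cluster sum of squares."""
    pts = sorted(points)
    n = len(pts)
    # Assumed initialisation: evenly spaced picks from the sorted data.
    centroids = [pts[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return sum((p - centroids[j]) ** 2 for j, c in enumerate(clusters) for p in c)


# Hypothetical data with three well-separated groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]

wcss = {k: kmeans_wcss_1d(data, k) for k in range(1, 6)}
for k, value in wcss.items():
    print(k, round(value, 2))
```

On this toy data the WCSS falls sharply up to k = 3 and barely changes afterwards, which is the "elbow" one looks for in the plot.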

9. What will be the Manhattan distance for observation (8, 8) from cluster centroid C1 in the second iteration?

  • 12
  • 8
  • 10
  • 14
Answer :- 

10. A dendrogram is used in _____________ clustering to visualize the merging of clusters.

  • Hierarchical
  • K-Means
  • DBSCAN
  • Spectral
Answer :- 

11. ____________ are used to determine how distances between clusters are measured in hierarchical clustering.

  • Partitioning methods
  • Linkage measures
  • Cross-validation
  • Density measures
Answer :- 

12. Which of the following best describes the divisive hierarchical clustering method?

  • Probabilistic
  • Deterministic
  • Stochastic
  • Non-parametric
Answer :- 

13. Density-based clustering methods group data points based on density, requiring that each core point’s neighborhood within a specified radius contains at least a minimum number of points.

  • True
  • False
Answer :- 
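
The core-point condition stated in question 13 can be illustrated with a minimal check. The points, the radius `eps`, the threshold `min_pts`, and the helper `is_core_point` are all hypothetical; the sketch also assumes that a point counts as a member of its own neighborhood, which is a common convention but not the only one.

```python
from math import dist


def is_core_point(point, points, eps, min_pts):
    """True if at least min_pts points (including `point`) lie within eps."""
    neighbours = [q for q in points if dist(point, q) <= eps]
    return len(neighbours) >= min_pts


# Hypothetical data: a tight group of four points plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
eps = 1.5
min_pts = 4

core_flags = {p: is_core_point(p, points, eps, min_pts) for p in points}
print(core_flags)
```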

14. The clustering objective function seeks to achieve which of the following?

  • High similarity within clusters, high similarity between clusters
  • Low similarity within clusters, low similarity between clusters
  • High similarity within clusters, low similarity between clusters
  • Low similarity within clusters, high similarity between clusters
Answer :- 

15. The k-modes method is a variant of k-means that is specifically used for clustering:

  • Numerical data
  • Sequential data
  • Nominal data
  • Time-series data
Answer :-
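
To make the idea of a mode-based cluster representative concrete, here is a small sketch of the two ingredients k-modes relies on: the per-attribute mode of a cluster and a simple matching dissimilarity that counts mismatched attributes. The records and the helper names `cluster_mode` and `matching_dissimilarity` are invented for the example.

```python
from collections import Counter


def cluster_mode(records):
    """Most frequent value of each attribute across the cluster's records."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def matching_dissimilarity(a, b):
    """Number of attributes on which two records disagree."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical categorical records: (colour, size, shape).
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]

mode = cluster_mode(records)
d = matching_dissimilarity(("blue", "large", "round"), mode)
print(mode, d)
```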