NPTEL Business Intelligence & Analytics Week 10 Assignment Answers 2025
1. Which of the following describes a customer with low recency and low frequency but high monetary value?
- A new customer who made a single expensive purchase recently.
- A long-term customer who used to spend a lot but has stopped buying.
- A customer who frequently purchases but in small amounts.
- A customer who buys often and recently but spends little per transaction.
Answer :-
2. To avoid data leakage, we should apply ___________ on the training set and __________ on the test set using StandardScaler.
- fit() , fit_transform()
- fit_transform(), fit_transform()
- fit(), transform()
- transform(), fit()
Answer :-
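Note: a minimal sketch of the usual leakage-safe scaling pattern, assuming scikit-learn and NumPy are installed. The arrays `X_train` and `X_test` are made-up toy data, not part of the assignment. The scaler's statistics are learned from the training rows only and then reused on the test rows.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: one feature, already split into train and test
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[5.0], [6.0]])

scaler = StandardScaler()
# Learn mean and standard deviation from the training data, then scale it
X_train_scaled = scaler.fit_transform(X_train)
# Reuse the training statistics; nothing is re-estimated from the test data
X_test_scaled = scaler.transform(X_test)

print(scaler.mean_, scaler.scale_)
print(X_test_scaled.ravel())
```

Calling `fit()` and then `transform()` on the training set is equivalent to the single `fit_transform()` call used here.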
3. A customer prefers Android phones but decides to purchase an iPhone just because it offers the best features in the current market. This customer is an example of:
- True Friends
- Barnacles
- Butterflies
- Strangers
Answer :-
4. Which of the following best describes how RFM analysis contributes to predicting Customer Lifetime Value (CLV)?
- CLV is determined solely by Frequency, as frequent buyers generate the most revenue
- Only Recency and Frequency matter, as the amount spent is irrelevant in CLV prediction.
- RFM helps estimate CLV by assessing customer engagement, spending behavior, and likelihood of future purchases.
- Monetary Value alone is the best predictor of CLV, as higher spenders always have a higher lifetime value.
Answer :-
5. A company selling high-end luxury goods wants to optimize its customer segmentation strategy. They decide to assign higher weight to customers who spend the most money, as they contribute more revenue.
Which RFM factor is likely given the highest weight in their segmentation?
- Recency
- Frequency
- Monetary Value
- All three must be equal
Answer :-
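Note: a small sketch of a weighted RFM score, assuming pandas is installed. The customers, the 1 to 5 scores, and the weights in `weights` are hypothetical values chosen only to show how emphasising one factor changes the combined score.

```python
import pandas as pd

# Hypothetical RFM scores on a 1-5 scale (5 = best)
rfm = pd.DataFrame({
    "customer": ["A", "B", "C"],
    "R": [5, 2, 3],
    "F": [1, 4, 3],
    "M": [5, 1, 3],
})

# Assumed weights that emphasise spending; they sum to 1
weights = {"R": 0.2, "F": 0.2, "M": 0.6}

# Weighted sum of the three component scores
rfm["score"] = sum(rfm[col] * w for col, w in weights.items())

print(rfm.sort_values("score", ascending=False))
```

In practice the weights would be tuned to the business goal rather than fixed in advance.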
6. A data scientist applies K-Means clustering in Python to segment customers based on their purchase behavior. To understand the characteristics of each cluster, such as the average spending pattern of customers in each group, which attribute or method should they use?
- model.labels_
- model.cluster_centers_
- model.fit_predict(data)
- model.predict(data)
Answer :-
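Note: a short sketch of what a fitted scikit-learn `KMeans` model exposes, assuming scikit-learn and NumPy are installed. The two columns of `X` (spend, number of orders) are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers: [total spend, number of orders]
X = np.array([
    [10.0, 1.0], [12.0, 2.0], [11.0, 1.0],
    [200.0, 9.0], [210.0, 10.0], [190.0, 8.0],
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# One cluster index per row of X
print(model.labels_)
# One row per cluster: the mean of each feature over that cluster's members
print(model.cluster_centers_)
```

`labels_`, `predict()` and `fit_predict()` all return cluster assignments per row, while `cluster_centers_` holds the per-cluster feature means.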
7. Which of the following is true about the elbow method in K-Means clustering?
- The elbow point represents the highest number of clusters possible
- The inertia value increases as the number of clusters increases
- The elbow point is the cluster with the lowest variance
- The elbow point is where the rate of decrease in inertia slows significantly
Answer :-
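Note: a minimal sketch of how an elbow curve is usually produced, assuming scikit-learn is installed. The synthetic data from `make_blobs` (three well-separated groups) is an assumption made for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups (illustrative only)
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

# Fit K-Means for several values of k and record the inertia of each fit
inertias = []
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

for k, value in zip(range(1, 7), inertias):
    print(k, round(value, 1))
```

Plotting `inertias` against k gives the elbow curve; on data like this the drop in inertia is steep up to k = 3 and much flatter afterwards.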
8. Which of the following is NOT a common method for selecting the number of clusters in a dataset?
- Elbow Method
- Calinski-Harabasz Index
- Silhouette Coefficient
- Principal Component Analysis (PCA)
Answer :-
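Note: a sketch of scoring candidate cluster counts with two of the indices named above, assuming scikit-learn is installed. The synthetic three-group data is an assumption for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score, silhouette_score

# Synthetic data with three well-separated groups (illustrative only)
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

silhouette = {}
calinski = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    silhouette[k] = silhouette_score(X, labels)      # higher is better
    calinski[k] = calinski_harabasz_score(X, labels)  # higher is better

# Pick the k with the highest silhouette coefficient
best_k = max(silhouette, key=silhouette.get)
print(best_k, silhouette, calinski)
```

Both scores need at least two clusters, which is why the loop starts at k = 2.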
9. Which of the following is the best approach when choosing weights for RFM scoring?
- Using a predefined formula without adjusting for business goals
- Assigning weights based on what maximizes customer retention and revenue
- Keeping all three factors equal regardless of business context
- Ignoring Monetary (M) since spending doesn’t predict future behavior
Answer :-
10. How does NTILE() differ from RANK() in SQL?
- NTILE() assigns equal-sized groups, while RANK() assigns unique ranks based on values.
- Both functions perform the same role in SQL.
- NTILE() ranks values based on total sum, while RANK() divides data into quantiles.
- NTILE() is only used for Monetary (M) in RFM analysis.
Answer :-
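Note: a small sketch comparing the two window functions, run through Python's built-in `sqlite3` module. It assumes the bundled SQLite is version 3.25 or newer (window function support); the `spend` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spend (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO spend VALUES (?, ?)",
    [("A", 100), ("B", 100), ("C", 80), ("D", 50)],
)

# NTILE(2) splits the ordered rows into two buckets of (near) equal size;
# RANK() numbers rows by value, giving tied values the same rank
rows = conn.execute(
    """
    SELECT customer, amount,
           NTILE(2) OVER (ORDER BY amount DESC) AS bucket,
           RANK()   OVER (ORDER BY amount DESC) AS rnk
    FROM spend
    ORDER BY amount DESC, customer
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

Note that `RANK()` gives tied rows the same rank and then skips the following rank, so its output is not strictly unique per row.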
11. Which of the following clustering algorithms is better suited than K-Means for handling noise and outliers?
- K-Means
- DBSCAN
- K-Medoids
- Both b and c
Answer :-
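Note: a minimal sketch of how scikit-learn's `DBSCAN` treats an isolated point, assuming scikit-learn and NumPy are installed. The points in `X` and the `eps` and `min_samples` values are chosen by hand for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two tight groups plus one far-away point (hypothetical data)
X = np.array([
    [1.0, 1.0], [1.1, 1.0], [0.9, 1.1],
    [5.0, 5.0], [5.1, 5.0], [4.9, 5.1],
    [20.0, 20.0],
])

# Points without enough neighbours within eps are labelled -1 (noise)
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
print(labels)
```

K-Means, by contrast, assigns every point to some cluster, so an outlier like the last row would pull a centroid toward it.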
12. K-Means clustering minimizes _________ to improve cluster assignment accuracy.
- Silhouette score
- Sum of Squared Errors (SSE)
- Euclidean distance between all points
- Inter-cluster distance
Answer :-
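Note: a sketch that recomputes the within-cluster sum of squared distances by hand and compares it with the `inertia_` attribute of a fitted scikit-learn `KMeans` model. It assumes scikit-learn and NumPy are installed; the data in `X` is invented.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical two-feature data
X = np.array([
    [1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
    [8.0, 8.0], [8.5, 7.5], [9.0, 8.2],
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Sum over clusters of squared distances from each member to its centroid
sse = sum(
    np.sum((X[model.labels_ == k] - center) ** 2)
    for k, center in enumerate(model.cluster_centers_)
)

print(sse, model.inertia_)
```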
13. Which of the following correctly counts the number of missing values in a Pandas DataFrame or Series?
- data.isnull().sum()
- np.count_nonzero(pd.isnull(data))
- Both (a) and (b)
- None of the above
Answer :-
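Note: a quick sketch of both expressions on toy data, assuming pandas and NumPy are installed. The Series `data` and DataFrame `df` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with two missing values
data = pd.Series([1.0, np.nan, 3.0, None, 5.0])

count_a = data.isnull().sum()
count_b = np.count_nonzero(pd.isnull(data))
print(count_a, count_b)

# On a DataFrame, isnull().sum() reports one count per column,
# while np.count_nonzero gives a single total over all cells
df = pd.DataFrame({"x": [1.0, np.nan], "y": [np.nan, np.nan]})
per_column = df.isnull().sum()
total = np.count_nonzero(pd.isnull(df))
print(per_column.to_dict(), total)
```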
14. In K-Means clustering, what does the n_init parameter control?
- The number of times the algorithm is run with different initial centroids.
- The number of clusters generated in each iteration.
- The number of iterations before stopping the algorithm.
- The number of data points assigned to each cluster.
Answer :-
15. StandardScaler transforms data by subtracting the _________ and dividing by the ___________, resulting in a transformed dataset with a mean of 0 and a standard deviation of 1.
- Minimum, Maximum
- Mean, Standard Deviation
- Median, Interquartile Range
- Variance, Range
Answer :-
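Note: a small sketch that reproduces the `StandardScaler` output with plain NumPy arithmetic, assuming scikit-learn and NumPy are installed. The array `X` is hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical single-feature data
X = np.array([[2.0], [4.0], [6.0], [8.0]])

# Manual standardisation: centre each column, then divide by its
# population standard deviation (ddof=0, NumPy's default)
manual = (X - X.mean(axis=0)) / X.std(axis=0)

scaled = StandardScaler().fit_transform(X)

print(np.allclose(manual, scaled))
print(scaled.mean(axis=0), scaled.std(axis=0))
```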