NPTEL Business Intelligence & Analytics Week 8 Assignment Answers 2025

1. In a RandomForestClassifier, what does setting n_estimators=100 mean?

  • The number of samples used for each tree
  • The number of trees in the forest
  • The number of features considered at each split
  • The maximum depth of each tree
Answer :- 
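
A minimal sketch of what n_estimators controls, assuming scikit-learn is installed. The synthetic dataset, its size, and the seed are illustrative assumptions and are not part of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; sizes and seed are arbitrary choices for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X, y)

# estimators_ holds the fitted decision trees, one per estimator.
n_trees = len(rf.estimators_)
print(n_trees)
```

Sample count, features per split, and tree depth are governed by other parameters (max_samples, max_features, max_depth).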

2. Imagine you have a decision tree that perfectly fits your training data but performs poorly on test data. What technique can help address this issue?

  • Increasing the tree depth.
  • Growing additional branches on the tree
  • Adding more features to the dataset.
  • Applying pruning to avoid overfitting.
Answer :- 

3. If you want to visualize the decision-making process of a decision tree x in scikit-learn, which function would you use?

  • tree.plot_tree(x)
  • tree.evaluate_tree(x)
  • tree.train_tree(x)
  • tree.prune_tree(x)
Answer :- 
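
For reference, a short sketch of drawing a fitted tree with scikit-learn. It assumes matplotlib is available; the synthetic data, the shallow max_depth, and the output filename are illustrative assumptions. The variable name x follows the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import make_classification

# Synthetic data and a shallow tree, chosen only to keep the figure small.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
x = tree.DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

fig, ax = plt.subplots(figsize=(8, 5))
annotations = tree.plot_tree(x, ax=ax, filled=True)  # draws one box per node
fig.savefig("tree.png")  # hypothetical output path
```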

4. Using scikit-learn, you generate a classification report to evaluate a model predicting exam grade categories. Which of the following metrics is NOT included in the report?

  • Precision
  • Recall
  • F1-score
  • Mean Squared Error
Answer :- 
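
A quick way to check what the report contains is to print one. The grade labels below are made-up values for illustration only.

```python
from sklearn.metrics import classification_report

# Hypothetical true and predicted grade categories.
y_true = ["A", "B", "B", "C", "A", "C"]
y_pred = ["A", "B", "C", "C", "A", "B"]

report = classification_report(y_true, y_pred)
print(report)  # columns: precision, recall, f1-score, support
```

Mean Squared Error is a regression metric and is computed separately, for example with sklearn.metrics.mean_squared_error.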

5. In a Random Forest classifier, what function does the Gini criterion serve?

  • To measure the prediction accuracy of the forest.
  • To identify the most important features for classification.
  • To determine the best feature for splitting at each node in a decision tree.
  • To calculate the purity of the final node.
Answer :- 
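
To make the Gini criterion concrete, here is a small hand-rolled sketch (not scikit-learn's internal implementation) that scores two candidate splits by weighted Gini impurity. The helper names gini and weighted_gini and the pass/fail labels are assumptions for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

parent = ["pass"] * 5 + ["fail"] * 5
split_a = (["pass"] * 5, ["fail"] * 5)                                   # separates the classes
split_b = (["pass"] * 3 + ["fail"] * 2, ["pass"] * 2 + ["fail"] * 3)     # barely helps

print(gini(parent))              # 0.5
print(weighted_gini(*split_a))   # 0.0
print(weighted_gini(*split_b))
```

A tree grown with the Gini criterion prefers the candidate split with the lowest weighted impurity, here split_a.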

6. What action does rf.fit(X_train, Y_train) perform when working with a Random Forest classifier in scikit-learn?

  • It selects the best features for the training data
  • It trains the Random Forest model using the provided training data
  • It predicts the target values for a new set of inputs.
  • It calculates the training data accuracy.
Answer :- 
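
A brief sketch separating fit from the prediction and scoring steps. The dataset, split ratio, and seeds are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
fitted = rf.fit(X_train, Y_train)    # trains the forest and returns the estimator itself

preds = rf.predict(X_test)           # prediction is a separate call
accuracy = rf.score(X_test, Y_test)  # so is computing accuracy
```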

7. You are using the make_classification function from scikit-learn to generate a dataset for predicting whether graduate level students will pass or fail a course. The dataset includes features such as hours of study, previous grades, and mental health factors. What does the n_samples parameter control in this function?

  • The number of features (e.g., study hours, grades, mental health factors).
  • The number of classes (e.g., Pass/Fail) in the dataset.
  • The number of data points (students) to be generated, including all their characteristics.
  • The degree of noise in the dataset, such as random errors in student data.
Answer :- 
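
The effect of each parameter shows up in the shapes of the returned arrays. The specific values below (500 students, 6 features) are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=500,     # rows: one per generated student
    n_features=6,      # columns: one per characteristic
    n_informative=3,
    n_classes=2,       # e.g. pass / fail
    random_state=42,
)

print(X.shape)  # (500, 6)
print(y.shape)  # (500,)
```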

8. What function does ccp_alpha serve in decision tree pruning in scikit-learn?

  • Defines the minimum number of leaf nodes required.
  • Controls the number of samples required to split a node
  • Sets the threshold that helps decide which nodes to prune based on cost complexity.
  • Determines the maximum depth of the tree.
Answer :- 
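
A small sketch of cost-complexity pruning with ccp_alpha, comparing an unpruned tree with a pruned one. The noisy synthetic data and the value 0.02 are arbitrary assumptions; in practice the value is usually tuned, for example with cost_complexity_pruning_path and cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# flip_y adds label noise so the unpruned tree grows large.
X, y = make_classification(n_samples=300, n_features=10, flip_y=0.1, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X, y)                    # ccp_alpha=0.0, no pruning
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)  # larger alpha prunes more

full_nodes = full.tree_.node_count
pruned_nodes = pruned.tree_.node_count
print(full_nodes, pruned_nodes)
```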

9. When using the roc_curve function from sklearn.metrics, which of the following statements is true?

  • The roc_curve function computes the precision-recall curve for binary classification models.
  • The roc_curve function requires predicted probabilities or decision function scores as inputs, not just class labels.
  • The roc_curve function can be used to evaluate multi-class classification models without modifications.
  • The roc_curve function returns the Receiver Operating Characteristic (ROC) curve plot by default.
Answer :- 
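
A minimal sketch of calling roc_curve on a toy binary problem. The labels and scores are made-up values; note that the function returns arrays and does not draw anything itself.

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])  # predicted probabilities or decision scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(fpr)  # false positive rate at each threshold
print(tpr)  # true positive rate at each threshold
```

Plotting is a separate step (for example with matplotlib or RocCurveDisplay), and multi-class problems need a one-vs-rest or similar reduction first.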

10. In NumPy, what is the output of the np.shape function when applied to an array?

  • The data type of the array
  • A tuple representing the size of each dimension of the array.
  • The number of dimensions of the array.
  • The total number of elements in the array
Answer :- 
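
The options in this question map onto different NumPy calls, which a short example can tell apart. The 3x4 array is an arbitrary choice.

```python
import numpy as np

arr = np.zeros((3, 4))

shape = np.shape(arr)  # (3, 4): size of each dimension
ndim = np.ndim(arr)    # 2: number of dimensions
size = np.size(arr)    # 12: total number of elements
dtype = arr.dtype      # float64: data type of the elements

print(shape, ndim, size, dtype)
```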

11. Which of the following best describes the difference between classification and regression trees in the CART algorithm?

  • Classification trees predict continuous variables, while regression trees predict categorical variables.
  • Regression trees predict categorical variables, while classification trees predict continuous variables.
  • Classification trees predict categorical variables, while regression trees predict continuous variables.
  • Regression trees are computationally expensive, while classification trees are computationally inexpensive.
Answer :- 
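
The two CART variants can be compared side by side in scikit-learn. The "hours of study" feature and both targets below are synthetic assumptions made up for this sketch.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))                   # hypothetical hours of study
y_class = (X[:, 0] > 5).astype(int)                     # categorical target: 0 = fail, 1 = pass
y_score = 10.0 * X[:, 0] + rng.normal(0, 1, size=100)   # continuous target: exam score

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_class)
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y_score)

class_preds = clf.predict(X[:5])  # class labels
value_preds = reg.predict(X[:5])  # real-valued estimates
print(class_preds, value_preds)
```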

12. What is the role of entropy in decision trees?

  • It measures the accuracy of the tree’s predictions
  • It determines the optimal number of splits in the tree
  • It quantifies the disorder or impurity in a node
  • It calculates the variance of the data in each node
Answer :- 
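
As a worked illustration, entropy can be computed directly from the class proportions in a node. The helper name entropy and the spam/ham labels are assumptions for this sketch.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the classes present."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

pure = ["spam"] * 4
mixed = ["spam", "spam", "ham", "ham"]

print(entropy(pure))   # 0.0
print(entropy(mixed))  # 1.0
```

A pure node has zero entropy, and an evenly mixed two-class node reaches the maximum of 1 bit.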

13. You are creating a model to identify spam emails that you receive in your college mail ID. Which of the following defines a False Positive (FP) in this context?

  • An email correctly identified as not spam
  • An email predicted as not spam that is actually spam
  • An email predicted as spam that is actually not spam
  • An email correctly identified as spam
Answer :- 
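
The four outcomes can be read off a confusion matrix. In this sketch spam is treated as the positive class (1), and the labels are made-up values.

```python
from sklearn.metrics import confusion_matrix

# 1 = spam (positive class), 0 = not spam (negative class)
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 1]

# With labels=[0, 1], ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```

Here fp counts legitimate emails flagged as spam, and fn counts spam emails that slipped through.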

14. If the true positive value is 20 and the false negative value is 5, what is the recall score for the classification model?

  • 0.8
  • 0.9
  • 0.7
  • None of the above
Answer :- 
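
Recall is defined as TP / (TP + FN), so the arithmetic for the values in the question can be checked directly:

```python
tp, fn = 20, 5  # values given in the question

recall = tp / (tp + fn)  # 20 / 25
print(recall)  # 0.8
```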

15. Which method is used to evaluate a model by splitting the data into multiple subsets?

  • Gradient Descent
  • Cross-validation
  • Principal Component Analysis
  • Regularization
Answer :- 
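
A minimal k-fold sketch with scikit-learn, assuming 5 folds. The model, dataset, and seeds are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0)

# The data is split into 5 folds; each fold is held out once for evaluation.
scores = cross_val_score(model, X, y, cv=5)
print(scores, scores.mean())
```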