Business Intelligence and Analytics Nptel Week 8 Answers

Are you looking for NPTEL Business Intelligence and Analytics Week 8 answers? You are in the right place for the assignment answers.

Nptel Business Intelligence and Analytics Week 8 Answers (Jan-Apr 2025)

In a RandomForestClassifier, what does setting n_estimators=100 mean?
a) The number of samples used for each tree
b) The number of trees in the forest
c) The number of features considered at each split
d) The maximum depth of each tree
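
To see what n_estimators controls, the forest can be fitted and its trees counted. This is a minimal sketch, assuming scikit-learn is installed; the synthetic dataset, its size, and random_state are arbitrary choices made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; sizes and random_state are assumptions for this sketch.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X, y)

# estimators_ holds the individual fitted decision trees of the forest.
n_trees = len(rf.estimators_)
print(n_trees)
```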

Imagine you have a decision tree that perfectly fits your training data but performs poorly on test data. What technique can help address this issue?
a) Increasing the tree depth.
b) Growing additional branches on the tree
c) Adding more features to the dataset.
d) Applying pruning to avoid overfitting.

If you want to visualize the decision-making process of a decision tree x in scikit-learn, which function would you use?
a) tree.plot_tree(x)
b) tree.evaluate_tree(x)
c) tree.train_tree(x)
d) tree.prune_tree(x)
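
As a rough illustration of drawing a fitted tree, the sketch below assumes scikit-learn and matplotlib are installed. The iris dataset, max_depth=2, and the non-interactive "Agg" backend are choices made only for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_iris

iris = load_iris()

# x is the fitted decision tree, matching the name used in the question.
x = tree.DecisionTreeClassifier(max_depth=2, random_state=0)
x.fit(iris.data, iris.target)

fig, ax = plt.subplots(figsize=(8, 5))
# plot_tree draws one box per node and returns the drawn annotations.
annotations = tree.plot_tree(x, ax=ax, feature_names=iris.feature_names, filled=True)
```

In an interactive session, plt.show() would display the figure.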

Using scikit-learn, you generate a classification report to evaluate a model predicting exam grade categories. Which of the following metrics is NOT included in the report?
a) Precision
b) Recall
c) F1-score
d) Mean Squared Error
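
The contents of the report can be inspected directly. This is a small sketch, assuming scikit-learn is installed; the grade labels and predictions are made up for illustration.

```python
from sklearn.metrics import classification_report

# Hypothetical grade categories: true labels vs. model predictions.
y_true = ["A", "B", "A", "C", "B", "A"]
y_pred = ["A", "B", "B", "C", "B", "A"]

# Text form, as usually printed after evaluating a classifier.
print(classification_report(y_true, y_pred))

# Dictionary form, convenient for checking which metrics are reported per class.
report = classification_report(y_true, y_pred, output_dict=True)
per_class_metrics = sorted(report["A"].keys())
print(per_class_metrics)
```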

In a Random Forest classifier, what function does the Gini criterion serve?
a) To measure the prediction accuracy of the forest.
b) To identify the most important features for classification.
c) To determine the best feature for splitting at each node in a decision tree.
d) To calculate the purity of the final node.
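
Gini impurity and its use for comparing candidate splits can be sketched in plain Python. The toy rows, feature meanings, and thresholds below are hypothetical, and the code illustrates the idea rather than scikit-learn's actual implementation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Hypothetical toy data: (hours_studied, attended_lectures) with a pass/fail label.
rows = [(1, 0), (2, 1), (3, 0), (6, 1), (7, 0), (8, 1)]
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]

def split_score(feature_index, threshold):
    left = [lab for row, lab in zip(rows, labels) if row[feature_index] <= threshold]
    right = [lab for row, lab in zip(rows, labels) if row[feature_index] > threshold]
    return weighted_gini(left, right)

# The candidate with the lowest weighted impurity is preferred at this node.
candidates = {
    "hours <= 4": split_score(0, 4),
    "attended <= 0": split_score(1, 0),
}
best_split = min(candidates, key=candidates.get)
print(candidates, best_split)
```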

What action does rf.fit(X_train, Y_train) perform when working with a Random Forest classifier in scikit-learn?
a) It selects the best features for the training data
b) It trains the Random Forest model using the provided training data
c) It predicts the target values for a new set of inputs.
d) It calculates the training data accuracy.

You are using the make_classification function from scikit-learn to generate a dataset for predicting whether graduate-level students will pass or fail a course. The dataset includes features such as hours of study, previous grades, and mental health factors. What does the n_samples parameter control in this function?
a) The number of features (e.g., study hours, grades, mental health factors).
b) The number of classes (e.g., Pass/Fail) in the dataset.
c) The number of data points (students) to be generated, including all their characteristics.
d) The degree of noise in the dataset, such as random errors in student data.
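
The effect of n_samples shows up in the shape of the generated arrays. This is a minimal sketch, assuming scikit-learn is installed; the specific values (200 students, 5 features) are arbitrary, and the generated columns are anonymous numeric features rather than real student attributes.

```python
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=200,     # number of rows (students) generated
    n_features=5,      # number of columns per row
    n_informative=3,
    n_redundant=1,
    n_classes=2,       # e.g. pass / fail
    random_state=0,
)

print(X.shape)  # rows x features
print(y.shape)  # one label per row
```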

What function does ccp_alpha serve in decision tree pruning in scikit-learn?
a) Defines the minimum number of leaf nodes required.
b) Controls the number of samples required to split a node.
c) Sets the threshold that helps decide which nodes to prune based on cost complexity.
d) Determines the maximum depth of the tree.
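
One way to observe the effect of ccp_alpha is to compare an unpruned tree with a pruned one. The sketch below assumes scikit-learn is installed; the synthetic noisy dataset and the value 0.02 are illustrative assumptions, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# flip_y adds label noise so the unpruned tree has something to overfit.
X, y = make_classification(n_samples=300, n_features=8, flip_y=0.1, random_state=0)

# ccp_alpha=0.0 (the default) performs no cost-complexity pruning.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# A larger ccp_alpha prunes more aggressively.
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)

full_nodes = full.tree_.node_count
pruned_nodes = pruned.tree_.node_count
print(full_nodes, pruned_nodes)
```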

When using the roc_curve function from sklearn.metrics, which of the following statements is true?
a) The roc_curve function computes the precision-recall curve for binary classification models.
b) The roc_curve function requires predicted probabilities or decision function scores as inputs, not just class labels.
c) The roc_curve function can be used to evaluate multi-class classification models without modifications.
d) The roc_curve function returns the Receiver Operating Characteristic (ROC) curve plot by default.
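
The inputs and outputs of roc_curve can be checked on a tiny example. This sketch assumes scikit-learn is installed; the four labels and scores are made-up values.

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Binary ground truth and continuous scores (e.g. predicted probabilities).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# roc_curve returns three arrays; drawing the curve is a separate step.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc_value = roc_auc_score(y_true, y_score)

print(fpr, tpr, thresholds)
print(auc_value)
```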

In NumPy, what is the output of the np.shape function when applied to an array?
a) The data type of the array
b) A tuple representing the size of each dimension of the array.
c) The number of dimensions of the array.
d) The total number of elements in the array.
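
A short sketch, assuming NumPy is installed, that contrasts np.shape with the related attributes named in the options. The 3 x 4 array is an arbitrary example.

```python
import numpy as np

arr = np.zeros((3, 4))

shape = np.shape(arr)  # size of each dimension
ndim = arr.ndim        # number of dimensions
size = arr.size        # total number of elements
dtype = arr.dtype      # data type of the elements

print(shape, ndim, size, dtype)
```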

Which of the following best describes the difference between classification and regression trees in the CART algorithm?
a) Classification trees predict continuous variables, while regression trees predict categorical variables.
b) Regression trees predict categorical variables, while classification trees predict continuous variables.
c) Classification trees predict categorical variables, while regression trees predict continuous variables.
d) Regression trees are computationally expensive, while classification trees are computationally inexpensive.

What is the role of entropy in decision trees?
a) It measures the accuracy of the tree’s predictions.
b) It determines the optimal number of splits in the tree.
c) It quantifies the disorder or impurity in a node.
d) It calculates the variance of the data in each node.
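
Entropy for the labels in a node can be computed in a few lines of plain Python. The label lists below are hypothetical; entropy is measured in bits (log base 2).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the class labels in a node."""
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probs)

pure_node = ["pass", "pass", "pass", "pass"]
mixed_node = ["pass", "pass", "fail", "fail"]

pure_entropy = entropy(pure_node)    # a single class: no disorder
mixed_entropy = entropy(mixed_node)  # a 50/50 mix: maximal disorder for two classes
print(pure_entropy, mixed_entropy)
```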

You are creating a model to identify spam emails that you receive in your college mail ID. Which of the following defines a False Positive (FP) in this context?
a) An email correctly identified as not spam
b) An email predicted as not spam that is actually spam
c) An email predicted as spam that is actually not spam
d) An email correctly identified as spam
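
The four outcomes in the options can be counted with a small plain-Python sketch. It treats "spam" as the positive class; the six example emails are made up.

```python
# Hypothetical labels: "spam" is the positive class, "ham" means not spam.
actual    = ["spam", "ham",  "ham", "spam", "ham", "spam"]
predicted = ["spam", "spam", "ham", "ham",  "ham", "spam"]

counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
for truth, guess in zip(actual, predicted):
    if guess == "spam" and truth == "spam":
        counts["TP"] += 1  # spam correctly flagged
    elif guess == "spam" and truth == "ham":
        counts["FP"] += 1  # legitimate email wrongly flagged as spam
    elif guess == "ham" and truth == "ham":
        counts["TN"] += 1  # legitimate email correctly let through
    else:
        counts["FN"] += 1  # spam that slipped through

print(counts)
```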

If the true positive value is 20 and the false negative value is 5, what is the recall score for the classification model?
a) 0.8
b) 0.9
c) 0.7
d) None of the above
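
Recall is defined as TP / (TP + FN). A minimal sketch that applies the definition to the numbers in the question; the function name is a helper introduced here for illustration.

```python
def recall(true_positives, false_negatives):
    """Recall = TP / (TP + FN): the share of actual positives that were found."""
    return true_positives / (true_positives + false_negatives)

score = recall(true_positives=20, false_negatives=5)
print(score)
```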

Which method is used to evaluate a model by splitting the data into multiple subsets?
a) Gradient Descent
b) Cross-validation
c) Principal Component Analysis
d) Regularization
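
The idea of splitting data into several subsets for evaluation can be sketched as a plain-Python k-fold index generator. The helper k_fold_indices is hypothetical and written only for illustration; it does no shuffling, and in practice scikit-learn's KFold or cross_val_score would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering every sample once as test."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds

folds = k_fold_indices(n_samples=10, k=5)
for train_idx, test_idx in folds:
    # A model would be trained on train_idx and scored on test_idx here.
    print(train_idx, test_idx)
```

Averaging the per-fold scores gives a single estimate of how the model performs on unseen data.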
