Business Intelligence and Analytics NPTEL Week 12 Answers

Last updated: April 6, 2025

Looking for the NPTEL Business Intelligence and Analytics Week 12 answers? You are in the right place for the assignment answers.

Nptel Business Intelligence and Analytics Week 12 Answers (Jan-Apr 2025)

1. What is the process of breaking text into smaller units called in text mining?

a) Lemmatization
b) Stopword removal
c) Stemming
d) Tokenization
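
As a minimal sketch of what breaking text into smaller units can look like in practice, the snippet below splits a sentence into lowercase word tokens. The `tokenize` helper and its regular expression are assumptions made for illustration; real tokenizers handle punctuation, contractions, and other languages with more care.

```python
import re

def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens using a simple regex rule."""
    # Keep runs of letters, digits, and apostrophes; drop everything else.
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = tokenize("Text mining breaks text into tokens.")
print(tokens)  # ['text', 'mining', 'breaks', 'text', 'into', 'tokens']
```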

2. In sentiment analysis, which dataset assigns integer values to words based on their positive or negative strength?

a) NRC sentiment dataset
b) SentiWordNet
c) AFINN sentiment dataset
d) VADER sentiment dataset

3. A cybersecurity system uses TF-IDF to detect suspicious keywords in threat reports. If a specific term appears frequently in one report but rarely in others, what does its term frequency (TF) measure?

a) The number of reports containing the term
b) The total number of words in the security database
c) The similarity between different threat reports
d) The term’s relative importance within that threat report
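
To make term frequency concrete, here is a small sketch that computes TF as a term's count divided by the total number of tokens in one document. This normalised form is one common convention and is assumed here; some treatments use the raw count instead. The `term_frequency` helper and the sample report are hypothetical.

```python
from collections import Counter

def term_frequency(term: str, tokens: list[str]) -> float:
    """Share of a single document's tokens that are equal to `term`."""
    if not tokens:
        return 0.0
    return Counter(tokens)[term] / len(tokens)

# Hypothetical threat report, already tokenized.
report = "malware detected malware payload on host".split()
print(term_frequency("malware", report))  # 2 of 6 tokens
```

Note that the calculation only looks at one document; how many other reports contain the term plays no part in it.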

4. A higher Phi coefficient in word co-occurrence analysis suggests:

a) No relationship between two words
b) A strong association between two words appearing together
c) A weak correlation between two words
d) That one word is always followed by the other in a sequence

5. What advantage do bigrams provide in text analysis compared to single words?

a) Bigrams eliminate the need for stopword removal
b) Bigrams are better than all other types of n-grams
c) Bigrams always have higher frequency counts than individual words
d) Bigrams offer more context and capture structural relationships between words

6. A search engine computes the cosine similarity between a user query and multiple documents. If one document has a similarity score of 0, what does this imply?

a) The document shares no common words with the query
b) The document has a partial match with the query
c) The document has the highest relevance
d) The document is highly relevant to the query
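
The sketch below illustrates cosine similarity on raw word-count vectors, which is an assumption; a search engine would more likely use TF-IDF weights. The `cosine_similarity` helper and the sample texts are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a: list[str], b: list[str]) -> float:
    """Cosine of the angle between the word-count vectors of two token lists."""
    va, vb = Counter(a), Counter(b)
    # Only words present in both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

query = "coral reef".split()
print(cosine_similarity(query, "stock market report".split()))  # 0.0
print(cosine_similarity(query, "coral reef survey".split()))
```

With count vectors every component is non-negative, so the score can only be 0 when the dot product has no contributing terms.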

7. In a dataset of 250 research papers, the words “ocean” and “reef” do not appear together in 15 papers. However, both “ocean” and “reef” are found in 60 papers. Meanwhile, “ocean” appears alone in 20 papers, and “reef” is found without “ocean” in 25 papers. Based on this data, what is the Phi coefficient measuring the correlation between the occurrence of “ocean” and “reef” in this dataset?

a) 0.69
b) 0.95
c) 0.21
d) 0.88
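
For reference, the Phi coefficient of a 2x2 co-occurrence table is phi = (n11*n00 - n10*n01) / sqrt(n1. * n0. * n.1 * n.0), where n11 counts documents containing both words, n10 and n01 count documents containing only one of them, and n00 counts documents containing neither. The sketch below is a hedged illustration of that formula; the `phi_coefficient` helper is hypothetical, and the counts in the usage lines are made up rather than taken from the question.

```python
import math

def phi_coefficient(n11: int, n10: int, n01: int, n00: int) -> float:
    """Phi coefficient for a 2x2 table of word co-occurrence counts."""
    row_with = n11 + n10      # documents containing word X
    row_without = n01 + n00   # documents not containing word X
    col_with = n11 + n01      # documents containing word Y
    col_without = n10 + n00   # documents not containing word Y
    denom = math.sqrt(row_with * row_without * col_with * col_without)
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (n11 * n00 - n10 * n01) / denom

# Made-up counts: both = 30, X only = 10, Y only = 10, neither = 50.
print(round(phi_coefficient(30, 10, 10, 50), 4))  # 0.5833
```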

8. If a word appears in only one document of a corpus, what can be said about its IDF score?

a) It will be negative
b) It will be low
c) It will be high
d) It will be zero

9. What is the main role of Inverse Document Frequency (IDF) in TF-IDF analysis?

a) Assign higher importance to common words like “the” and “is”
b) Rank documents based on total word count
c) Reduce the weight of frequently occurring words across documents
d) Ensure all words are treated equally
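
Questions 8 and 9 both turn on how IDF behaves, so a small sketch may help. It assumes the unsmoothed form idf = ln(N / df), where N is the number of documents and df is the number containing the term; libraries often add smoothing terms, so exact values differ between tools. The `inverse_document_frequency` helper and the three-document corpus are hypothetical.

```python
import math

def inverse_document_frequency(term: str, documents: list[set[str]]) -> float:
    """Unsmoothed IDF: natural log of (number of documents / documents containing term)."""
    containing = sum(1 for doc in documents if term in doc)
    if containing == 0:
        raise ValueError("term does not occur in the corpus")
    return math.log(len(documents) / containing)

# Hypothetical corpus; each document is reduced to its set of distinct words.
docs = [
    {"the", "reef", "is", "bleached"},
    {"the", "ocean", "is", "warm"},
    {"the", "market", "is", "open"},
]
print(inverse_document_frequency("the", docs))   # 0.0, appears in every document
print(inverse_document_frequency("reef", docs))  # ln(3), appears in one document
```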

10. In text classification, how does the Bag of Words model process text?

a) Uses deep learning to understand the meaning of words
b) Converts words into numerical representations based on presence and frequency
c) Ignores word frequency and focuses only on synonyms
d) Retains sentence structure while analyzing text
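
A minimal Bag of Words sketch, assuming whitespace tokenization and raw counts: each document becomes a vector with one position per vocabulary word, and word order is discarded. The `bag_of_words` helper and the two sample documents are hypothetical.

```python
from collections import Counter

def bag_of_words(documents: list[str]) -> tuple[list[str], list[list[int]]]:
    """Return a sorted vocabulary and one count vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

vocab, vectors = bag_of_words(["good good movie", "bad movie"])
print(vocab)    # ['bad', 'good', 'movie']
print(vectors)  # [[0, 2, 1], [1, 0, 1]]
```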

11. Cluster profiling is used to determine the optimal number of clusters in a dataset.

a) True
b) False

12. Cross-sectional data is collected from the same subjects over multiple time periods.

a) True
b) False

13. A data scientist is preprocessing text for a sentiment analysis model. What would they likely do with words like “the,” “and,” “is,” and “of”?

a) Remove them as stop words
b) Convert them into their root forms
c) Merge them into a single feature
d) Assign them higher weights for analysis
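
As a rough sketch of stop-word handling during preprocessing, the snippet below filters a token list against a tiny stop-word set built from the four words named in the question. The `STOP_WORDS` set and `remove_stop_words` helper are assumptions for illustration; real pipelines use much longer, curated lists.

```python
# Tiny illustrative list: only the four words mentioned in the question.
STOP_WORDS = {"the", "and", "is", "of"}

def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stop-word set (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

review = "the plot of the film is slow and dull".split()
print(remove_stop_words(review))  # ['plot', 'film', 'slow', 'dull']
```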

14. Which of the following is NOT a function of stemming or lemmatization?

a) Reducing different word variations to a common form
b) Enhancing text classification by normalizing words
c) Improving text search accuracy
d) Converting text into numerical vectors

15. Which of the following is NOT an example of a corpus?

a) A collection of legal documents used in NLP
b) A dataset of medical research papers
c) A collection of all Shakespeare’s works
d) A single email from a spam filter dataset
