NPTEL Business Intelligence & Analytics Week 12 Assignment Answers 2025

1. What is the process of breaking text into smaller units called in text mining?

Lemmatization
Stopword removal
Stemming
Tokenization

Answer :-
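
To make the idea in question 1 concrete, here is a minimal sketch of splitting a sentence into word-level units. The `tokenize` helper and the sample sentence are illustrative assumptions; real tokenizers handle punctuation, contractions, and languages in more elaborate ways.

```python
import re

def tokenize(text):
    """Lowercase the text and return its runs of letters, digits, or underscores."""
    return re.findall(r"\w+", text.lower())

# Hypothetical sample sentence.
tokens = tokenize("Text mining breaks text into smaller units.")
print(tokens)
```

The trailing period is dropped because `\w+` matches only word characters.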

2. In sentiment analysis, which dataset assigns integer values to words based on their positive or negative strength?

NRC sentiment dataset
SentiWordNet
AFINN sentiment dataset
VADER sentiment dataset

Answer :-

3. A cybersecurity system uses TF-IDF to detect suspicious keywords in threat reports. If a specific term appears frequently in one report but rarely in others, what does its term frequency (TF) measure?

The number of reports containing the term
The total number of words in the security database
The similarity between different threat reports
The term’s relative importance within that threat report

Answer :-
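
As a rough illustration of the term frequency part of TF-IDF, the sketch below counts how large a share of one document's tokens a given term accounts for. The `term_frequency` helper and the sample report are hypothetical, and this is only one common definition (raw count divided by document length); other TF variants exist.

```python
from collections import Counter

def term_frequency(term, tokens):
    """Share of the tokens in one document that are equal to `term`."""
    if not tokens:
        return 0.0
    return Counter(tokens)[term] / len(tokens)

# Hypothetical threat report, already split into tokens.
report = "malware beacon detected malware payload dropped by malware loader".split()

tf_malware = term_frequency("malware", report)
print(round(tf_malware, 3))
```

Note that the calculation looks only at the single report; how many other reports contain the term plays no part in TF.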

4. A higher Phi coefficient in word co-occurrence analysis suggests:

No relationship between two words
A strong association between two words appearing together
A weak correlation between two words
That one word is always followed by the other in a sequence

Answer :-
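
The Phi coefficient for two words can be sketched from a 2x2 table of document counts. The function name and the counts below are made up for illustration and are not taken from the assignment; the formula itself is the standard one for binary variables.

```python
from math import sqrt

def phi_coefficient(n11, n10, n01, n00):
    """Phi for a 2x2 table of document counts.

    n11: both words present, n10: only the first word,
    n01: only the second word, n00: neither word.
    """
    numerator = n11 * n00 - n10 * n01
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A zero denominator means one word is always or never present; return 0.0 by convention here.
    return numerator / denominator if denominator else 0.0

# Hypothetical counts: the words mostly occur together or are both absent.
strong = phi_coefficient(40, 10, 10, 40)
# Hypothetical counts: the presence of one word says nothing about the other.
unrelated = phi_coefficient(25, 25, 25, 25)
print(strong, unrelated)
```

Values near 1 indicate that the two words tend to appear in the same documents, values near 0 indicate no association, and negative values indicate that one word tends to appear where the other does not.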

5. What advantage do bigrams provide in text analysis compared to single words?

Bigrams eliminate the need for stopword removal
Bigrams are better than all other types of n-grams
Bigrams always have higher frequency counts than individual words
Bigrams offer more context and capture structural relationships between words

Answer :-
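
A small sketch of what bigrams add over single words: pairing each token with its successor keeps local word order. The `bigrams` helper and the sample sentence are illustrative assumptions.

```python
def bigrams(tokens):
    """Return every pair of adjacent tokens, in order."""
    return list(zip(tokens, tokens[1:]))

# Hypothetical review sentence, already tokenized.
tokens = "the service was not good".split()
pairs = bigrams(tokens)
print(pairs)
```

Counted as single words, "good" looks positive on its own; the pair `("not", "good")` preserves the negation that a unigram count would lose.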

6. A search engine computes the cosine similarity between a user query and multiple documents. If one document has a similarity score of 0, what does this imply?

The document shares no common words with the query
The document has a partial match with the query
The document has the highest relevance
The document is highly relevant to the query

Answer :-
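
The following sketch computes cosine similarity between word-count vectors, assuming the simplest possible representation (raw counts, no TF-IDF weighting). The helper name and the sample query and documents are hypothetical.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two word-count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

query = "coral reef bleaching".split()
doc_overlap = "reef bleaching events are rising".split()
doc_disjoint = "stock market closes higher".split()

print(cosine_similarity(query, doc_overlap))
print(cosine_similarity(query, doc_disjoint))
```

Because word counts are never negative, the dot product can only be zero when the two texts have no terms in common under this representation.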

7. In a dataset of 250 research papers, the words “ocean” and “reef” do not appear together in 15 papers. However, both “ocean” and “reef” are found in 60 papers. Meanwhile, “ocean” appears alone in 20 papers, and “reef” is found without “ocean” in 25 papers. Based on this data, what is the Phi coefficient measuring the correlation between the occurrence of “ocean” and “reef” in this dataset?

0.69
0.95
0.21
0.88

Answer :-

8. If a word appears in only one document of a corpus, what can be said about its IDF score?

It will be negative
It will be low
It will be high
It will be zero

Answer :-

9. What is the main role of Inverse Document Frequency (IDF) in TF-IDF analysis?

Assign higher importance to common words like “the” and “is”
Rank documents based on total word count
Reduce the weight of frequently occurring words across documents
Ensure all words are treated equally

Answer :-
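
Questions 8 and 9 both turn on how IDF behaves, which a short sketch can show. It assumes the plain, unsmoothed definition, the natural log of (number of documents / number of documents containing the term); libraries often use smoothed variants that give different absolute values. The function name and the three-document corpus are hypothetical.

```python
from math import log

def inverse_document_frequency(term, documents):
    """Unsmoothed IDF: ln(total documents / documents containing the term)."""
    containing = sum(1 for doc in documents if term in doc)
    if containing == 0:
        raise ValueError(f"{term!r} does not occur in the corpus")
    return log(len(documents) / containing)

# Hypothetical corpus: each document is represented by its set of distinct words.
documents = [
    {"the", "reef", "is", "bleaching"},
    {"the", "market", "is", "volatile"},
    {"the", "patch", "is", "deployed"},
]

idf_the = inverse_document_frequency("the", documents)    # occurs in every document
idf_reef = inverse_document_frequency("reef", documents)  # occurs in one document
print(idf_the, round(idf_reef, 4))
```

Under this definition a word found in every document gets an IDF of 0, so its TF-IDF weight vanishes, while a word found in a single document gets the largest IDF the corpus allows.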

10. In text classification, how does the Bag of Words model process text?

Uses deep learning to understand the meaning of words
Converts words into numerical representations based on presence and frequency
Ignores word frequency and focuses only on synonyms
Retains sentence structure while analyzing text

Answer :-
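
A minimal Bag of Words sketch, assuming a fixed vocabulary built from the sample texts themselves. The `bag_of_words` helper and the three short documents are illustrative only.

```python
from collections import Counter

def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the tokens."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

# Hypothetical documents, already tokenized.
doc_a = "dog bites man".split()
doc_b = "man bites dog".split()
doc_c = "dog bites dog".split()

vocabulary = sorted(set(doc_a) | set(doc_b) | set(doc_c))
vec_a = bag_of_words(doc_a, vocabulary)
vec_b = bag_of_words(doc_b, vocabulary)
vec_c = bag_of_words(doc_c, vocabulary)
print(vocabulary, vec_a, vec_b, vec_c)
```

`doc_a` and `doc_b` mean different things but produce identical vectors, which shows that the representation records which words occur and how often while discarding word order.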

11. Cluster profiling is used to determine the optimal number of clusters in a dataset.

True
False

Answer :-

12. Cross-sectional data is collected from the same subjects over multiple time periods.

True
False

Answer :-

13. A data scientist is preprocessing text for a sentiment analysis model. What would they likely do with words like “the,” “and,” “is,” and “of”?

Remove them as stop words
Convert them into their root forms
Merge them into a single feature
Assign them higher weights for analysis

Answer :-
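
Stop-word filtering can be sketched in a few lines. The stop-word set below contains only the four words named in question 13, and the review sentence is made up; practical stop-word lists are much longer and vary by library and task.

```python
# Only the four words from the question; real lists are far longer.
STOP_WORDS = {"the", "and", "is", "of"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens whose lowercase form is not a stop word."""
    return [token for token in tokens if token.lower() not in stop_words]

# Hypothetical review sentence, already tokenized.
tokens = "The plot is thin and the acting is of poor quality".split()
kept = remove_stop_words(tokens)
print(kept)
```

The words that remain are the ones more likely to carry sentiment or topic information.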

14. Which of the following is NOT a function of stemming or lemmatization?

Reducing different word variations to a common form
Enhancing text classification by normalizing words
Improving text search accuracy
Converting text into numerical vectors

Answer :-

15. Which of the following is NOT an example of a corpus?

A collection of legal documents used in NLP
A dataset of medical research papers
A collection of all Shakespeare’s works
A single email from a spam filter dataset

Answer :-
