A statistical measure used to evaluate the importance of a word in a document relative to a collection of documents.
How It Works
TF-IDF stands for Term Frequency-Inverse Document Frequency. It calculates the importance of a term in a document by considering how often it appears in the document and how rare it is across the entire document set.
Technical Details
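TF-IDF is calculated by multiplying the term frequency (TF) by the inverse document frequency (IDF): tf-idf(t, d) = tf(t, d) × idf(t). In the basic formulation, TF is the number of times term t appears in document d, and IDF is log(N / df(t)), where N is the total number of documents and df(t) is the number of documents containing t. A term that appears in every document therefore has an IDF of log(1) = 0 and contributes nothing to the score.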
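The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes raw counts for TF, the natural logarithm for IDF, and documents that are already split into tokens. The function name tf_idf and the sample sentences are made up for this example.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: score} dict per document.

    corpus is a list of documents, each a list of tokens (an assumption
    of this sketch; real text would need tokenizing first).
    """
    n_docs = len(corpus)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for doc in corpus:
        df.update(set(doc))

    scores = []
    for doc in corpus:
        tf = Counter(doc)  # raw term counts within this document
        scores.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return scores


docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs are pets".split(),
]
weights = tf_idf(docs)
```

Here "cat" occurs in only one of the three documents, so it outscores "the" in the first document even though "the" appears twice there. Libraries often apply smoothing or length normalization on top of this basic form, so their numbers will generally differ from this sketch.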
Best Practices
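Use TF-IDF for keyword extraction: the highest-scoring terms in a document are candidates for its keywords
Combine TF-IDF with other metrics for a fuller analysis, since it reflects only term counts and ignores word meaning and word order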
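As a rough sketch of the keyword extraction practice, the terms of one document can be ranked by their TF-IDF score and the top few kept. The helper name top_keywords, the parameter k, and the sample documents are hypothetical, and the scoring assumes raw counts and the natural logarithm.

```python
import math
from collections import Counter


def top_keywords(corpus, doc_index, k=3):
    """Return the k highest-scoring terms of corpus[doc_index].

    corpus is a list of tokenized documents (lists of strings).
    """
    n_docs = len(corpus)
    # Number of documents in which each term occurs.
    df = Counter(term for doc in corpus for term in set(doc))
    tf = Counter(corpus[doc_index])
    scored = {t: c * math.log(n_docs / df[t]) for t, c in tf.items()}
    # Highest score first; ties are broken alphabetically so the
    # result is deterministic.
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


docs = [
    "tf idf weights rare terms highly".split(),
    "common terms appear in many documents".split(),
    "rare terms identify a document topic".split(),
]
keywords = top_keywords(docs, 0, k=3)
```

In this toy corpus "terms" occurs in every document, so its score is zero and it never ranks as a keyword. In practice, stop-word removal and stemming are commonly applied before scoring.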