TF-IDF

    TF-IDF, or Term Frequency-Inverse Document Frequency, is a statistical measure used in information retrieval and text mining to evaluate how important a word is to a document relative to a collection of documents (a corpus). It combines two metrics: Term Frequency (TF), which counts how often a term appears in a document, and Inverse Document Frequency (IDF), which measures how rare a term is across the documents in the corpus. The TF-IDF value increases with the number of times a term appears in a document but is offset by the number of documents in the corpus that contain the term, so terms that are frequent in one document yet uncommon elsewhere score highest, while words that appear almost everywhere score close to zero. This technique is widely used in search engines, document classification, and natural language processing tasks to identify and rank relevant keywords in text data.
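
As a rough illustration of how the two metrics combine, here is a minimal Python sketch. It assumes one common variant: TF as the term count divided by the document length, and IDF as the natural log of (number of documents / number of documents containing the term), with no smoothing. Other variants differ in weighting and normalization, so library implementations will generally return different numbers. The function name `tf_idf` and the toy corpus are made up for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: TF = count / document length,
    IDF = ln(N / document frequency), no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (illustrative only), already split into lowercase tokens.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]

scores = tf_idf(corpus)

# Rank the terms of the first document from most to least distinctive.
ranked = sorted(scores[0].items(), key=lambda item: item[1], reverse=True)
```

In this toy corpus "the" appears in every document, so its IDF is ln(3/3) = 0 and its weight is zero no matter how often it occurs, whereas "mat" appears in only one document and ranks first for that document.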