src.preprocessing package
Submodules
src.preprocessing.keyword_extraction module
- src.preprocessing.keyword_extraction.bert_keyword_extraction(texts: List[str], top_n: int = 10) → List[str]
Extracts keywords from a list of texts using KeyBERT.
- Parameters:
texts (List[str]) – List of texts to extract keywords from.
top_n (int) – Number of top keywords to extract per text.
- Returns:
List of unique extracted keywords.
- Return type:
List[str]
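The behaviour described for bert_keyword_extraction (top_n keywords per text, merged into one list of unique keywords) can be sketched as below. This is a minimal illustration and not the module's actual source: the names unique_keywords, keybert_extractor and Extractor are hypothetical, and keeping first-seen order when de-duplicating is an assumption. The only KeyBERT call used is KeyBERT().extract_keywords(doc, top_n=...), which returns (keyword, score) pairs for a single document.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type alias: (text, top_n) -> [(keyword, score), ...]
Extractor = Callable[[str, int], List[Tuple[str, float]]]


def keybert_extractor() -> Extractor:
    """Build an extractor backed by KeyBERT (needs `pip install keybert`)."""
    # Imported lazily so the sketch can be loaded without KeyBERT installed.
    from keybert import KeyBERT

    model = KeyBERT()
    return lambda text, top_n: model.extract_keywords(text, top_n=top_n)


def unique_keywords(
    texts: List[str],
    top_n: int = 10,
    extractor: Optional[Extractor] = None,
) -> List[str]:
    """Collect the top_n keywords of each text into one de-duplicated list."""
    if extractor is None:
        extractor = keybert_extractor()

    seen = {}  # dict keys keep first-seen order (assumed ordering)
    for text in texts:
        for keyword, _score in extractor(text, top_n):
            seen.setdefault(keyword, None)
    return list(seen)
```

Passing the extractor as an argument keeps the de-duplication logic testable without downloading an embedding model; calling unique_keywords(texts) with no extractor falls back to KeyBERT.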
- src.preprocessing.keyword_extraction.extract_keywords(article_ids, top_n: int = 10)
Extracts keywords from a list of texts using KeyBERT.
- Parameters:
texts (List[str]) – List of texts to extract keywords from.
top_n (int) – Number of top keywords to extract per text.
- Returns:
List of keyword lists, one per text. The docstring gives the type as List[List[str]], but the actual return value is not a list of lists of strings.