src.utils package
Submodules
src.utils.dbconnector module
- src.utils.dbconnector.append_to_document(collection_name, query, update_data)
Appends new data to an existing document in the MongoDB collection.
- Parameters:
collection_name (str) – The name of the MongoDB collection.
query (dict) – The query to select the document to update.
update_data (dict) – The new data to be appended to the document.
- Returns:
The number of documents updated.
- Return type:
int
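A minimal sketch of how such an update might look, assuming "append" means merging new fields into the matched document with MongoDB's `$set` operator. The helper name `append_fields` is hypothetical, and it takes a collection object (anything that behaves like a pymongo `Collection`) instead of a collection name, so it does not reflect the module's actual implementation.

```python
def append_fields(collection, query, update_data):
    """Merge update_data into the first document matching query.

    Hypothetical helper: `collection` is assumed to behave like a
    pymongo Collection. Returns the number of documents modified.
    """
    # "$set" adds new fields and overwrites existing fields of the same name.
    result = collection.update_one(query, {"$set": update_data})
    # modified_count is 0 when nothing matched or nothing changed, else 1.
    return result.modified_count
```

If the data should be appended to an array field instead, `$push` would be the operator to use in place of `$set`.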
- src.utils.dbconnector.content_manager(article_id, required_fields)
Checks if the specified fields are present in the database for the given article_id.
- Parameters:
article_id (str) – The ID of the article to check.
required_fields (list) – A list of fields to check for presence (e.g., ["content", "summary", "keywords", "sentiment"]).
- Returns:
A dictionary with the status of each field (True if present, False if not).
- Return type:
dict
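The returned dictionary can be illustrated with a small sketch that checks an already-fetched document. The helper name `check_fields` is hypothetical, and treating a field as "present" when its key exists with a value other than `None` is an assumption; the real function may use a different rule.

```python
def check_fields(document, required_fields):
    """Return {field: bool} describing which fields are present.

    Hypothetical helper. `document` is the article as a dict, or None
    when no article was found. Assumption: a field counts as present
    when its key exists and its value is not None.
    """
    if document is None:
        return {field: False for field in required_fields}
    return {field: document.get(field) is not None for field in required_fields}


article = {"content": "Full text", "summary": "Short text", "keywords": None}
status = check_fields(article, ["content", "summary", "keywords", "sentiment"])
# status == {"content": True, "summary": True, "keywords": False, "sentiment": False}
```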
- src.utils.dbconnector.fetch_and_combine_articles(collection_name, article_ids)
Fetches documents from the given MongoDB collection using the given IDs and combines them into a Pandas DataFrame.
- Parameters:
collection_name (str) – The name of the MongoDB collection.
article_ids (List[str]) – List of IDs of the articles to fetch and combine.
- Returns:
A Pandas DataFrame containing the combined documents.
- Return type:
pd.DataFrame
- Raises:
Exception – If there is an error fetching and combining the documents.
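The fetch-and-combine step might be sketched as follows. Both helper names are hypothetical, and the query assumes the article IDs are stored as plain strings in the `_id` field; if the collection uses `ObjectId` values, the IDs would need converting first.

```python
import pandas as pd


def build_id_query(article_ids):
    """Build a filter matching any of the given IDs (assumes string _id values)."""
    return {"_id": {"$in": list(article_ids)}}


def combine_documents(documents):
    """Combine an iterable of document dicts into one DataFrame.

    Each document becomes a row; fields missing from a document are
    filled with NaN.
    """
    return pd.DataFrame(list(documents))


docs = [
    {"_id": "a1", "title": "First"},
    {"_id": "a2", "title": "Second", "summary": "Short text"},
]
frame = combine_documents(docs)
```

With a real collection, the documents would come from something like `collection.find(build_id_query(article_ids))`.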
- src.utils.dbconnector.find_documents(collection_name, query)
Finds documents in the given MongoDB collection using the given query.
- Parameters:
collection_name (str) – The name of the MongoDB collection.
query (dict) – The query to select documents.
- Returns:
A list of documents found by the query.
- Return type:
list
- Raises:
Exception – If there is an error finding documents.
- src.utils.dbconnector.find_one_document(collection_name, query)
Finds a single document in the given MongoDB collection using the given query.
- Parameters:
collection_name (str) – The name of the collection.
query (dict) – The query to select the document.
- Returns:
The selected document.
- Return type:
dict
- Raises:
Exception – If there is an error finding the document.
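The two lookup functions above can be sketched on top of pymongo's `find` and `find_one`. The helper names are hypothetical and take a collection object directly, which is an assumption made to keep the example self-contained.

```python
def find_many(collection, query):
    """Return every document matching query as a list.

    Hypothetical helper: `collection` is assumed to behave like a
    pymongo Collection.
    """
    # find() returns a lazy cursor; list() reads all results into memory.
    return list(collection.find(query))


def find_first(collection, query):
    """Return the first document matching query, or None if there is none."""
    return collection.find_one(query)
```

Note that pymongo's `find_one` returns `None` when nothing matches, so callers of a wrapper like this should handle that case.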
- src.utils.dbconnector.get_mongo_client()
Connects to MongoDB and returns the database object.
- Uses environment variables for connection:
MONGO_USERNAME: username for MongoDB authentication
MONGO_PASSWORD: password for MongoDB authentication
MONGO_DB_NAME: name of the database to connect to
- Returns:
The connected database object.
- Return type:
pymongo.database.Database
- Raises:
Exception – If the connection fails.
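As an illustration of how the three environment variables could be turned into a connection string, here is a minimal sketch. The function name `build_mongo_uri` and the `host` default are hypothetical; the documentation does not say which host or URI scheme the module actually uses.

```python
import os
from urllib.parse import quote_plus

REQUIRED_VARS = ("MONGO_USERNAME", "MONGO_PASSWORD", "MONGO_DB_NAME")


def build_mongo_uri(env=os.environ, host="localhost:27017"):
    """Build a MongoDB URI from the documented environment variables.

    Hypothetical helper; `host` is an assumed placeholder value.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Percent-encode credentials so characters such as "@" or ":" stay valid.
    user = quote_plus(env["MONGO_USERNAME"])
    password = quote_plus(env["MONGO_PASSWORD"])
    return f"mongodb://{user}:{password}@{host}/{env['MONGO_DB_NAME']}"
```

The resulting string could then be passed to `pymongo.MongoClient(uri)`, and indexing the client with the database name yields a `pymongo.database.Database` object.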
- src.utils.dbconnector.insert_document(collection_name, document)
Inserts a document into the given collection.
- Parameters:
collection_name (str) – The name of the collection.
document (dict) – The document to be inserted.
- Returns:
The ID of the inserted document.
- Return type:
str
- Raises:
Exception – If there is an error inserting the document.
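Because the return type is `str` while MongoDB normally generates an `ObjectId`, the function presumably converts the ID before returning it. A minimal sketch under that assumption, with a hypothetical helper name and a collection object passed in directly:

```python
def insert_and_get_id(collection, document):
    """Insert document and return its ID as a string.

    Hypothetical helper: `collection` is assumed to behave like a
    pymongo Collection.
    """
    result = collection.insert_one(document)
    # inserted_id is usually an ObjectId; str() gives its hex representation.
    return str(result.inserted_id)
```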
src.utils.logger module
- src.utils.logger.setup_logger(log_file='app.log')
Sets up a logger with a console handler and a rotating file handler.
The console handler color-codes messages by log level, while the file handler writes plain text. The file handler rotates the log file once it reaches 5 MB and keeps up to 5 backups.
- Parameters:
log_file (str) – The name of the log file to write to. Defaults to "app.log".
- Returns:
The configured logger.
- Return type:
logging.Logger
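A logger with this behavior could be assembled from the standard library roughly as follows. This is a sketch, not the module's code: the function name, the `name` parameter, the message format, and the ANSI color choices are all assumptions, and 5 MB is taken to mean 5 * 1024 * 1024 bytes.

```python
import logging
from logging.handlers import RotatingFileHandler

LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"  # assumed format


class ColorFormatter(logging.Formatter):
    """Wrap each formatted record in an ANSI color chosen by level."""

    COLORS = {
        logging.DEBUG: "\033[36m",     # cyan
        logging.INFO: "\033[32m",      # green
        logging.WARNING: "\033[33m",   # yellow
        logging.ERROR: "\033[31m",     # red
        logging.CRITICAL: "\033[35m",  # magenta
    }
    RESET = "\033[0m"

    def format(self, record):
        text = super().format(record)
        color = self.COLORS.get(record.levelno)
        return f"{color}{text}{self.RESET}" if color else text


def setup_logger_sketch(log_file="app.log", name="app"):
    """Return a logger with a colored console handler and a rotating file handler."""
    logger = logging.getLogger(name)
    if logger.handlers:
        # Already configured; avoid attaching duplicate handlers.
        return logger
    logger.setLevel(logging.DEBUG)

    console = logging.StreamHandler()
    console.setFormatter(ColorFormatter(LOG_FORMAT))
    logger.addHandler(console)

    # Rotate once the file reaches 5 MB, keeping up to 5 old files.
    file_handler = RotatingFileHandler(
        log_file, maxBytes=5 * 1024 * 1024, backupCount=5
    )
    file_handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(file_handler)
    return logger
```

Returning early when handlers already exist keeps repeated calls from duplicating every log line, which is a common pitfall with module-level logger setup.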