src.utils package

Submodules

src.utils.dbconnector module

src.utils.dbconnector.append_to_document(collection_name, query, update_data)

Appends new data to an existing document in the MongoDB collection.

Parameters:
  • collection_name (str) – The name of the MongoDB collection.

  • query (dict) – The query to select the document to update.

  • update_data (dict) – The new data to be appended to the document.

Returns:

The number of documents updated.

Return type:

int
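The update-and-count behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual code: `append_fields` and `FakeCollection` are hypothetical names, and the use of the `$set` operator is an assumption (the real function may use `$push` or another operator). The sketch takes a collection object directly instead of a collection name, so it runs without a MongoDB server.

```python
from types import SimpleNamespace


def append_fields(collection, query, update_data):
    """Apply update_data to the first document matching query; return the modified count."""
    # Assumption: "append" is modelled with $set; the real function may differ.
    result = collection.update_one(query, {"$set": update_data})
    return result.modified_count


class FakeCollection:
    """Hypothetical in-memory stand-in exposing only update_one."""

    def __init__(self, docs):
        self.docs = docs

    def update_one(self, query, update):
        for doc in self.docs:
            # Simple equality matching on every key in the query.
            if all(doc.get(key) == value for key, value in query.items()):
                doc.update(update["$set"])
                return SimpleNamespace(modified_count=1)
        return SimpleNamespace(modified_count=0)


articles = FakeCollection([{"_id": "a1", "content": "Full text"}])
updated = append_fields(articles, {"_id": "a1"}, {"summary": "Short text"})
```

A return value of 0 means that no document matched the query, which callers may want to treat differently from an error.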

src.utils.dbconnector.content_manager(article_id, required_fields)

Checks if the specified fields are present in the database for the given article_id.

Parameters:
  • article_id (str) – The ID of the article to check.

  • required_fields (list) – A list of fields to check for presence (e.g., ["content", "summary", "keywords", "sentiment"]).

Returns:

A dictionary with the status of each field (True if present, False if not).

Return type:

dict
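The shape of the returned dictionary can be illustrated with a small sketch. `field_status` is a hypothetical helper that works on an already-fetched document, and treating a `None` value as "not present" is an assumption; the real function looks the article up by `article_id` and may define presence differently.

```python
def field_status(document, required_fields):
    """Map each required field to True if the document holds a value for it."""
    # Assumption: a missing document, a missing key, and a None value
    # all count as "not present".
    document = document or {}
    return {field: document.get(field) is not None for field in required_fields}


doc = {"_id": "a1", "content": "Full text", "summary": None}
status = field_status(doc, ["content", "summary", "keywords"])
# status == {"content": True, "summary": False, "keywords": False}
```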

src.utils.dbconnector.fetch_and_combine_articles(collection_name, article_ids)

Fetches documents from the given MongoDB collection using the given IDs and combines them into a Pandas DataFrame.

Parameters:
  • collection_name (str) – The name of the MongoDB collection.

  • article_ids (List[str]) – List of IDs of the articles to fetch and combine.

Returns:

A Pandas DataFrame containing the combined documents.

Return type:

pd.DataFrame

Raises:

Exception – If there is an error fetching and combining the documents.

src.utils.dbconnector.find_documents(collection_name, query)

Finds documents in the given MongoDB collection using the given query.

Parameters:
  • collection_name (str) – The name of the MongoDB collection.

  • query (dict) – The query to select documents.

Returns:

A list of documents found by the query.

Return type:

list

Raises:

Exception – If there is an error finding documents.

src.utils.dbconnector.find_one_document(collection_name, query)

Finds a single document in the given MongoDB collection using the given query.

Parameters:
  • collection_name (str) – The name of the collection.

  • query (dict) – The query to select the document.

Returns:

The selected document.

Return type:

dict

Raises:

Exception – If there is an error finding the document.

src.utils.dbconnector.get_mongo_client()

Connects to MongoDB and returns the database object.

Uses environment variables for connection:

  • MONGO_USERNAME – Username for MongoDB authentication.

  • MONGO_PASSWORD – Password for MongoDB authentication.

  • MONGO_DB_NAME – Name of the database to connect to.

Returns:

The connected database object.

Return type:

pymongo.database.Database

Raises:

Exception – If the connection fails.
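One way the three environment variables could be combined into a connection string is sketched below. `build_mongo_uri` is a hypothetical helper, and the `host` and `port` defaults are assumptions; the documentation names only the three variables, so the real function may connect to a different host or use a different URI scheme. Credentials are percent-encoded so that characters such as `@` or `/` do not break the URI.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(host="localhost", port=27017):
    """Build a MongoDB connection URI from the documented environment variables."""
    # Assumption: host and port are not documented; these defaults are illustrative.
    username = quote_plus(os.environ["MONGO_USERNAME"])
    password = quote_plus(os.environ["MONGO_PASSWORD"])
    db_name = os.environ["MONGO_DB_NAME"]
    return f"mongodb://{username}:{password}@{host}:{port}/{db_name}"


# Example values only; real deployments set these outside the program.
os.environ["MONGO_USERNAME"] = "app"
os.environ["MONGO_PASSWORD"] = "p@ss/word"
os.environ["MONGO_DB_NAME"] = "news"
uri = build_mongo_uri()
```

With pymongo installed, passing such a URI to `pymongo.MongoClient` and indexing the client by database name yields a `pymongo.database.Database`, which matches the documented return type.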

src.utils.dbconnector.insert_document(collection_name, document)

Inserts a document into the given collection.

Parameters:
  • collection_name (str) – The name of the collection.

  • document (dict) – The document to be inserted.

Returns:

The ID of the inserted document.

Return type:

str

Raises:

Exception – If there is an error inserting the document.
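Because the documented return type is `str`, the inserted ID is presumably converted before being returned. The sketch below shows that conversion under stated assumptions: `insert_and_get_id` and `FakeCollection` are hypothetical names, and the stand-in collection mimics only the `insert_one` call and the `inserted_id` attribute of its result.

```python
import itertools
from types import SimpleNamespace


def insert_and_get_id(collection, document):
    """Insert the document and return its ID as a string."""
    result = collection.insert_one(document)
    # Assumption: the ID is stringified so callers never handle driver-specific types.
    return str(result.inserted_id)


class FakeCollection:
    """Hypothetical in-memory stand-in exposing only insert_one."""

    def __init__(self):
        self.docs = []
        self._ids = itertools.count(1)

    def insert_one(self, document):
        # Assign an ID when the caller did not supply one.
        document.setdefault("_id", next(self._ids))
        self.docs.append(document)
        return SimpleNamespace(inserted_id=document["_id"])


articles = FakeCollection()
new_id = insert_and_get_id(articles, {"title": "Example article"})
```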

src.utils.logger module

src.utils.logger.setup_logger(log_file='app.log')

Sets up a logger with a console handler and a rotating file handler.

The console handler color-codes messages by log level; the file handler writes plain text. The file handler rotates the log file once it reaches 5 MB and keeps up to 5 backups.

Parameters:

log_file (str) – The name of the log file to write to. Defaults to "app.log".

Returns:

The configured logger.

Return type:

logging.Logger
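The handler setup described above can be approximated with the standard library alone. This is a sketch, not the module's source: `setup_logger_sketch`, `ColorFormatter`, the color palette, the message format, and the `name` parameter are all assumptions, and 5 MB is taken to mean 5 * 1024 * 1024 bytes.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler

# Assumption: ANSI colors per level; the real palette is not documented.
LEVEL_COLORS = {
    logging.DEBUG: "\033[36m",
    logging.INFO: "\033[32m",
    logging.WARNING: "\033[33m",
    logging.ERROR: "\033[31m",
    logging.CRITICAL: "\033[1;31m",
}
RESET = "\033[0m"


class ColorFormatter(logging.Formatter):
    """Wraps the formatted message in the color assigned to its level."""

    def format(self, record):
        text = super().format(record)
        color = LEVEL_COLORS.get(record.levelno, "")
        return f"{color}{text}{RESET}" if color else text


def setup_logger_sketch(log_file="app.log", name="app"):
    """Return a logger with a colored console handler and a rotating file handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.propagate = False

    fmt = "%(asctime)s %(levelname)s %(name)s: %(message)s"

    console = logging.StreamHandler(sys.stderr)
    console.setFormatter(ColorFormatter(fmt))
    logger.addHandler(console)

    # Rotate at 5 MB and keep 5 backups, as described in the documentation.
    file_handler = RotatingFileHandler(
        log_file, maxBytes=5 * 1024 * 1024, backupCount=5
    )
    file_handler.setFormatter(logging.Formatter(fmt))  # plain text, no colors
    logger.addHandler(file_handler)
    return logger
```

Keeping the color codes in a separate formatter means the log file stays free of escape sequences, which keeps it readable in editors and easy to search with text tools.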

Module contents