Logging Configuration

Centralized logging setup for development and production environments.


Overview

The Are You Not Entertained (AYNE) project uses a centralized logging configuration (src/core/logging.py) that provides:

  • JSON logging for production/cloud environments (Azure, AWS, Docker)
  • Colorized console logging for local development
  • Structured log filtering to suppress noisy third-party libraries
  • Consistent formatting across all modules

This replaces the previous logger.py module with a more robust, production-ready solution.

Quick Start

Basic Usage

from ayne.core.logging import get_logger

# Get a logger for your module
logger = get_logger(__name__)

# Log messages
logger.info("Processing started")
logger.warning("Data quality issue detected")
logger.error("Failed to load file", exc_info=True)

Pipeline Configuration

The AYNE CLI configures logging automatically based on its verbosity flags. In your own scripts, call configure_logging() once at startup:

# In your own scripts using the library
from ayne.core.logging import configure_logging

# Configure logging once at application startup
configure_logging(
    level="INFO",        # DEBUG, INFO, WARNING, ERROR, CRITICAL
    use_json=False,      # True for production, False for development
    include_uvicorn=False  # True if running with uvicorn/FastAPI
)
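The mapping from repeated verbosity flags (e.g. -v, -vv) to a level name can be sketched as a small helper. This is a hypothetical illustration; the actual flag handling lives in the CLI entry point and is not shown in this document.

```python
# Hypothetical helper mapping repeated -v flags to the level names
# accepted by configure_logging(); the real CLI wiring may differ.
def level_from_verbosity(verbose: int) -> str:
    """0 -> WARNING, 1 -> INFO, 2+ -> DEBUG."""
    if verbose >= 2:
        return "DEBUG"
    if verbose == 1:
        return "INFO"
    return "WARNING"
```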

Configuration Options

Log Levels

Level    | When to Use
DEBUG    | Detailed diagnostic information (variable values, flow control)
INFO     | General operational messages (process started, completed, counts)
WARNING  | Potentially problematic situations (missing optional files, degraded performance)
ERROR    | Error events that might still allow the application to continue
CRITICAL | Severe errors causing premature termination
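In the standard library these levels are ordered integers, so setting a threshold admits that level and everything above it:

```python
import logging

# Levels are ordered; a logger set to INFO emits INFO and above,
# and silently drops DEBUG records.
assert logging.DEBUG < logging.INFO < logging.WARNING
assert logging.WARNING < logging.ERROR < logging.CRITICAL

example_logger = logging.getLogger("ayne.example")
example_logger.setLevel(logging.INFO)
assert not example_logger.isEnabledFor(logging.DEBUG)
assert example_logger.isEnabledFor(logging.ERROR)
```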

Output Formats

Development (Colorized)

configure_logging(level="INFO", use_json=False)

Output:

2025-12-08 14:32:15 | INFO     | ayne.data_collection.tmdb.client | Fetching movie details from TMDB
2025-12-08 14:32:16 | INFO     | ayne.data_collection.tmdb.client | Loaded 1,234 movies from TMDB
2025-12-08 14:32:17 | WARNING  | ayne.database.duckdb_client  | Movie already exists in database
2025-12-08 14:32:18 | ERROR    | ayne.data_collection.omdb.client | OMDB API rate limit exceeded

Features:

  • Color-coded log levels (Green=INFO, Yellow=WARNING, Red=ERROR)
  • Human-readable timestamps
  • Clear module names
  • Easy to scan visually

Production (JSON)

configure_logging(level="INFO", use_json=True)

Output:

{"timestamp": "2025-12-08 14:32:15", "level": "INFO", "logger": "ayne.data_collection.tmdb.client", "message": "Fetching movie details", "module": "client", "process": 12345, "thread": 67890}
{"timestamp": "2025-12-08 14:32:16", "level": "INFO", "logger": "ayne.data_collection.tmdb.client", "message": "Loaded 1,234 movies", "module": "client", "process": 12345, "thread": 67890}

Features:

  • Structured JSON for log aggregation tools
  • Parseable by Azure Monitor, CloudWatch, Datadog, Loki
  • Includes metadata (process ID, thread ID, module)
  • Easy to query and filter
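Because each line is a standalone JSON object, ad-hoc filtering needs nothing beyond the standard json module. A small sketch (field names follow the sample output above):

```python
import json

def filter_by_level(lines, levels=("ERROR", "CRITICAL")):
    """Yield parsed log entries whose level is in `levels`."""
    for line in lines:
        entry = json.loads(line)
        if entry.get("level") in levels:
            yield entry

logs = [
    '{"timestamp": "2025-12-08 14:32:15", "level": "INFO", "message": "ok"}',
    '{"timestamp": "2025-12-08 14:32:18", "level": "ERROR", "message": "boom"}',
]
errors = list(filter_by_level(logs))  # keeps only the ERROR entry
```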

Advanced Usage

Custom Context Fields

Add custom fields to log entries:

logger = get_logger(__name__)

# Attach context via the standard `extra` parameter; the JSON formatter
# reads custom fields from record.extra_fields
extra_fields = {
    "request_id": "abc-123",
    "user_id": "user_456",
    "correlation_id": "xyz-789"
}

logger.info("Processing user request", extra={"extra_fields": extra_fields})

JSON Output:

{
    "timestamp": "2025-12-08 14:32:15",
    "level": "INFO",
    "message": "Processing user request",
    "request_id": "abc-123",
    "user_id": "user_456",
    "correlation_id": "xyz-789"
}
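How the formatter might merge those fields can be sketched with a minimal stand-in. This is not the project's actual JsonFormatter (which also adds timestamp, process, and thread fields); it only illustrates the record.extra_fields convention:

```python
import json
import logging

# Minimal stand-in for a JSON formatter that merges record.extra_fields;
# ayne's real formatter adds more metadata, so treat this as a sketch.
class JsonSketchFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge any custom context attached via extra={"extra_fields": ...}
        entry.update(getattr(record, "extra_fields", {}))
        return json.dumps(entry)
```

Attach it like any formatter: handler.setFormatter(JsonSketchFormatter()).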

Exception Logging

Always include exception info for error logs:

try:
    process_data(file_path)
except Exception as e:
    logger.error(f"Failed to process {file_path}: {e}", exc_info=True)
    raise

Benefit: Full stack traces are captured in logs for debugging.
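Inside an except block, the standard library also offers logger.exception(), which is shorthand for logger.error(..., exc_info=True). A self-contained demonstration using a buffer so the traceback is easy to inspect:

```python
import io
import logging

# logger.exception(...) == logger.error(..., exc_info=True): both attach
# the traceback of the exception currently being handled.
def safe_divide(a, b, logger):
    try:
        return a / b
    except ZeroDivisionError:
        logger.exception("Failed to divide %s by %s", a, b)
        return None

# Route output to an in-memory buffer for inspection
stream = io.StringIO()
demo_logger = logging.getLogger("ayne.demo.divide")
demo_logger.addHandler(logging.StreamHandler(stream))
result = safe_divide(1, 0, demo_logger)  # logs the full traceback
```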

Suppressing Noisy Libraries

The logging configuration automatically suppresses verbose output from:

  • chromadb - Telemetry messages
  • sentence_transformers - Model loading details
  • transformers - Tokenizer warnings
  • httpx - HTTP request details

To suppress additional libraries:

import logging

# In your configure_logging() setup
logging.getLogger("some_noisy_library").setLevel(logging.ERROR)
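To silence several libraries in one place, loop over a mapping. The names and levels below are examples; adjust them to whatever is noisy in your environment:

```python
import logging

# Example overrides; extend this mapping as new noisy libraries appear.
NOISY_LIBRARIES = {
    "chromadb": logging.ERROR,
    "sentence_transformers": logging.WARNING,
    "httpx": logging.WARNING,
}

def quiet_noisy_loggers(overrides=NOISY_LIBRARIES):
    for name, level in overrides.items():
        logging.getLogger(name).setLevel(level)

quiet_noisy_loggers()
```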

Best Practices

✅ Do's

  1. Use appropriate log levels:
logger.info("Processing 1,234 records")        # Normal operation
logger.warning("Using default value for X")     # Potential issue
logger.error("Failed to connect to database")   # Actual error
  2. Include relevant context:
logger.info(f"Processed {count:,} records in {duration:.2f}s")
logger.error(f"File not found: {file_path}")
  3. Use f-strings for formatting:
logger.info(f"User {user_id} completed action {action}")
  4. Log at module boundaries:
def process_fusion(year, month):
    logger.info(f"Starting fusion for {year}-{month:02d}")
    # ... processing ...
    logger.info(f"Fusion complete: {total_records:,} records")

❌ Don'ts

  1. Don't over-log in tight loops:
# BAD - logs 10,000 times
for i in range(10000):
    logger.debug(f"Processing item {i}")

# GOOD - logs once or periodically
logger.info(f"Processing {len(items):,} items")
for i, item in enumerate(items):
    if i % 1000 == 0:
        logger.debug(f"Progress: {i:,}/{len(items):,}")
  2. Don't log sensitive data:
# BAD
logger.info(f"Password: {password}")

# GOOD
logger.info("Authentication successful")
  3. Don't use print() statements:
# BAD
print("Processing started")

# GOOD
logger.info("Processing started")
  4. Don't log before configuring:
# BAD - logger not configured yet
logger = get_logger(__name__)
logger.info("Starting...")
configure_logging()

# GOOD - configure first
configure_logging()
logger = get_logger(__name__)
logger.info("Starting...")

Migration from Old Logger

Changes Required

Old Code (logger.py):

from ayne.core.logger import setup_logger, get_logger

# Setup with file handler
logger = setup_logger(
    name="ayne.module",
    level=logging.INFO,
    log_file="logs/module.log",
    console=True
)

New Code (logging.py):

from ayne.core.logging import configure_logging, get_logger

# Configure once at app startup
configure_logging(level="INFO", use_json=False)

# Get logger in each module
logger = get_logger(__name__)

Key Differences

Feature          | Old (logger.py)  | New (logging.py)
Configuration    | Per-module setup | Global configuration
Format           | Console only     | JSON or colorized
Filtering        | Manual           | Automatic for known libraries
Production Ready | No               | Yes (JSON logging)
File Logging     | Per-module files | Centralized (via handlers)
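If you cannot update every call site at once, a temporary compatibility shim can keep old setup_logger() calls working during the migration. This is a hypothetical bridge, not part of the library:

```python
import logging
import warnings

# Hypothetical shim: accepts the old setup_logger() signature but ignores
# the per-module file/console options, since configure_logging() now owns
# handler setup globally.
def setup_logger(name, level=logging.INFO, log_file=None, console=True):
    warnings.warn(
        "setup_logger() is deprecated; call configure_logging() once at "
        "startup and get_logger(__name__) per module",
        DeprecationWarning,
        stacklevel=2,
    )
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger
```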

Deployment Considerations

Local Development

# Use colorized logging
configure_logging(level="DEBUG", use_json=False)

Docker/Kubernetes

# Use JSON logging for container logs
import os
use_json = os.getenv("LOG_FORMAT", "json") == "json"
configure_logging(level="INFO", use_json=use_json)

Azure/AWS

# Enable JSON logging for cloud log aggregation
configure_logging(level="INFO", use_json=True)

CI/CD Pipeline

# Use plain text for readable build logs
configure_logging(level="INFO", use_json=False)
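The four environments above can share one bootstrap that derives the arguments from environment variables. LOG_LEVEL and LOG_FORMAT are assumed variable names for this sketch, not something the library defines:

```python
import os

# Hypothetical convention: LOG_LEVEL and LOG_FORMAT environment variables
# select the configure_logging() arguments; defaults suit local development.
def logging_settings(env=None):
    env = os.environ if env is None else env
    return {
        "level": env.get("LOG_LEVEL", "INFO").upper(),
        "use_json": env.get("LOG_FORMAT", "text").lower() == "json",
    }

# configure_logging(**logging_settings())  # call once at startup
```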

Troubleshooting

Logs Not Appearing

Problem: No log output is shown.

Solution:

# Ensure configure_logging() is called before any get_logger()
configure_logging(level="INFO", use_json=False)
logger = get_logger(__name__)
logger.info("Test message")  # Should now appear

Too Much Log Output

Problem: Logs are overwhelming with debug messages.

Solution:

# Use INFO level instead of DEBUG
configure_logging(level="INFO", use_json=False)

JSON Logs Not Parsing

Problem: JSON logs appear malformed in the log aggregation tool.

Solution:

# Ensure use_json=True is set; colorized console output is not valid JSON
configure_logging(level="INFO", use_json=True, include_uvicorn=False)

Duplicate Log Messages

Problem: Same log message appears multiple times.

Solution:

# Don't call configure_logging() multiple times
# Call it once in main() or __main__ block

if __name__ == "__main__":
    configure_logging(level="INFO", use_json=False)
    main()  # All modules will use this configuration

Last Updated: November 19, 2025
Logging Module: src/ayne/core/logging.py