# Logging Configuration
Centralized logging setup for development and production environments.
## Overview
The Are You Not Entertained (AYNE) project uses a centralized logging configuration (`src/core/logging.py`) that provides:

- JSON logging for production/cloud environments (Azure, AWS, Docker)
- Colorized console logging for local development
- Structured log filtering to suppress noisy third-party libraries
- Consistent formatting across all modules

This replaces the previous `logger.py` module with a more robust, production-ready solution.
## Quick Start

### Basic Usage
```python
from ayne.core.logging import get_logger

# Get a logger for your module
logger = get_logger(__name__)

# Log messages
logger.info("Processing started")
logger.warning("Data quality issue detected")
logger.error("Failed to load file", exc_info=True)
```
### Pipeline Configuration
The CLI automatically configures logging based on verbosity flags:
```python
# In your own scripts using the library
from ayne.core.logging import configure_logging

# Configure logging once at application startup
configure_logging(
    level="INFO",           # DEBUG, INFO, WARNING, ERROR, CRITICAL
    use_json=False,         # True for production, False for development
    include_uvicorn=False,  # True if running with uvicorn/FastAPI
)
```
## Configuration Options

### Log Levels
| Level | When to Use |
|---|---|
| DEBUG | Detailed diagnostic information (variable values, flow control) |
| INFO | General operational messages (process started, completed, counts) |
| WARNING | Potentially problematic situations (missing optional files, degraded performance) |
| ERROR | Error events that might still allow the application to continue |
| CRITICAL | Severe errors causing premature termination |
## Output Formats

### Development (Colorized)
Output:
```text
2025-12-08 14:32:15 | INFO    | ayne.data_collection.tmdb.client | Fetching movie details from TMDB
2025-12-08 14:32:16 | INFO    | ayne.data_collection.tmdb.client | Loaded 1,234 movies from TMDB
2025-12-08 14:32:17 | WARNING | ayne.database.duckdb_client | Movie already exists in database
2025-12-08 14:32:18 | ERROR   | ayne.data_collection.omdb.client | OMDB API rate limit exceeded
```
Features:
- Color-coded log levels (Green=INFO, Yellow=WARNING, Red=ERROR)
- Human-readable timestamps
- Clear module names
- Easy to scan visually
### Production (JSON)
Output:
```json
{"timestamp": "2025-12-08 14:32:15", "level": "INFO", "logger": "ayne.data_collection.tmdb.client", "message": "Fetching movie details", "module": "client", "process": 12345, "thread": 67890}
{"timestamp": "2025-12-08 14:32:16", "level": "INFO", "logger": "ayne.data_collection.tmdb.client", "message": "Loaded 1,234 movies", "module": "client", "process": 12345, "thread": 67890}
```
Features:
- Structured JSON for log aggregation tools
- Parseable by Azure Monitor, CloudWatch, Datadog, Loki
- Includes metadata (process ID, thread ID, module)
- Easy to query and filter
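Because each log line is a standalone JSON object, downstream tooling can parse it directly. A minimal sketch using only the standard library (the sample line is copied from the output above):

```python
import json

# One line of the JSON output shown above
line = (
    '{"timestamp": "2025-12-08 14:32:15", "level": "INFO", '
    '"logger": "ayne.data_collection.tmdb.client", '
    '"message": "Fetching movie details", "module": "client", '
    '"process": 12345, "thread": 67890}'
)

entry = json.loads(line)  # each line parses independently (JSON Lines style)
print(entry["level"], entry["logger"])
```

This is the same one-object-per-line format that Azure Monitor, CloudWatch, and Loki agents expect.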
## Advanced Usage

### Custom Context Fields
Add custom fields to log entries:
```python
import logging

from ayne.core.logging import get_logger

logger = get_logger(__name__)

# Create a log record with extra fields
extra_fields = {
    "request_id": "abc-123",
    "user_id": "user_456",
    "correlation_id": "xyz-789",
}

# Log with context
record = logging.LogRecord(
    name=logger.name,
    level=logging.INFO,
    pathname="",
    lineno=0,
    msg="Processing user request",
    args=(),
    exc_info=None,
)
record.extra_fields = extra_fields
logger.handle(record)
```
**JSON Output:**

```json
{
  "timestamp": "2025-11-19 14:32:15",
  "level": "INFO",
  "message": "Processing user request",
  "request_id": "abc-123",
  "user_id": "user_456",
  "correlation_id": "xyz-789"
}
```
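If building a `LogRecord` by hand feels heavyweight, the standard library's `extra` parameter achieves the same effect: keys passed via `extra` become attributes on the record, so a formatter that reads `record.extra_fields` will find them. A stdlib-only sketch (the handler setup here is illustrative, not part of AYNE):

```python
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

# Keys in `extra` become attributes on the LogRecord,
# so record.extra_fields is available to any formatter.
logger.info(
    "Processing user request",
    extra={"extra_fields": {"request_id": "abc-123", "user_id": "user_456"}},
)
```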
### Exception Logging
Always include exception info for error logs:
```python
try:
    process_data(file_path)
except Exception as e:
    logger.error(f"Failed to process {file_path}: {e}", exc_info=True)
    raise
```
**Benefit:** Full stack traces are captured in logs for debugging.
### Suppressing Noisy Libraries
The logging configuration automatically suppresses verbose output from:
- `chromadb` - Telemetry messages
- `sentence_transformers` - Model loading details
- `transformers` - Tokenizer warnings
- `httpx` - HTTP request details
To suppress additional libraries:
```python
import logging

# In your configure_logging() setup
logging.getLogger("some_noisy_library").setLevel(logging.ERROR)
```
## Best Practices

### ✅ Do's
- **Use appropriate log levels:**

  ```python
  logger.info("Processing 1,234 records")        # Normal operation
  logger.warning("Using default value for X")    # Potential issue
  logger.error("Failed to connect to database")  # Actual error
  ```

- **Include relevant context, using f-strings for formatting:**

  ```python
  logger.info(f"Processed {count:,} records in {duration:.2f}s")
  logger.error(f"File not found: {file_path}")
  ```

- **Log at module boundaries:**

  ```python
  def process_fusion(year, month):
      logger.info(f"Starting fusion for {year}-{month:02d}")
      # ... processing ...
      logger.info(f"Fusion complete: {total_records:,} records")
  ```
### ❌ Don'ts
- **Don't over-log in tight loops:**

  ```python
  # BAD - logs 10,000 times
  for i in range(10000):
      logger.debug(f"Processing item {i}")

  # GOOD - logs once or periodically
  logger.info(f"Processing {len(items):,} items")
  for i, item in enumerate(items):
      if i % 1000 == 0:
          logger.debug(f"Progress: {i:,}/{len(items):,}")
  ```

- **Don't log sensitive data** (passwords, API keys, tokens, personal information).
- **Don't use print() statements** - they bypass log levels, formatting, and filtering.
- **Don't log before configuring:**

  ```python
  # BAD - logger not configured yet
  logger = get_logger(__name__)
  logger.info("Starting...")
  configure_logging()

  # GOOD - configure first
  configure_logging()
  logger = get_logger(__name__)
  logger.info("Starting...")
  ```
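The sensitive-data rule above can be made concrete with a small redaction helper (this helper is illustrative, not part of AYNE):

```python
# BAD - leaks the full credential into the logs:
#   logger.info(f"Connecting with api_key={api_key}")

# GOOD - log only a redacted form
def redact(secret: str, visible: int = 4) -> str:
    """Mask all but the last few characters of a secret."""
    return "*" * max(len(secret) - visible, 0) + secret[-visible:]

print(redact("sk-1234567890abcdef"))  # ***************cdef
```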
## Migration from Old Logger

### Changes Required

**Old Code** (`logger.py`):
```python
import logging

from ayne.core.logger import setup_logger, get_logger

# Setup with file handler
logger = setup_logger(
    name="ayne.module",
    level=logging.INFO,
    log_file="logs/module.log",
    console=True,
)
```
**New Code** (`logging.py`):
```python
from ayne.core.logging import configure_logging, get_logger

# Configure once at app startup
configure_logging(level="INFO", use_json=False)

# Get logger in each module
logger = get_logger(__name__)
```
### Key Differences

| Feature | Old (logger.py) | New (logging.py) |
|---|---|---|
| Configuration | Per-module setup | Global configuration |
| Format | Console only | JSON or colorized |
| Filtering | Manual | Automatic for known libraries |
| Production Ready | No | Yes (JSON logging) |
| File Logging | Per-module files | Centralized (via handlers) |
## Deployment Considerations
### Local Development
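A typical local setup uses the colorized console format at a verbose level (the parameters follow the Quick Start section; the exact defaults are a judgment call):

```python
from ayne.core.logging import configure_logging

# Colorized console output, verbose enough for day-to-day debugging
configure_logging(level="DEBUG", use_json=False)
```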
### Docker/Kubernetes
```python
import os

from ayne.core.logging import configure_logging

# Use JSON logging for container logs
use_json = os.getenv("LOG_FORMAT", "json") == "json"
configure_logging(level="INFO", use_json=use_json)
```
### Azure/AWS
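Azure Monitor and CloudWatch both ingest stdout, so JSON output is the safer choice. One possible setup that lets the level be tuned per environment (`LOG_LEVEL` is an assumed variable name, not something AYNE defines):

```python
import os

from ayne.core.logging import configure_logging

# JSON keeps log entries machine-parseable for Azure Monitor / CloudWatch
configure_logging(level=os.getenv("LOG_LEVEL", "INFO"), use_json=True)
```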
### CI/CD Pipeline
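In CI, plain text at a quieter level keeps build logs readable; one possible setup (a suggestion, not a project requirement):

```python
from ayne.core.logging import configure_logging

# Quieter, plain-text output so test logs stay readable in CI
configure_logging(level="WARNING", use_json=False)
```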
## Troubleshooting

### Logs Not Appearing

**Problem:** No log output is shown.

**Solution:**
```python
# Ensure configure_logging() is called before any get_logger()
configure_logging(level="INFO", use_json=False)
logger = get_logger(__name__)
logger.info("Test message")  # Should now appear
```
### Too Much Log Output

**Problem:** Logs are overwhelming with debug messages.

**Solution:**
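Raise the global level, or silence one specific chatty module (the module name below is only an example):

```python
import logging

from ayne.core.logging import configure_logging

# Option 1: raise the global threshold
configure_logging(level="WARNING", use_json=False)

# Option 2: silence one chatty logger while keeping the rest at INFO
logging.getLogger("ayne.data_collection").setLevel(logging.WARNING)
```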
### JSON Logs Not Parsing

**Problem:** JSON logs are malformed in the log aggregation tool.

**Solution:**
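Confirm that `use_json=True` is set and that nothing else (e.g. a stray `print()`) writes plain text to stdout, since aggregators expect one JSON object per line. A stdlib-only check you can run over captured output:

```python
import json

def is_valid_json_line(line: str) -> bool:
    """Return True if the line is a standalone JSON object."""
    try:
        return isinstance(json.loads(line), dict)
    except json.JSONDecodeError:
        return False

print(is_valid_json_line('{"level": "INFO", "message": "ok"}'))  # True
print(is_valid_json_line("plain text sneaked into the stream"))  # False
```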
### Duplicate Log Messages

**Problem:** The same log message appears multiple times.

**Solution:**
```python
# Don't call configure_logging() multiple times.
# Call it once in main() or the __main__ block.
if __name__ == "__main__":
    configure_logging(level="INFO", use_json=False)
    main()  # All modules will use this configuration
```
## Related Documentation
- Development Setup - Initial project setup
- Contributing - Code contribution guidelines
- Debugging - Debugging techniques
**Last Updated:** November 19, 2025
**Logging Module:** `src/ayne/core/logging.py`