Configuration Guide¶
This project uses Pydantic Settings v2 for type-safe configuration, combining YAML config files with environment variable overrides. This is a modern, cloud-friendly pattern that works well for local development and deployments on AWS or other platforms.
Configuration Architecture¶
Core Components¶
- src/speech_nlp/config/settings.py - Central configuration module with the Settings class
- YAML config files in configs/ (e.g., configs/development.yaml, configs/production.yaml)
- .env file - Environment variables for sensitive values and overrides
- Validation - Automatic type checking and validation via Pydantic
Benefits¶
- ✅ Type-safe - Configuration values are checked against type hints when settings load
- ✅ Environment-aware - Different configs for dev/staging/prod
- ✅ Cloud-friendly - Works seamlessly with Azure, AWS, GCP
- ✅ Validated - Invalid configs fail fast with clear error messages
- ✅ Documented - Self-documenting with type hints and descriptions
Quick Start¶
1. Choose Your Environment (YAML)¶
Configuration defaults live in YAML files under configs/:
- configs/development.yaml – for local development
- configs/production.yaml – for production deployments (AWS, Azure, etc.)
By default, the app uses the development environment. You can override this via the ENVIRONMENT environment variable:
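For example, in a shell:

```bash
export ENVIRONMENT=production  # selects configs/production.yaml
```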
The active environment name is used to pick configs/<environment>.yaml.
2. Create Your .env File¶
Copy the example file:
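Assuming the repository ships a .env.example at the project root:

```bash
cp .env.example .env
```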
Use .env for secrets and overrides only (API keys, tokens, one-off tweaks). All non-sensitive defaults should live in YAML.
3. Set Your LLM Provider¶
Edit .env and configure your preferred LLM provider (sensitive values like API keys stay here):
Option A: Google Gemini (Default)¶
Get a free key at: https://ai.google.dev/
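A minimal .env for Gemini (the model name shown matches the YAML default):

```bash
LLM_PROVIDER="gemini"
LLM_API_KEY="your_gemini_api_key_here"
LLM_MODEL_NAME="gemini-2.5-flash"
```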
Option B: OpenAI¶
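Requires the optional dependency group (uv sync --group llm-openai). A minimal .env:

```bash
LLM_PROVIDER="openai"
LLM_API_KEY="sk-your_openai_api_key"
LLM_MODEL_NAME="gpt-4o-mini"
```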
Option C: Anthropic (Claude)¶
```bash
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-your_anthropic_api_key_here
LLM_MODEL_NAME=claude-3-5-sonnet-20241022
```
4. Run the Application¶
The app will automatically:
- Load base defaults from configs/<ENVIRONMENT>.yaml
- Apply Pydantic model defaults for any missing values
- Override with environment variables / .env values
- Validate all configuration values
- Initialize services with configured parameters
- Display startup configuration in logs
Configuration Options¶
Application Settings¶
Core metadata and logging live primarily in YAML:
```yaml
# configs/development.yaml
environment: development
log_level: DEBUG
app_name: "Trump Speeches NLP Chatbot API (Development)"
app_version: "0.1.0"
```
You can still override via .env or environment variables if needed:
```bash
ENVIRONMENT="production"  # selects configs/production.yaml
LOG_LEVEL="INFO"          # overrides YAML
APP_NAME="Custom Name"    # overrides YAML
```
LLM Provider (Multi-Provider Support)¶
Configure which LLM provider to use for answer generation, sentiment interpretation, and topic analysis.
General LLM Settings¶
In YAML we configure non-sensitive defaults under the llm section:
```yaml
llm:
  provider: "gemini"  # gemini | openai | anthropic | none
  enabled: true
  model_name: "gemini-2.5-flash"
  temperature: 0.3
  max_output_tokens: 1024
```
Sensitive values like API keys are supplied via environment variables / .env:
```bash
LLM_PROVIDER="gemini"          # optional override for provider
LLM_API_KEY="your_api_key"     # single API key for the active provider
LLM_MODEL_NAME="model-name"    # optional override for model
LLM_TEMPERATURE="0.7"          # optional override for temperature
LLM_MAX_OUTPUT_TOKENS="2048"   # optional override for max tokens
LLM_ENABLED="true"             # optional override
```
Provider-Specific Examples¶
Gemini (Default - Always Available):
```bash
LLM_PROVIDER="gemini"
LLM_API_KEY="your_gemini_api_key"
LLM_MODEL_NAME="gemini-2.0-flash-exp"  # or gemini-1.5-pro
LLM_TEMPERATURE="0.7"
LLM_MAX_OUTPUT_TOKENS="2048"
```
OpenAI (Optional - Install with uv sync --group llm-openai):
```bash
LLM_PROVIDER="openai"
LLM_API_KEY="sk-your_openai_api_key"
LLM_MODEL_NAME="gpt-4o-mini"  # or gpt-4o, gpt-4-turbo
LLM_TEMPERATURE="0.7"
LLM_MAX_OUTPUT_TOKENS="2048"
```
Anthropic (Optional - Install with uv sync --group llm-anthropic):
```bash
LLM_PROVIDER="anthropic"
LLM_API_KEY="sk-ant-your_anthropic_api_key"
LLM_MODEL_NAME="claude-3-5-sonnet-20241022"  # or claude-3-opus-20240229
LLM_TEMPERATURE="0.7"
LLM_MAX_OUTPUT_TOKENS="2048"
```
Disable LLM:
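Either setting works, per the options above:

```bash
LLM_PROVIDER="none"
# or
LLM_ENABLED="false"
```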
Switching Providers¶
1. Install the optional provider (if not already installed):

   ```bash
   uv sync --group llm-openai     # For OpenAI
   uv sync --group llm-anthropic  # For Anthropic
   uv sync --group llm-all        # For all providers
   ```

2. Update the .env file with the new provider settings.

3. Restart the application.
The application will automatically use the new provider without code changes.
ML Models¶
Configure which models to use for different tasks via YAML:
```yaml
models:
  sentiment_model_name: "ProsusAI/finbert"
  embedding_model_name: "all-mpnet-base-v2"
  reranker_model_name: "cross-encoder/ms-marco-MiniLM-L-6-v2"
  emotion_model_name: "j-hartmann/emotion-english-distilroberta-base"
```
You can override any of them via environment variables if needed:
```bash
SENTIMENT_MODEL_NAME="ProsusAI/finbert"
EMBEDDING_MODEL_NAME="all-mpnet-base-v2"
RERANKER_MODEL_NAME="cross-encoder/ms-marco-MiniLM-L-6-v2"
EMOTION_MODEL_NAME="j-hartmann/emotion-english-distilroberta-base"
```
RAG Configuration¶
These live under the rag section in YAML:
```yaml
rag:
  chromadb_persist_directory: "./data/chromadb"
  chromadb_collection_name: "speeches"
  chunk_size: 2048
  chunk_overlap: 150
  default_top_k: 5
  use_reranking: true
  use_hybrid_search: true
```
Environment variables can override them if necessary (e.g. for a one-off deployment):
```bash
CHROMADB_PERSIST_DIRECTORY="./data/chromadb"
CHROMADB_COLLECTION_NAME="speeches"
CHUNK_SIZE="2048"
CHUNK_OVERLAP="150"
DEFAULT_TOP_K="5"
USE_RERANKING="true"
USE_HYBRID_SEARCH="true"
```
Semantic Chunking¶
Control how documents are split into chunks:
```yaml
rag:
  chunking_strategy: "semantic"          # "semantic" or "fixed"
  semantic_min_chunk_size: 256           # Merge groups smaller than this (bytes)
  semantic_breakpoint_percentile: 90.0   # Percentile for topic-shift detection
  # semantic_similarity_threshold: null  # Override percentile with an absolute threshold
```
When chunking_strategy is "semantic", the DocumentLoader embeds each sentence, computes
consecutive cosine similarities, and splits at topic boundaries. Groups smaller than
semantic_min_chunk_size are merged with their neighbour; groups exceeding chunk_size fall
back to RecursiveCharacterTextSplitter. Set "fixed" to use traditional character-based splitting.
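The split logic can be sketched roughly as follows. This is an illustrative stand-in, not the actual DocumentLoader code: the embed callable is supplied by the caller, and the min-size merge and oversized-group fallback described above are omitted for brevity.

```python
# Illustrative sketch of percentile-based semantic chunking (assumed logic,
# simplified from the behaviour described in this guide).
import numpy as np

def split_semantic(sentences, embed, breakpoint_percentile=90.0):
    """Group consecutive sentences, splitting where neighbour similarity
    falls into the lowest (100 - breakpoint_percentile) percent."""
    if len(sentences) < 2:
        return [" ".join(sentences)]
    vecs = np.array([embed(s) for s in sentences], dtype=float)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True) + 1e-12
    sims = (vecs[:-1] * vecs[1:]).sum(axis=1)  # cosine similarity of neighbours
    threshold = np.percentile(sims, 100.0 - breakpoint_percentile)
    groups, current = [], [sentences[0]]
    for sentence, sim in zip(sentences[1:], sims):
        if sim < threshold:  # topic shift detected: start a new group
            groups.append(current)
            current = []
        current.append(sentence)
    groups.append(current)
    return [" ".join(group) for group in groups]
```

With breakpoint_percentile set to 90, roughly the lowest 10% of neighbour similarities become split points.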
RAG Guardrails¶
Three-layer quality gates that prevent hallucination and ensure answer grounding:
```yaml
rag:
  guardrails_enabled: true     # Master switch for all guardrail layers
  similarity_threshold: 0.01   # Min sigmoid-normalised relevance score (0-1)
  grounding_threshold: 0.3     # Min token-overlap ratio for grounding check
```
| Setting | Default | Description |
|---|---|---|
| guardrails_enabled | true | Enables pre-retrieval validation, post-retrieval relevance filtering, and post-generation grounding verification |
| similarity_threshold | 0.01 | Minimum sigmoid-normalised cross-encoder score. Results below this are dropped before reaching the LLM. Calibrated for the ms-marco-MiniLM-L-6-v2 model on speech transcripts |
| grounding_threshold | 0.3 | Minimum token-overlap ratio between the generated answer and the retrieved context. Answers below this get a caveat warning appended |
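The grounding check can be illustrated with a simple token-overlap ratio. This is a sketch of the idea only; the project's actual tokenisation and normalisation may differ.

```python
# Sketch of a token-overlap grounding check (illustrative, not the project's code).
def grounding_score(answer: str, context: str) -> float:
    """Fraction of answer tokens that also appear in the retrieved context."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)
```

An answer whose score falls below grounding_threshold (0.3 by default) gets the caveat warning appended.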
Environment variable overrides:
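Following the naming pattern used for the other RAG settings in this guide (the exact variable names below are assumed from that pattern):

```bash
GUARDRAILS_ENABLED="true"
SIMILARITY_THRESHOLD="0.01"
GROUNDING_THRESHOLD="0.3"
```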
Query Rewriting¶
LLM-powered query optimisation that rewrites user queries before search to fix typos, expand abbreviations, and improve retrieval quality:
| Setting | Default | Description |
|---|---|---|
| query_rewriting_enabled | true | Enables LLM-powered query rewriting before search. Requires an active LLM provider. Falls back to the original query on error |
Environment variable override:
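Following the same naming pattern as the other overrides (the variable name is assumed from that pattern):

```bash
QUERY_REWRITING_ENABLED="true"
```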
Data Directories¶
Configured under paths in YAML:
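For example (the key names below are illustrative; check configs/development.yaml for the actual ones):

```yaml
paths:
  data_dir: "./data"
  speeches_dir: "./data/speeches"
```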
API Settings¶
These are grouped under the api section in YAML:
In production (e.g. AWS) you might use something like:
```yaml
# configs/production.yaml
api:
  host: "0.0.0.0"
  port: 8000
  reload: false
  cors_origins:
    - "https://your-domain.com"
```
Environment-Specific Configs¶
Development¶
```yaml
# configs/development.yaml
environment: development
log_level: DEBUG
api:
  host: "0.0.0.0"
  port: 8000
  reload: true
  cors_origins:
    - "*"
```
Production example¶
```yaml
# configs/production.yaml
environment: production
log_level: INFO
app_name: "Trump Speeches NLP Chatbot API"
api:
  host: "0.0.0.0"
  port: 8000
  reload: false
  cors_origins:
    - "https://your-domain.com"
```
Using Configuration in Code¶
Accessing Settings¶
```python
from speech_nlp.config.settings import get_settings

settings = get_settings()

# Access values from nested sections
print(settings.llm.provider)
print(settings.rag.chunk_size)
print(settings.log_level)
```
Type-Safe Access¶
```python
# All settings are type-checked
settings.rag.chunk_size   # int
settings.llm.temperature  # float
settings.rag.use_reranking  # bool
settings.llm.provider     # Literal["gemini", "openai", "anthropic", "none"]
```
Helper Methods¶
```python
# Check if LLM is configured
if settings.is_llm_configured():
    api_key = settings.get_llm_api_key()
    model = settings.get_llm_model_name()

# Get Path objects
speeches_path = settings.get_speeches_path()
chromadb_path = settings.get_chromadb_path()

# Setup logging
settings.setup_logging()
```
Logging Configuration¶
The project uses src/speech_nlp/config/logging.py for production-ready logging with automatic format detection.
Log Levels¶
- DEBUG: Detailed diagnostic information for troubleshooting
- INFO: Important application events (default, recommended for production)
- WARNING: Unexpected but recoverable situations
- ERROR: Application errors requiring attention
- CRITICAL: System-critical failures
Log Formats¶
Development (Colored)¶
Automatically enabled when ENVIRONMENT=development:
```
2025-11-04 12:34:56 | INFO  | speech_nlp.app          | Application startup complete
2025-11-04 12:34:57 | DEBUG | speech_nlp.services.rag | Performing hybrid search
```
- ANSI colors by level (green=INFO, red=ERROR, etc.)
- Human-readable timestamps
- Module names right-aligned
Production (JSON)¶
Automatically enabled when ENVIRONMENT=production:
```json
{"timestamp": "2025-11-04 12:34:56", "level": "INFO", "name": "speech_nlp.app", "message": "Application startup complete"}
{"timestamp": "2025-11-04 12:34:57", "level": "DEBUG", "name": "speech_nlp.services.rag", "message": "Performing hybrid search"}
```
- Machine-parseable JSON
- Compatible with Azure Application Insights, CloudWatch, ELK stack
- Automatic exception field for errors
Changing Log Settings¶
Edit .env:
```bash
# Log level
LOG_LEVEL="INFO"   # Recommended for production
LOG_LEVEL="DEBUG"  # Verbose for debugging

# Environment (affects format)
ENVIRONMENT="development"  # Colored logs
ENVIRONMENT="production"   # JSON logs
```
The logging system automatically:
- Detects environment and chooses appropriate format
- Suppresses noisy third-party loggers (chromadb, httpx, transformers)
- Configures uvicorn logs
- Filters ChromaDB telemetry errors
For detailed logging documentation, see docs/development/logging.md.
Azure Deployment¶
Azure App Service automatically loads environment variables. Configure them in:
- Azure Portal: App Service → Configuration → Application Settings
- Azure CLI:
```bash
az webapp config appsettings set --name myapp --resource-group mygroup \
  --settings GEMINI_API_KEY="your_key" LOG_LEVEL="INFO"
```
Docker Deployment¶
Using .env file¶
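Pass the whole file to the container at runtime (the image name your-image is a placeholder):

```bash
docker run --env-file .env -p 8000:8000 your-image
```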
Using environment variables¶
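Pass individual variables instead (again, your-image is a placeholder):

```bash
docker run -e GEMINI_API_KEY="your_key" -e LOG_LEVEL="INFO" -p 8000:8000 your-image
```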
Docker Compose¶
```yaml
services:
  api:
    build: .
    environment:
      - GEMINI_API_KEY=${GEMINI_API_KEY}
      - LOG_LEVEL=${LOG_LEVEL:-INFO}
    env_file:
      - .env
    ports:
      - "8000:8000"
```
Validation¶
Pydantic automatically validates configuration:
Example Validation Errors¶
```bash
# Invalid log level
LOG_LEVEL="INVALID"
# ❌ Error: Invalid log level. Must be one of: DEBUG, INFO, WARNING, ERROR, CRITICAL

# Invalid chunk size
CHUNK_SIZE="not_a_number"
# ❌ Error: Input should be a valid integer

# Missing required API key (when LLM enabled)
LLM_ENABLED="true"
GEMINI_API_KEY=""
# ❌ Error: API key appears to be too short
```
Best Practices¶
- Never commit .env - Add it to .gitignore
- Use .env.example - Document all available options
- Validate early - Settings load at startup and fail fast
- Environment-specific - Different configs for dev/prod
- Security - Use Azure Key Vault for sensitive values in production
- Logging - Use appropriate log levels for each environment
Troubleshooting¶
Settings not loading¶
Check:
- .env file exists in the project root (for secrets/overrides)
- A YAML config exists at configs/<ENVIRONMENT>.yaml (or configs/development.yaml by default)
- File encoding is UTF-8
- No syntax errors in .env or YAML files
Invalid configuration¶
Check logs at startup:
```
ERROR: ValidationError: 1 validation error for Settings
Invalid log level. Must be one of: DEBUG, INFO, WARNING, ERROR, CRITICAL
```
API key issues¶
```bash
# Check if API key is set
python -c "from speech_nlp.config.settings import get_settings; print(get_settings().get_llm_api_key())"
```
Migration from Old Code¶
If you were using environment variables directly:
Before:
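A typical direct-access pattern looked something like this (a hypothetical example of the old style, not code from this repository):

```python
# Old style: read environment variables directly, with no validation
import os

api_key = os.environ.get("GEMINI_API_KEY")  # may silently be None
```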
After:
```python
from speech_nlp.config import get_settings

settings = get_settings()
api_key = settings.gemini_api_key  # Type-safe!
```