Developer Guide¶
Quick reference for setting up, running, and shipping the project. See deployment.md for the full details on CI/CD and cloud hosting.
Prerequisites¶
- Python 3.11+
- uv (package manager)
- Docker (optional, for container workflows)
- A Gemini API key (get one free) or another supported LLM provider key
First-Time Setup¶
# Clone and enter the repo
git clone https://github.com/JustaKris/Trump-Rally-Speeches-NLP-Chatbot.git
cd Trump-Rally-Speeches-NLP-Chatbot
# Create the virtual environment and install all dependencies
uv sync --all-groups
# Activate the venv (Windows PowerShell)
.venv\Scripts\Activate.ps1
Create a .env file in the project root with your API key:
LLM_API_KEY=your_gemini_api_key_here
# Optional overrides
# LLM_PROVIDER=openai
# LLM_MODEL_NAME=gpt-4o-mini
Gemini is the default provider. For OpenAI or Anthropic, also set
LLM_PROVIDERandLLM_MODEL_NAME.
Running Locally¶
# Start the dev server with hot reload
uv run uvicorn speech_nlp.app:app --reload
# Or bind to all interfaces (useful for testing on a local network)
uv run uvicorn speech_nlp.app:app --host 0.0.0.0 --port 8000 --reload
Once running:
| URL | Purpose |
|---|---|
| http://localhost:8000 | Web UI |
| http://localhost:8000/docs | Swagger API docs |
| http://localhost:8000/redoc | ReDoc API docs |
| http://localhost:8000/health | Health check |
Note: First startup takes ~30–60s while ML models load (~2GB: FinBERT, RoBERTa, MPNet).
Running with Docker¶
# Build the image (one-time, ~5-10 min — downloads and bakes in ML models)
docker build -t trump-speeches-nlp-chatbot .
# Run (models are already in the image, so startup is fast)
docker run --rm -it -p 8000:8000 --env-file .env --name nlp-chatbot trump-speeches-nlp-chatbot
# Run in the background
docker run -d -p 8000:8000 --env-file .env --name nlp-chatbot trump-speeches-nlp-chatbot
# Persist ChromaDB data across container runs
docker run --rm -it -p 8000:8000 `
-v "${PWD}/data/chromadb:/app/data/chromadb" `
--env-file .env --name nlp-chatbot trump-speeches-nlp-chatbot
# View logs
docker logs -f nlp-chatbot
# Stop and remove
docker stop nlp-chatbot
docker rm nlp-chatbot
Docker Compose¶
docker-compose up # Foreground
docker-compose up -d # Background
docker-compose down # Stop and remove containers
Pushing to Docker Hub¶
docker login
# Tag
docker tag trump-speeches-nlp-chatbot yourusername/trump-speeches-nlp-chatbot:latest
docker tag trump-speeches-nlp-chatbot yourusername/trump-speeches-nlp-chatbot:v1.0.0
# Push
docker push yourusername/trump-speeches-nlp-chatbot:latest
Pushing to main also triggers the GitHub Actions workflows which build and push automatically — you only need to do this manually if you want to push a specific local build.
Tests & Code Quality¶
# Run tests
uv run pytest
uv run pytest -v --cov=src # With coverage report
# Lint and format
uv run ruff check src/
uv run ruff check src/ --fix # Auto-fix fixable issues
uv run ruff format src/
# Type checking
uv run mypy src/
# Security scan
uv run bandit -r src/ -c pyproject.toml
All of these run automatically in CI on every push. The pipeline requires tests and linting to pass; type checking and security scanning are informational (allowed to fail).
Managing Dependencies¶
uv add requests # Add a package
uv add --group dev pytest-xdist # Add to a dependency group
uv remove requests # Remove a package
uv sync # Sync install to match lock file
uv sync --upgrade # Upgrade all to latest compatible versions
uv lock --upgrade-package fastapi # Upgrade a single package
Configuration¶
Config is loaded in this order (highest priority first):
- Environment variables
.envfileconfigs/<environment>.yaml(default:development.yaml)- Code defaults
Set ENVIRONMENT=staging or ENVIRONMENT=production to switch config files.
Project Structure (Key Paths)¶
src/speech_nlp/
app.py # Application entry point, startup hooks
api/ # FastAPI route handlers
services/
rag/ # RAG orchestrator, search, chunking, confidence, entities
llm/ # Pluggable LLM providers (Gemini, OpenAI, Anthropic)
analysis/ # Sentiment, topics, text processing
config/ # Pydantic settings + structured logging
schemas/ # Request/response models
templates/index.html # Single-page frontend
data/
chromadb/ # Persisted vector store
Donald Trump Rally Speeches/ # Source speech transcripts
configs/ # Environment YAML configs
scripts/ # Utility scripts (model download, migrations)
tests/ # pytest test suite
Useful Links¶
- Full Deployment Guide — CI/CD, Azure, Render setup
- Quickstart — Faster first-run walkthrough
- Troubleshooting (Azure)
- ROADMAP
- Live App