Quick Start Guide

Get the Trump Speeches NLP Chatbot API running locally in minutes.

Prerequisites

  • uv installed (https://docs.astral.sh/uv/)
  • A Gemini API key (free at https://ai.google.dev/)

Setup

  1. Install Dependencies
uv sync

# If you need a specific Python version:
# uv venv --python 3.12
  2. Configure Environment

Create a .env file in the project root:

GEMINI_API_KEY=your_api_key_here

# Optional: Use different LLM provider
# LLM_PROVIDER=openai
# LLM_API_KEY=sk-your_openai_key
# LLM_MODEL_NAME=gpt-4o-mini
  3. Run the API
uv run uvicorn speech_nlp.app:app --host 0.0.0.0 --port 8000 --reload

The API will automatically:

  • Load configuration from .env
  • Initialize logging (colored output in development)
  • Load ML models (FinBERT ~440MB, RoBERTa ~330MB, MPNet ~420MB)
  • Initialize LLM provider (Gemini by default)
  • Load ChromaDB vector database with existing embeddings
  • Start FastAPI server

Expected startup output:

2025-11-04 12:34:56 | INFO     | speech_nlp.app       | Application: Trump Speeches NLP Chatbot API v0.1.0
2025-11-04 12:34:56 | INFO     | speech_nlp.app       | Environment: development
2025-11-04 12:34:56 | INFO     | speech_nlp.app       | ✓ Sentiment analysis model loaded successfully
2025-11-04 12:34:57 | INFO     | speech_nlp.app       | ✓ LLM service initialized and tested successfully
2025-11-04 12:34:58 | INFO     | speech_nlp.app       | ✓ RAG service initialized with 1082 existing chunks
2025-11-04 12:34:58 | INFO     | speech_nlp.app       | Application startup complete
  4. Access the Application

Local Development (Recommended for Testing):

  • Web UI: http://localhost:8000
  • API Docs: http://localhost:8000/docs
  • Health Check: http://localhost:8000/health

Azure Deployment (Live Demo):

  • Web UI: https://trump-speeches-nlp-chatbot.azurewebsites.net
  • API Docs: https://trump-speeches-nlp-chatbot.azurewebsites.net/docs

⚠️ Azure Cold Start Warning:

The Azure deployment uses Free Tier hosting with ~2GB of ML models. Expect:

  • Cold start: 1-5 minutes after inactivity
  • Loading strategy: refresh the page every 30 seconds until it responds
  • AI responses: 30s-2min for complex queries
  • Warmed up: fast (2-5s) once active

For instant responses during development, use the local setup above.
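The "refresh every 30 seconds until successful" advice can be scripted instead of done by hand. A minimal sketch, assuming only that the /health endpoint listed above answers HTTP 200 once the app is up (the response body is not specified here):

```python
import time

import requests


def wait_until_healthy(base_url: str, interval: float = 30.0, attempts: int = 10) -> bool:
    """Poll /health until it returns 200, waiting `interval` seconds between tries."""
    for _ in range(attempts):
        try:
            if requests.get(f"{base_url}/health", timeout=10).status_code == 200:
                return True
        except requests.ConnectionError:
            pass  # server still cold-starting; try again
        time.sleep(interval)
    return False


if __name__ == "__main__":
    url = "https://trump-speeches-nlp-chatbot.azurewebsites.net"
    print("healthy" if wait_until_healthy(url) else "gave up")
```

With the defaults this gives up after about five minutes, which matches the 1-5 minute cold-start window above.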

Running with Docker

Docker containers include all ML models pre-downloaded (~2GB) for fast, consistent startup.

Build and Run

# Build image with models baked in (one-time, ~5-10 min)
docker build -t trump-speeches-nlp-chatbot .

# Run container (starts instantly - models already cached in image)
docker run --rm -it -p 8000:8000 --env-file .env --name nlp-chatbot trump-speeches-nlp-chatbot

Note: The build downloads ~2GB of ML models and includes them in the image. This makes the image larger (~4-5GB) but ensures instant container startup with no runtime downloads.

Docker Compose adds a persistent volume for model caching across rebuilds:

# Start with volume-based caching
docker-compose up

# First run downloads models (~3-4 min)
# Subsequent runs are instant even after rebuilds

The huggingface-cache volume persists models between image updates, so you only download once.
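For reference, a compose file matching this setup might look like the sketch below. The volume name `huggingface-cache` comes from the text above; the service name, build context, and in-container cache path (the Hugging Face default) are assumptions, so check the project's actual docker-compose.yml.

```yaml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    volumes:
      - huggingface-cache:/root/.cache/huggingface

volumes:
  huggingface-cache:
```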

Testing the RAG System

Using the Web Interface

  1. Open http://localhost:8000
  2. Navigate to the "RAG Q&A" tab
  3. Ask a question like "What economic policies were discussed?"
  4. View the AI-generated answer with confidence scores and sources

Using curl

# Ask a question (RAG)
curl -X POST http://localhost:8000/rag/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "What was said about the economy?", "top_k": 5}'

# Semantic search
curl -X POST http://localhost:8000/rag/search \
  -H "Content-Type: application/json" \
  -d '{"query": "immigration policy", "top_k": 5}'

# Get RAG statistics
curl http://localhost:8000/rag/stats

# Sentiment analysis (traditional NLP)
curl -X POST http://localhost:8000/analyze/sentiment \
  -H "Content-Type: application/json" \
  -d '{"text": "The economy is doing great!"}'

Using Python

import requests

# RAG Question Answering
response = requests.post(
    "http://localhost:8000/rag/ask",
    json={
        "question": "What were the main themes in the 2020 speeches?",
        "top_k": 5
    }
)
result = response.json()
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['confidence']} ({result['confidence_score']:.2f})")
print(f"Sources: {', '.join(result['sources'])}")

# Traditional NLP - Sentiment
response = requests.post(
    "http://localhost:8000/analyze/sentiment",
    json={"text": "This is incredible! Best economy ever."}
)
print(response.json())
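The /rag/search endpoint can be called from Python the same way. A small sketch wrapping the request shown in the curl example above (the `query` and `top_k` fields come from that example; everything else here is illustrative):

```python
import requests


def build_search_payload(query: str, top_k: int = 5) -> dict:
    """Request body for /rag/search, mirroring the curl example."""
    return {"query": query, "top_k": top_k}


def semantic_search(base_url: str, query: str, top_k: int = 5) -> dict:
    """POST a semantic search query and return the parsed JSON response."""
    resp = requests.post(
        f"{base_url}/rag/search",
        json=build_search_payload(query, top_k),
        timeout=120,  # generous: complex queries can take a while
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(semantic_search("http://localhost:8000", "immigration policy"))
```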

Troubleshooting

"RAG service not initialized"

The API auto-indexes documents on first startup. This takes ~30-60 seconds. Check the logs for progress:

INFO:     Loading documents into RAG service...
INFO:     Loaded 35 documents into RAG service!

Gemini API Errors

Ensure your .env file exists with a valid GEMINI_API_KEY. Get a free key at https://ai.google.dev/.

Model Download Taking Long

First run downloads ~2GB of models (FinBERT, RoBERTa, MPNet embeddings). Subsequent runs are fast.
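If the first-run download is slow, the models can be fetched ahead of time into the Hugging Face cache. A sketch using `huggingface_hub`; the three model IDs below are guesses inferred from the model names in this guide (FinBERT, RoBERTa, MPNet), so treat the project's own configuration as the source of truth:

```python
# Hypothetical model IDs; verify against the project's config before relying on them.
MODELS = [
    "ProsusAI/finbert",                                  # FinBERT sentiment
    "cardiffnlp/twitter-roberta-base-sentiment-latest",  # RoBERTa sentiment
    "sentence-transformers/all-mpnet-base-v2",           # MPNet embeddings
]


def preload(models: list[str]) -> None:
    """Download each model into the local Hugging Face cache (~2GB total)."""
    from huggingface_hub import snapshot_download  # deferred: heavy optional dependency

    for model_id in models:
        snapshot_download(model_id)


if __name__ == "__main__":
    preload(MODELS)
```

Once cached, both local runs and Docker volume mounts reuse the same files.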

Switching LLM Providers

See the FAQ for instructions on using OpenAI or Anthropic instead of Gemini.

Port Already in Use

uv run uvicorn speech_nlp.app:app --reload --port 8001

Module Not Found

Ensure you're in the project root directory and have run uv sync.

Next Steps