PROJECT_MODULE // ORACLE_SEARCH manage_search

> FULL-STACK / LLM SYSTEMS

Oracle Search

Full-Stack Web App & LLM Orchestration | 2025 – Present

Users type something like “cheap white-red removal for a Boros aggro Commander deck” and the system iteratively translates that into Scryfall search syntax, fetches candidate cards, scores them with an LLM for both relevance and power-in-context, then decides — again with an LLM, but based on computed statistics rather than vibes — whether to refine and search again. The frontend shows the loop happening live over WebSocket.

Python 3.12 FastAPI React 19 TypeScript PostgreSQL 16 OpenAI API WebSocket Docker Azure Tailwind CSS 4 Vite 6 SQLAlchemy 2 Google OAuth
Problem_Space // Motivation lightbulb

Scryfall’s search syntax is powerful but unforgiving — it expects exact operators like ci<=wr, t:creature, cmc<=3, otag:removal. Most players don’t know it.

Pure LLM generation of a single Scryfall query is also fragile: the model often returns too few, too many, or off-target cards on the first try.

This project explores a middle path: let an LLM iterate. Generate a query, run it, evaluate results with a cheaper model, compute statistics, then let a third LLM call decide — based on the stats and original intent — whether to stop or refine and try again. The architecture is built around that loop, not around a single oracle call.

Deployment_Status // Azure cloud_done

Backend

Azure Container Apps

Single replica, GHCR-hosted prod image

Database

Azure PostgreSQL Flexible Server

Burstable B1ms/B2s tier for beta

Frontend

Azure Static Web Apps

CI-built Vite bundle, envsubst-driven CSP

Status: Closed Beta

Two-user scale. Email allowlist via BETA_WHITELIST_EMAILS. OpenAI key is dashboard-capped.

Core_Algorithm // The_Search_Loop loop

Each user query triggers an agentic loop capped at 5 iterations. The loop is driven by statistics and LLM judgment — not hand-coded thresholds. Every step streams live to the browser.

01 Query Generator

LLM converts natural language + active filters + prior-loop stats into Scryfall syntax. Can call a tag_search tool for fuzzy Scryfall tag lookups.

02 Filter Injection + Scryfall Fetch

Deterministically prepends color/format/CMC/type syntax so the LLM never duplicates constraints. Async, rate-limited, paginated Scryfall call. Cache check separates new from already-scored cards.

03 Card Evaluator

Cheaper LLM scores new cards in parallel batches (semaphore-capped at 30 in-flight calls). Each card gets relevant: bool, power_score: 1–10, and one-line reasoning.

04 Stop / Continue Decision

LLM reads the original query, full EvalStats (relevance %, score distribution, avg CMC, color/type breakdown, cache hits), then emits should_continue and refinement hints for the next generator.

Hard cap: MAX_SEARCH_LOOPS = 5  ·  Output: Top cards ranked by power_score (relevant-only subset)
Engineering_Highlights // What_Makes_It_Interesting star
No Agent Framework code

Three distinct LLM roles wired up with the raw OpenAI Python SDK and Pydantic structured outputs. No LangChain, LangGraph, or PydanticAI — deliberately. The orchestration logic is explicit Python, not a framework abstraction.

Statistical Decision-Making analytics

The stop/continue LLM call is fed a computed EvalStats payload — relevance %, score distribution, average CMC, color/type breakdowns, cache hit count — not just a text summary. Decisions are grounded in numbers.

WebSocket-First UX stream

The frontend never polls. It connects to /ws/search and renders nine live typed events as the loop runs. The protocol already defines feedback, card_vote, and override_decision message types for future human-in-the-loop extensions.

Pydantic → TypeScript Codegen sync_alt

A custom scripts/gen_types.py walks backend/app/schemas/, calls TypeAdapter(T).json_schema() on each Pydantic model, merges to a single JSON Schema doc, and feeds it to json-schema-to-typescript. Freshness enforced by a pre-commit hook and a dedicated CI job that fails on drift.

Refresh Token Rotation key

Each refresh token carries a jti claim whose SHA-256 hash is persisted in Postgres. On every /api/auth/refresh the presented token is verified against the stored hash, the old row is revoked, and a replacement is issued. Logout revokes before clearing the cookie.

Real Postgres in CI Tests science

Test suite uses testcontainers to spin up a real Postgres instance in CI rather than mocking, with function-scoped transaction rollback via SAVEPOINTs for isolation. No mocked SQL in the test surface.

Fuzzy Scryfall Tag Tool search

On backend startup the app fetches Scryfall’s Tagger tag page and parses oracle tag slugs, with a bundled tags.json fallback if the fetch fails. thefuzz token_sort_ratio powers a function-call tool exposed to the query generator LLM so it can ground tags like “removal” to real otag: values.

Strict Production Validation verified_user

With APP_ENV=production the backend refuses to boot if JWT_SECRET is weak, AUTH_COOKIE_SECURE=false, required secrets are missing, or CORS_ALLOW_ORIGINS includes localhost. The Vite frontend has a parallel assertProductionEnv() that throws at app boot on any bad VITE_* value.

Polyglot Monorepo folder_open

just dev brings up Postgres, backend (uvicorn --reload), and frontend (Vite HMR) via Docker Compose. Secrets live outside the repo at $WORKTREE_SECRET_ROOT/<repo>/.env so multiple git worktrees share one env file.

System_Architecture // Overview architecture
  ┌──────────────────────────────┐
  │  React 19 + Vite SPA         │
  │  (Azure Static Web Apps)     │
  └──────────────┬───────────────┘
                 │  HTTPS (REST) + WSS (search stream)
                 ▼
  ┌──────────────────────────────┐
  │  FastAPI backend (1 replica) │
  │  Azure Container Apps        │
  │  ┌────────────────────────┐  │
  │  │  Search Orchestrator   │  │
  │  │   ├─ Query Generator   │──┼──▶ OpenAI (mini-class)
  │  │   ├─ Card Evaluator    │──┼──▶ OpenAI (nano, parallel)
  │  │   ├─ Stats Computation │  │
  │  │   └─ Stop/Continue     │──┼──▶ OpenAI (mini-class)
  │  └────────────────────────┘  │
  │           │   │              │
  │           │   └─ Tag tool ───┼──▶ Scryfall Tagger (startup)
  │           ▼                  │
  │  httpx async ────────────────┼──▶ Scryfall REST API
  │           │                  │
  │  SQLAlchemy async ───────────┼──▶ Postgres (Azure)
  └──────────────────────────────┘

The backend coordinates three distinct LLM calls, rate-limited Scryfall pagination, fuzzy tag matching, and per-user search history behind Google OAuth. Active search state lives in app.state.active_searches — a documented constraint that bounds the beta to one worker.

WebSocket_Protocol // Live_Events wifi

Client → Server

start_searchKick off a new search
cancelAbort an in-flight search
feedbackfuture: human-in-the-loop signal
card_votefuture: per-card relevance vote
override_decisionfuture: force stop/continue

Server → Client

search_started
loop_started
query_generated
scryfall_results
evaluation_progress
evaluation_complete
decision
search_complete
error / cancelled

Auth: HTTP-only refresh cookie — not a URL token

Endpoint: wss://<host>/ws/search

Tech_Stack // Full_Breakdown layers

Backend

  • Python 3.12 / FastAPI (fully async)
  • Uvicorn — single worker in prod
  • PostgreSQL 16 + asyncpg driver
  • SQLAlchemy 2.0 async ORM
  • Alembic — auto-applied on container start
  • pydantic-settings (Pydantic v2)
  • OpenAI SDK — structured outputs + function tools
  • httpx — async, rate-limited Scryfall client
  • Google OAuth 2.0 (google-auth) + JWT (python-jose)
  • thefuzz[speedup] — token_sort_ratio tag matching
  • uv package manager + committed lockfile
  • ruff (lint/format), pyright (standard mode)
  • pytest + pytest-asyncio + testcontainers + httpx-ws

Frontend

  • TypeScript (strict, noUncheckedIndexedAccess, noImplicitOverride)
  • React 19
  • React Router 7
  • Vite 6 build tool
  • Tailwind CSS 4
  • Vitest + @testing-library/react + happy-dom
  • Biome (lint/format)
  • pnpm 9 / Node 22 LTS

Infrastructure & Tooling

  • Docker — multi-stage dev/prod Dockerfiles
  • Docker Compose — one-command local dev
  • Nginx — envsubst CSP + SPA fallback
  • Azure Container Apps — backend runtime
  • Azure PostgreSQL Flexible Server
  • Azure Static Web Apps — frontend
  • GitHub Actions — 7 parallel CI jobs
  • GHCR — container image registry
  • just task runner (~20 targets)
  • pre-commit + Gitleaks (secret scanning)
Implementation_Details // Notable_Constraints settings

Two-Tier Model Strategy

Mini-class model for the low-volume query generator and decision calls; nano model for the high-volume per-card evaluation. Evaluations run in parallel batches, semaphore-capped at 30 in-flight OpenAI calls per search on a single-vCPU host — sized to avoid tail-latency inflation rather than to maximize throughput.

Single-Worker Constraint (Documented and Intentional)

Production Uvicorn runs with --workers 1. Active search state (reconnect/cancel) lives in app.state.active_searches in process memory, so scaling out requires distributed coordination first. One active search per user is enforced at the API level; combined with single-worker, this keeps reconnect and cancel state coherent.

Connection Pool Sizing

db_pool_size=10 + db_max_overflow=5 is sized against the Burstable Postgres tier (workers × (pool + overflow) ≤ max_connections), leaving headroom for Alembic on boot and operator psql sessions.

Build-Time vs Runtime Config

VITE_* values are baked into the JS bundle at build time — the prod image cannot be repointed at a different backend without a rebuild. The Nginx CSP is runtime-templated via envsubst from VITE_API_URL / VITE_WS_URL so the connect-src always matches the deployed backend. CSP and security headers are mirrored in both nginx.conf and staticwebapp.config.json so both hosting paths enforce the same policy.

Input Bounds Before Any LLM Call

Query text is trimmed and capped at 500 chars; color and type lists are bounded by enum size and deduplicated; max_cmc must be non-negative. The backend never spends an OpenAI token on malformed input.

CI_CD_Pipeline // GitHub_Actions rocket_launch

PR Gate (ci.yml)

  • Backend lint — ruff check + ruff format --check
  • Backend typecheck — pyright
  • Backend tests — pytest + real Postgres via testcontainers
  • Frontend lint — biome ci
  • Frontend typecheck — tsc --noEmit
  • Frontend tests — vitest run
  • Generated-types drift check
  • Prod container build verification

7 parallel jobs, path-filtered

Backend Deploy

  • Triggers on successful CI on main
  • Builds prod Dockerfile target
  • Tags with short commit SHA
  • Pushes to GHCR
  • Azure rollout is a manual az containerapp update step due to Entra tenant restrictions on service-principal creation

Frontend Deploy

  • Triggers on successful CI on main
  • Substitutes __API_ORIGIN__ / __WS_ORIGIN__ placeholders
  • Builds with VITE_* secrets from GitHub Secrets
  • Deploys via Azure/static-web-apps-deploy@v1
Database_Schema // Postgres table

users

Google identity, email, display name, avatar URL

user_preferences

Per-user default filters (color, format, max CMC) stored as JSONB

refresh_tokens

SHA-256 hash of JWT jti, expires_at, revoked_at. Enables rotation and server-side revocation.

search_sessions

Query text, filters JSONB, loops completed, total cards evaluated, top cards JSONB, status, timestamps

External_Links // Resources link