Overview
MTG Card Search Agent is a multi-agent natural-language search system that queries 33k+ Magic: The Gathering cards. Users describe what they want in plain English, and the agent generates Scryfall queries to find matching cards.
This project demonstrates advanced LLM system design, agent orchestration, evaluation strategies, and performance optimization techniques.
Key Features
Natural Language Search
Convert English queries into Scryfall queries. Ask for "red creatures with flying" and get exactly that.
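To make the translation step concrete, here is a minimal sketch of the last stage only: turning an already-parsed request into Scryfall search syntax. The `CardFilter` class and `build_query` function are hypothetical names for illustration; in the project an LLM agent produces the query, and its internal representation may differ. The `c:`, `t:`, and `o:` keywords are standard Scryfall search operators for color, type line, and rules text.

```python
from dataclasses import dataclass, field


@dataclass
class CardFilter:
    """Hypothetical structured form of a parsed user request."""
    colors: list[str] = field(default_factory=list)
    card_types: list[str] = field(default_factory=list)
    oracle_text: list[str] = field(default_factory=list)


def quote(term: str) -> str:
    # Multi-word terms must be quoted to stay a single search term.
    return f'"{term}"' if " " in term else term


def build_query(f: CardFilter) -> str:
    # Space-separated terms are combined with AND by Scryfall.
    parts = [f"c:{c}" for c in f.colors]
    parts += [f"t:{t}" for t in f.card_types]
    parts += [f"o:{quote(o)}" for o in f.oracle_text]
    return " ".join(parts)


query = build_query(
    CardFilter(colors=["red"], card_types=["creature"], oracle_text=["flying"])
)
print(query)  # c:red t:creature o:flying
```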
Iterative Refinement
Up to 5 refinement loops with evaluator feedback. The agent improves results based on relevance scores.
Parallel Evaluation
Relevance scoring runs across multiple cards concurrently, improving throughput.
Smart Caching
Deduplication and score caching skip previously seen cards, avoiding redundant API calls.
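One way to picture this is a small cache keyed by card ID, so a card that reappears in a later search is never scored twice. The sketch below is an assumption about the approach, not the project's code: `ScoreCache`, `fake_score`, and the card dictionaries are all made up for illustration, and `fake_score` stands in for the evaluator call.

```python
from typing import Callable


class ScoreCache:
    """Remembers relevance scores by card ID so each card is scored once."""

    def __init__(self, score_fn: Callable[[dict], int]) -> None:
        self._score_fn = score_fn
        self._scores: dict[str, int] = {}
        self.misses = 0  # number of times score_fn was actually called

    def score_all(self, cards: list[dict]) -> dict[str, int]:
        results: dict[str, int] = {}
        for card in cards:
            card_id = card["id"]
            if card_id not in self._scores:
                # First sighting: pay for one evaluator call and remember it.
                self._scores[card_id] = self._score_fn(card)
                self.misses += 1
            results[card_id] = self._scores[card_id]
        return results


def fake_score(card: dict) -> int:
    # Stand-in for an LLM evaluator call (hypothetical scoring rule).
    return 8 if "Flying" in card["text"] else 3


cache = ScoreCache(fake_score)
first = [
    {"id": "a", "text": "Flying"},
    {"id": "b", "text": "Haste"},
    {"id": "a", "text": "Flying"},  # duplicate within one result set
]
second = [{"id": "b", "text": "Haste"}, {"id": "c", "text": "Flying, haste"}]
cache.score_all(first)
cache.score_all(second)
print(cache.misses)  # 3 evaluator calls for 5 card entries
```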
CLI & Module
Packaged as an installable Python module with a command-line interface for easy use.
Comprehensive Testing
pytest suite covering config, events, orchestrator, models, tools, and agents.
Technical Highlights
Agent Orchestration
Multi-agent system coordinating search, evaluation, and refinement. Uses PydanticAI for structured outputs and orchestration, ensuring reliable communication between agents.
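As an illustration of what a structured output can look like, the sketch below defines a plain Pydantic model for a single card evaluation. The `CardEvaluation` name and its fields are assumptions, not the project's actual schema, and the example uses Pydantic on its own rather than any PydanticAI API. The point is that an out-of-range score is rejected at the boundary instead of propagating between agents.

```python
from pydantic import BaseModel, Field, ValidationError


class CardEvaluation(BaseModel):
    """Hypothetical evaluator output for one card."""

    card_name: str
    # Constrain the score to the 1-10 relevance scale.
    score: int = Field(ge=1, le=10, description="Relevance from 1 to 10")
    reasoning: str


ok = CardEvaluation(
    card_name="Shivan Dragon", score=9, reasoning="Red creature with flying."
)
print(ok.score)  # 9

try:
    CardEvaluation(card_name="Goblin Guide", score=11, reasoning="Out of range.")
except ValidationError:
    print("rejected out-of-range score")
```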
Iterative Refinement Loop
Implements a feedback loop in which an evaluator agent scores results (1–10 relevance) and guides the search agent to refine its queries. The loop runs for up to 5 iterations.
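The control flow of such a loop can be sketched with plain functions in place of the agents. Everything here is illustrative: `refine`, the three callables, and the stub data are hypothetical, and the early-exit threshold `TARGET_SCORE` is an assumption (the project may stop on a different condition). Only the cap of 5 iterations comes from the description above.

```python
from typing import Callable

MAX_ITERATIONS = 5   # cap taken from the description above
TARGET_SCORE = 7.0   # assumed stopping threshold, not from the project


def refine(
    request: str,
    generate_query: Callable[[str, str], str],
    search: Callable[[str], list[str]],
    evaluate: Callable[[str, list[str]], tuple[float, str]],
) -> tuple[str, list[str], int]:
    """Return the best query, its cards, and the number of iterations used."""
    feedback = ""
    best_score, best_query, best_cards = float("-inf"), "", []
    for iteration in range(1, MAX_ITERATIONS + 1):
        query = generate_query(request, feedback)
        cards = search(query)
        score, feedback = evaluate(request, cards)
        if score > best_score:
            best_score, best_query, best_cards = score, query, cards
        if score >= TARGET_SCORE:
            break  # good enough, stop spending iterations
    return best_query, best_cards, iteration


# Stubs standing in for the LLM agents and the Scryfall call.
def generate_query(request: str, feedback: str) -> str:
    return "c:red t:creature o:flying" if feedback else "c:red t:creature"


def search(query: str) -> list[str]:
    return ["Shivan Dragon"] if "o:flying" in query else ["Goblin Guide"]


def evaluate(request: str, cards: list[str]) -> tuple[float, str]:
    if cards == ["Shivan Dragon"]:
        return 9.0, ""
    return 3.0, "results lack flying"


result = refine("red creatures with flying", generate_query, search, evaluate)
print(result)  # ('c:red t:creature o:flying', ['Shivan Dragon'], 2)
```

Tracking the best result separately from the latest one means a late iteration that makes things worse does not overwrite an earlier, better answer.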
Performance Optimization
Parallel evaluation processes multiple cards concurrently. Deduplication and caching skip redundant work, reducing execution time.
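A minimal sketch of concurrent scoring, assuming the evaluator call is asynchronous: `asyncio.gather` runs the per-card coroutines together, and a semaphore bounds how many are in flight. The function names, the `max_concurrency` parameter, and the scoring rule are hypothetical; `asyncio.sleep` simulates the latency of a real evaluator call.

```python
import asyncio


async def score_card(card: str, request: str) -> int:
    # Placeholder for an async LLM evaluator call; sleep simulates latency.
    await asyncio.sleep(0.05)
    return 9 if "Dragon" in card else 4


async def score_cards(
    cards: list[str], request: str, max_concurrency: int = 8
) -> dict[str, int]:
    # Bound concurrency so a large result set does not flood the backend.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(card: str) -> int:
        async with semaphore:
            return await score_card(card, request)

    # gather returns results in the same order as the input cards.
    scores = await asyncio.gather(*(bounded(c) for c in cards))
    return dict(zip(cards, scores))


scores = asyncio.run(
    score_cards(["Shivan Dragon", "Goblin Guide", "Lightning Bolt"], "red flyers")
)
print(scores)  # {'Shivan Dragon': 9, 'Goblin Guide': 4, 'Lightning Bolt': 4}
```

With sequential calls, total time grows with the number of cards; with this pattern it grows roughly with the number of cards divided by the concurrency limit.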
Production Packaging
Structured as an installable module with CLI entry points, making it easy to use and distribute.
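A CLI entry point for a package like this is typically a `main` function that a console script points to. The sketch below uses `argparse` from the standard library; the program name `mtg-search` and the `--max-iterations` flag are hypothetical and are not taken from the project.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="mtg-search",  # hypothetical command name
        description="Search Magic: The Gathering cards in plain English.",
    )
    parser.add_argument(
        "query", help="natural-language description of the cards to find"
    )
    parser.add_argument(
        "--max-iterations",
        type=int,
        default=5,
        help="upper bound on refinement loops",
    )
    return parser


def main(argv: Optional[list[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    # A real entry point would hand args.query to the orchestrator here.
    print(f"query={args.query!r} max_iterations={args.max_iterations}")
    return 0


exit_code = main(["red creatures with flying", "--max-iterations", "3"])
```

Accepting `argv` as a parameter keeps `main` callable from tests without touching `sys.argv`.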
What I Learned
- LLM Agent Design: Building agents that think step-by-step and refine outputs based on feedback.
- Evaluation Metrics: Designing relevance scoring systems to guide agent behavior and measure success.
- Concurrency: Parallelizing evaluation to improve throughput without sacrificing quality.
- Structured Outputs: Using Pydantic for type-safe, validated LLM outputs.
- Testing ML Systems: Writing pytest suites for complex multi-agent systems.