MTG Card Search Agent

Overview

MTG Card Search Agent is a multi-agent, natural-language search system covering more than 33,000 Magic: The Gathering cards. Users describe what they want in plain English, and the agent translates the request into Scryfall queries to find matching cards.

The project demonstrates LLM system design, agent orchestration, evaluation strategies, and performance optimization.

Key Features

Natural Language Search

Converts plain-English requests into Scryfall queries. Ask for "red creatures with flying" and the agent builds a query that returns red creatures with flying.
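As a rough illustration of the translation step, the sketch below shows one plausible Scryfall query for that request and how it could be sent to Scryfall's public card search endpoint. The helper name build_search_url and the exact query string are assumptions for illustration; the agent may produce a different but equivalent query.

```python
from urllib.parse import urlencode

# Scryfall's public card search endpoint.
SCRYFALL_SEARCH_URL = "https://api.scryfall.com/cards/search"


def build_search_url(scryfall_query: str) -> str:
    """Return the search URL for an already-translated Scryfall query.

    Hypothetical helper, not taken from the project.
    """
    return f"{SCRYFALL_SEARCH_URL}?{urlencode({'q': scryfall_query})}"


# One plausible translation of "red creatures with flying":
#   c:red           -> color is red
#   t:creature      -> type line contains "creature"
#   keyword:flying  -> card has the flying keyword ability
query = "c:red t:creature keyword:flying"
url = build_search_url(query)
print(url)
```

The query string is URL-encoded before being sent, so colons and spaces in the Scryfall syntax are safe to pass through.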

Iterative Refinement

Runs up to 5 refinement loops driven by evaluator feedback. The agent revises its query based on relevance scores.

Parallel Evaluation

Relevance scoring runs in parallel across multiple cards, improving throughput.

Smart Caching

Deduplication and score caching skip previously seen cards, avoiding redundant API calls.

CLI & Module

Packaged as an installable Python module with a command-line interface for easy use.

Comprehensive Testing

pytest suite covering config, events, orchestrator, models, tools, and agents.

Technical Highlights

Agent Orchestration

Multi-agent system coordinating search, evaluation, and refinement. Uses PydanticAI for structured outputs and orchestration, ensuring reliable communication between agents.
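To make the idea of structured messages between agents concrete, here is a dependency-free sketch using standard-library dataclasses as a stand-in for the Pydantic models the project would define. The class and field names (SearchQuery, CardScore, scryfall_query, reasoning) are hypothetical, and the validation shown is only an approximation of what Pydantic provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchQuery:
    """What the search agent hands to the search tool (hypothetical shape)."""
    scryfall_query: str
    explanation: str


@dataclass(frozen=True)
class CardScore:
    """What the evaluator agent returns for one card (hypothetical shape)."""
    card_name: str
    score: int
    reasoning: str

    def __post_init__(self) -> None:
        # Reject anything outside the 1-10 relevance scale, so a malformed
        # model response fails loudly instead of skewing the refinement loop.
        if not isinstance(self.score, int) or not 1 <= self.score <= 10:
            raise ValueError(f"score must be an integer from 1 to 10, got {self.score!r}")


example = CardScore(card_name="Shivan Dragon", score=9, reasoning="Red creature with flying")
print(example)
```

Validating at the boundary means each agent can rely on the shape of what it receives rather than re-parsing free text.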

Iterative Refinement Loop

Implements feedback loops where an evaluator agent scores results (1–10 relevance) and guides the search agent to refine queries. Up to 5 iterations improve result quality.
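The loop can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names, the stopping threshold of 7.0, and the use of an average score are all assumptions, and the three callables stand in for the LLM-backed agents and the Scryfall search tool.

```python
from statistics import mean

MAX_ITERATIONS = 5    # limit stated in the description
TARGET_SCORE = 7.0    # assumed threshold on the 1-10 relevance scale


def refine(request, propose_query, search, evaluate):
    """Return (best_query, best_average_score, iterations_used)."""
    feedback = None
    best_query, best_score = None, 0.0
    iterations_used = 0
    for iteration in range(1, MAX_ITERATIONS + 1):
        iterations_used = iteration
        query = propose_query(request, feedback)      # search agent
        cards = search(query)                         # search tool
        scores = [evaluate(request, card) for card in cards]  # evaluator agent
        average = mean(scores) if scores else 0.0
        if average > best_score:
            best_query, best_score = query, average
        if average >= TARGET_SCORE:
            break                                     # good enough, stop early
        feedback = f"query {query!r} averaged {average:.1f}/10; narrow it"
    return best_query, best_score, iterations_used


# Canned stand-ins so the sketch runs without an LLM or network access.
canned_queries = iter([
    "t:creature",
    "c:red t:creature",
    "c:red t:creature keyword:flying",
])
fake_index = {
    "t:creature": ["Grizzly Bears", "Shivan Dragon"],
    "c:red t:creature": ["Goblin Piker", "Shivan Dragon"],
    "c:red t:creature keyword:flying": ["Shivan Dragon"],
}
fake_scores = {"Grizzly Bears": 2, "Goblin Piker": 4, "Shivan Dragon": 9}

result = refine(
    "red creatures with flying",
    propose_query=lambda request, feedback: next(canned_queries),
    search=lambda query: fake_index[query],
    evaluate=lambda request, card: fake_scores[card],
)
print(result)
```

With these canned inputs the loop stops on the third iteration, once the query is narrow enough that the average score clears the assumed threshold.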

Performance Optimization

Parallel evaluation scores multiple cards concurrently, while deduplication and caching skip cards that have already been scored. Together these reduce execution time.
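One way to combine concurrent scoring with deduplication and a score cache is sketched below with asyncio. The class name CachedEvaluator, its methods, and the fake scoring function are hypothetical; the project may use a different concurrency mechanism or cache key.

```python
import asyncio


class CachedEvaluator:
    """Score cards concurrently, scoring each distinct card at most once."""

    def __init__(self, score_card):
        self._score_card = score_card   # async callable: card name -> score
        self._cache = {}                # card name -> cached score
        self.calls = 0                  # number of real scoring calls made

    async def _score_uncached(self, name):
        self.calls += 1
        return name, await self._score_card(name)

    async def score_all(self, card_names):
        unique = list(dict.fromkeys(card_names))          # dedupe, keep order
        missing = [n for n in unique if n not in self._cache]
        # Score every not-yet-seen card concurrently.
        results = await asyncio.gather(*(self._score_uncached(n) for n in missing))
        self._cache.update(results)
        return {n: self._cache[n] for n in unique}


async def fake_score(name):
    """Stand-in for an LLM relevance call; the score itself is arbitrary."""
    await asyncio.sleep(0.01)
    return len(name) % 10 + 1


async def main():
    evaluator = CachedEvaluator(fake_score)
    first = await evaluator.score_all(["Shivan Dragon", "Goblin Piker", "Shivan Dragon"])
    second = await evaluator.score_all(["Shivan Dragon", "Serra Angel"])
    return first, second, evaluator.calls


first, second, calls = asyncio.run(main())
print(calls)
```

In this run the duplicate in the first batch and the repeated card in the second batch never reach the scoring function. A real implementation would likely also cap concurrency to respect API rate limits.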

Production Packaging

Structured as an installable module with CLI entry points, making it easy to use and distribute.
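A CLI entry point for such a module might look like the sketch below. The command name mtg-search and the --max-iterations flag are hypothetical and chosen only for illustration; the project's actual command and options may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser (command and flag names are assumptions)."""
    parser = argparse.ArgumentParser(
        prog="mtg-search",
        description="Search Magic: The Gathering cards in plain English.",
    )
    parser.add_argument("request", help='e.g. "red creatures with flying"')
    parser.add_argument(
        "--max-iterations",
        type=int,
        default=5,
        help="maximum number of refinement loops (default: 5)",
    )
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    # A real entry point would hand the request to the orchestrator here.
    print(f"Searching for {args.request!r} (up to {args.max_iterations} refinement loops)")
    return 0
```

A function like main is typically exposed as a console script through the package metadata, so that installing the module also installs the command.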

What I Learned

LLM Agent Design: Building agents that think step-by-step and refine outputs based on feedback.
Evaluation Metrics: Designing relevance scoring systems to guide agent behavior and measure success.
Concurrency: Evaluating cards in parallel to improve throughput without sacrificing quality.
Structured Outputs: Using Pydantic for type-safe, validated LLM outputs.
Testing ML Systems: Writing pytest suites for complex multi-agent systems.

Links

View on GitHub