Testing infrastructure for GenAI & AI agents
HeptaFox builds the datasets, MCP servers, and scrape targets you need to test RAG pipelines, agents, and MCP clients — under realistic, reproducible conditions.
Built for evaluating AI systems
Each product targets one part of the AI testing stack. NIAH and the MCP Sandbox are live today; Scrape is on the way.
NIAH
Needle-in-a-haystack datasets for stress-testing RAG retrieval and long-context agents. Generate haystacks, plant needles, and measure what your pipeline actually recalls.
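As a rough illustration of the generate, plant, and measure loop described above, here is a minimal Python sketch. It is not HeptaFox's implementation or API: the filler sentences, the `build_haystack` and `needle_recalled` helpers, and the sample needle are all hypothetical names invented for this example. It assumes a haystack is a list of sentences and that the needle's depth is given as a fraction of the haystack length.

```python
import random

# Hypothetical filler text; a real haystack would use a much larger corpus.
FILLER = [
    "The committee reviewed the quarterly logistics report.",
    "Rainfall in the valley was close to the seasonal average.",
    "The museum extended its opening hours for the summer.",
    "A new footbridge was proposed for the riverside path.",
]


def build_haystack(n_sentences, needle, depth, seed=0):
    """Build a haystack of filler sentences with one needle planted in it.

    depth is relative: 0.0 plants the needle at the start, 1.0 at the end.
    A seeded RNG keeps the haystack identical across runs.
    """
    rng = random.Random(seed)
    sentences = [rng.choice(FILLER) for _ in range(n_sentences)]
    position = round(depth * n_sentences)
    sentences.insert(position, needle)
    return sentences, position


def needle_recalled(answer, expected):
    """Case-insensitive check that the pipeline's answer contains the fact."""
    return expected.lower() in answer.lower()


needle = "The vault access code is 7319."
haystack, position = build_haystack(200, needle, depth=0.5, seed=42)
context = " ".join(haystack)  # text you would feed to the pipeline under test
```

Sweeping `depth` and `n_sentences` while keeping the seed fixed gives a grid of comparable runs, which is one common way to see where recall starts to drop.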
Launch app ↗
Live
MCP Sandbox
A free, zero-auth sandbox for testing any MCP client in under a minute. Connect via a single HTTP endpoint with built-in echo and identity tools — no credentials required.
Launch app ↗
Scrape Target
A purpose-built website for agents to scrape — varied layouts, pagination, and edge cases — so you can benchmark web-browsing and extraction agents against fixed content.
In development
From retrieval to agents to evals
HeptaFox focuses on the unglamorous-but-essential layer: reproducible test data and environments that let you measure GenAI systems instead of guessing.
RAG & retrieval
Controlled haystacks and ground-truth needles to quantify recall, precision, and long-context degradation.
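To make "quantify recall and precision" concrete, the sketch below scores one retrieval run against ground-truth needles. It assumes each chunk has a string ID and that you know which chunk IDs contain needles. The `retrieval_scores` function and the sample IDs are hypothetical and only illustrate the standard set-based definitions.

```python
def retrieval_scores(retrieved_ids, needle_ids):
    """Score one retrieval run against ground-truth needle chunk IDs.

    recall    = needles retrieved / all needles
    precision = needles retrieved / all retrieved chunks
    """
    retrieved = set(retrieved_ids)
    needles = set(needle_ids)
    hits = retrieved & needles
    recall = len(hits) / len(needles) if needles else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return {"recall": recall, "precision": precision}


# Hypothetical run: the retriever returned 4 chunks, 2 of the 3 needles among them.
scores = retrieval_scores(["c3", "c7", "c9", "c12"], ["c7", "c12", "c40"])
```

Running the same scoring over haystacks of increasing length is one simple way to chart long-context degradation.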
Agents & tools
Deterministic MCP servers and scrape targets so agent behavior is testable, not anecdotal.
Evals & testing
Fixed, versioned environments that make regression testing of GenAI pipelines actually possible.