Janus Launches: Simulation Testing for AI Agents
"Test AI Agents with Human Simulation"
TL;DR: Janus battle-tests your AI agents to surface hallucinations, rule violations, and tool-call/performance failures. They run thousands of AI simulations against your chat/voice agents and offer custom evals for further model improvement.
Founded by Jet Wu & Shivum Pandove
Shivum and Jet left incoming roles at Anduril and IBM, dropped out of Carnegie Mellon ML, and moved to SF to build Janus full-time. They felt this pain first‑hand while building consumer-facing agents: every new model or prompt tweak broke something in prod. They built Janus to give themselves the “crash‑test dummy” they wished had existed from day one.

💸 Why this matters
A single broken AI conversation can mean:
- A PR disaster (Air Canada chatbot inventing refund policies)
- Users churning after one bad reply
- Lawsuits or regulatory fines for poor compliance
Yet most teams still test agents manually by pasting prompts into playgrounds.
🤕 The Problem
Manual QA covers maybe 100 scenarios, while real users trigger millions. Generic testing platforms don’t understand your customers and can’t simulate nuanced back‑and‑forths at scale. This leaves companies without actionable insights and with blind spots that only appear after they ship.
💡 Their Solution
Janus automatically:
- Generates thousands of hyper‑realistic user personas—from angry customers to domain experts—to cover every possible edge case
- Runs full multi‑turn conversations (text or voice) against your agent, APIs, and function calls
- Lets you write natural-language rules describing what to test your agent against and how you’d like it to perform
- Detects hallucinations, bias, tool‑call failures, and risky responses using state-of-the-art LLM‑as‑a‑judge evaluation and black-box uncertainty quantification (UQ) techniques
- Pinpoints root causes and produces actionable recommendations you can plug straight into CI/CD
All in < 10 min.
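To make the workflow above concrete, here is a minimal Python sketch of persona-driven simulation testing. It is not Janus's API or implementation. Every name in it (`Persona`, `Rule`, `Finding`, `toy_agent`, `simulate`) is hypothetical, the personas use scripted turns where a real simulator would generate them with an LLM, and the rule check is a plain string match standing in for an LLM-as-a-judge.

```python
from dataclasses import dataclass
from typing import Callable, List

# Illustrative only: all names below are hypothetical, not the Janus API.

@dataclass
class Persona:
    name: str
    turns: List[str]  # scripted user messages (a real simulator would generate these)

@dataclass
class Rule:
    description: str
    violated: Callable[[str], bool]  # returns True when a reply breaks the rule

@dataclass
class Finding:
    persona: str
    turn: int
    rule: str
    reply: str

def toy_agent(history: List[str], message: str) -> str:
    """Stand-in for the agent under test."""
    if "refund" in message.lower():
        return "Sure, we guarantee a full refund within 90 days."
    return "Thanks for reaching out. How can I help?"

def simulate(agent, personas: List[Persona], rules: List[Rule]) -> List[Finding]:
    """Run each persona's multi-turn conversation and record rule violations."""
    findings = []
    for persona in personas:
        history: List[str] = []
        for i, message in enumerate(persona.turns):
            reply = agent(history, message)
            history.extend([message, reply])
            for rule in rules:
                if rule.violated(reply):
                    findings.append(Finding(persona.name, i, rule.description, reply))
    return findings

personas = [
    Persona("angry customer", ["My order is late!", "I want a refund now."]),
    Persona("domain expert", ["Which API version do you support?"]),
]
rules = [
    # A keyword check standing in for a natural-language rule scored by a judge model.
    Rule("Never promise a refund", lambda reply: "guarantee a full refund" in reply.lower()),
]

findings = simulate(toy_agent, personas, rules)
for f in findings:
    print(f"[{f.persona}] turn {f.turn}: violated '{f.rule}'")
```

Because `simulate` returns a plain list of findings, a CI job could fail the build whenever that list is non-empty, which is one simple way to wire this kind of check into a pipeline.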
Learn More
🌐 Visit www.withjanus.com to learn more.
🤝 Building or piloting an AI agent? Skip manual QA and get started in 15 minutes to see how Janus makes agent eval effortless. Click here to have a chat with the team.
📨 Email the team here.
👣 Follow Janus on LinkedIn & X.