Arthur Engine
Product & Design Lead
Series B · Open Source · New York, NY · 2024–2026

Built Arthur Engine, a free, open-source toolkit for evaluating AI models, from 0 to 1. Designed it to give every developer the same evaluation primitives that enterprises pay for, packaged for the open-source community and built to run anywhere.
Challenge
Real-time AI evaluation was locked behind enterprise contracts. Independent developers, researchers, and small teams had no way to run production-grade guardrails on their own models without rolling their own from scratch, and the open-source landscape was a patchwork of one-off scripts, not a coherent toolkit.
Solution
Defined the product, shaped the surface area, and shipped Arthur Engine as an open-source evaluation toolkit. Engine ships with built-in evaluators for hallucination, toxicity, PII, sensitive data, and prompt injection, usable on any model, in any environment, with a few lines of code. Built developer-first: clear docs, low setup cost, and a clean upgrade path to the enterprise platform.
Impact
- Led the 0-to-1 launch of the open-source Evals Engine, cutting deployment time-to-value by 90% for early adopters
- First open-source real-time AI evaluation toolkit in the agentic infrastructure category
- Designed the developer surface from scratch: installable in minutes, runnable anywhere, and framework-agnostic
- Shipped a complete evaluator suite out of the box: hallucination, toxicity, PII, sensitive data, and prompt injection
- Created a clean OSS-to-enterprise path: devs adopt Engine, then teams graduate to the Arthur Platform
Results
- 90% reduction in deployment time-to-value for early adopters
- Arthur Engine reached thousands of developers within weeks of launch
- Became Arthur's top of funnel, driving inbound interest from developers into the enterprise pipeline
- Established Arthur as the open-source standard for production AI evaluation
Design

