Open Source · Evaluation · 0→1 · Developer Tools · AI Governance

Arthur Engine

Product & Design Lead

Series B · Open Source · New York, NY · 2024–2026

Built Arthur Engine, a free, open-source toolkit for evaluating AI models, from 0 to 1. Designed to give every developer the same evaluation primitives that enterprises pay for, packaged for the open-source community and built to run anywhere.

Technology Partners

OpenAI · Anthropic · Hugging Face · OpenTelemetry

Challenge

Real-time AI evaluation was locked behind enterprise contracts. Independent developers, researchers, and small teams had no way to run production-grade guardrails on their own models without rolling their own from scratch, and the open-source landscape was a patchwork of one-off scripts, not a coherent toolkit.

Solution

Defined the product, shaped the surface area, and shipped Arthur Engine as an open-source evaluation toolkit. Engine ships with built-in evaluators for hallucination, toxicity, PII, sensitive data, and prompt injection, usable on any model, in any environment, with a few lines of code. Built developer-first: clear docs, low setup cost, and a clean upgrade path to the enterprise platform.

Impact

• Led the 0-to-1 launch of the open-source Evals Engine, cutting deployment time-to-value 90% for early adopters
• First open-source real-time AI evaluation toolkit in the agentic infrastructure category
• Designed the developer surface from scratch: installable in minutes, runs anywhere, framework-agnostic
• Shipped a complete evaluator suite out of the box: hallucination, toxicity, PII, sensitive data, and prompt injection
• Created a clean OSS-to-enterprise path: devs adopt Engine, teams graduate to the Arthur Platform

Results

• 90% reduction in deployment time-to-value for early adopters
• Arthur Engine reached thousands of developers within weeks of launch
• Became Arthur's top of funnel, driving inbound from devs to enterprise pipeline
• Established Arthur as the open-source standard for production AI evaluation

