AI Governance · Agentic AI · Enterprise · 0→1 · Open Source

Arthur AI

Head of Product

Series B · Enterprise AI · New York, NY · 2024–2026

Led product across Arthur's full agentic infrastructure platform — from an open-source evaluation engine to an enterprise-grade agent governance layer. Built the products that help organizations discover, test, deploy, and govern AI agents in production with confidence.

Technology Partners

Google Cloud · OpenAI · Anthropic · LangChain · OpenTelemetry

Challenge

As AI agents moved from demos to production, enterprises faced a new class of problems: no visibility into what agents were doing, no way to test them reliably before shipping, and no governance layer to enforce policy at scale. The tooling that existed was built for models, not agents — and the gap was growing fast.

Solution

Defined and shipped four interconnected products forming Arthur's agentic development lifecycle: the open-source Engine Evaluation toolkit for real-time model and agent evaluation; the Agent Development Toolkit for prompt versioning, A/B testing, and pre-deployment validation; an Agent Discovery & Governance platform giving enterprises a living inventory and policy control layer across all agents; and the Agentic Development Lifecycle framework that tied all of it together. Launched on Google Cloud Marketplace in January 2026.

Impact

  • Shipped Arthur's full agentic product suite from 0 to GA across four interconnected products
  • Open-sourced the Engine Evaluation toolkit — first real-time AI eval engine in the category
  • Launched Agent Discovery & Governance platform on Google Cloud Marketplace
  • Built prompt versioning, A/B testing, and tracing infrastructure now used by enterprise customers
  • Enabled continuous guardrails for hallucination, PII, prompt injection, and toxicity at inference time

Results

  • Arthur became the go-to AI control plane for enterprises deploying agents at scale
  • GCP Marketplace launch opened distribution to Google Cloud's enterprise customer base
  • Open-source Engine reached thousands of developers within weeks of launch
  • Platform positioned Arthur as the category-defining governance layer for agentic AI
