
FDA Has No Framework for Agentic Clinical AI. ARPA-H Is About to Create One.

ARPA-H is selecting teams this month to build the first FDA-authorized agentic clinical AI — a system that writes prescriptions, adjusts medications, and acts as a 24/7 cardiovascular care member. FDA has no published guidance for agentic AI validation, no clearance precedent, and no framework that covers autonomous clinical agents.


The FDA has authorized more than 1,350 AI-enabled medical devices. Every one of them is a predictive, fixed-function system: a model that takes defined inputs, returns a recommendation, and relies on a clinician to review and act on that output. That is the regulatory model the FDA built, the validation pathway it has refined over five years, and the framework every healthcare AI team has built its compliance architecture around.

ARPA-H is now selecting teams to build something that breaks every assumption in that model.

The Agentic AI-EnableD CardioVascular CAre TransfOrmation program — ADVOCATE — launched in January 2026 with a single, ambitious goal: develop the first FDA-authorized agentic clinical AI system capable of serving as a 24/7 member of a cardiovascular care team. Not a recommendation engine. Not a diagnostic assistant. An autonomous agent that connects to patient records, adjusts medications, writes prescriptions, schedules appointments, and provides diet and physical therapy guidance — continuously, without a clinician in the loop for each action.

ARPA-H will select innovation teams this month. The full program runs 39 months. FDA review is baked into that timeline.

There is currently no FDA guidance on how to validate an agentic clinical AI. No clearance precedent. No pathway. The agency has approved exactly zero autonomous AI systems capable of independent clinical action. What ADVOCATE is proposing to build, and what it is asking the FDA to authorize, does not fit inside any existing regulatory framework.

For healthcare AI architects, this is the production governance problem that matters most in 2026 — not because ADVOCATE is years away, but because the governance gap it exposes is already present in the clinical AI systems being deployed today.

Regulatory & Compliance Angle

FDA’s January 2026 revised guidance on Clinical Decision Support software took a deliberately permissive stance. For AI tools that provide a single, clinically appropriate recommendation where a clinician can “independently review the basis for the recommendation,” FDA extended enforcement discretion even when the tool uses generative AI. The intent was to accelerate AI-assisted care while keeping humans meaningfully in the loop.

ADVOCATE’s clinical AI agent cannot satisfy this condition. A system that writes prescriptions, adjusts medication dosing, modifies care plans, and schedules follow-up appointments is executing a chain of interdependent clinical decisions. A cardiologist cannot “independently review the basis” for each link in that chain without effectively re-doing the clinical reasoning the agent performed — defeating the purpose of the system and the program’s operational goals.

FDA’s existing SaMD Lifecycle Management guidance, updated in 2025, addresses AI systems that continuously update their internal models through post-market learning. This was already a stretch from traditional 510(k) clearance, which evaluates a fixed algorithm against a defined test set. But the guidance still assumes the system’s outputs are recommendations, not autonomous actions. ADVOCATE’s agent is not recommending that a patient’s beta-blocker dose be adjusted — it is adjusting it.

Prescription writing adds a further legal layer that FDA guidance has never addressed. In the United States, prescribing authority is governed by state medical practice acts and requires licensure, a legal status an AI agent cannot hold. DEA controlled substance regulations impose further constraints and have no AI-specific carve-out. ARPA-H’s program description notes the agent should be able to “write and modify prescriptions,” but neither ARPA-H nor FDA has published guidance on how an authorized clinical AI agent navigates these boundaries. The 39-month program timeline assumes these questions get resolved during the development process. That is an optimistic assumption.

The QMSR — FDA’s Quality Management System Regulation that took effect February 2, 2026, aligning U.S. oversight with ISO 13485:2016 — adds a manufacturing-side quality framework but does not address the fundamental validation problem for autonomous agents. You cannot quality-manage an agent’s behavior the same way you quality-manage a device’s output range.

Model Risk Reality

Traditional SaMD model risk follows a recognizable pattern: characterize the algorithm, define its intended use and user population, build a validation dataset representative of deployment conditions, test against pre-specified performance criteria, and submit. FDA reviewers evaluate whether the claimed performance is credible and whether the intended use is safe. It is a rigorous but fundamentally static assessment of a fixed artifact.

Agentic clinical AI breaks this pattern at every step.

The algorithm is not fixed. ADVOCATE explicitly requires the clinical AI agent to operate continuously — which means the model that passes FDA review in month 39 is not necessarily the model that is operating in month 42 after it has been shaped by post-deployment interactions with real patients. FDA’s adaptive algorithm framework acknowledges this problem for learning-enabled medical devices but has never published a pre-specified change protocol for an agent that is simultaneously acting on patients and learning from those actions.

The validation dataset cannot cover the action space. A fixed AI diagnostic tool operates over a defined input distribution. An autonomous agent operates in an open-ended action space — it can take actions nobody anticipated during validation. ADVOCATE’s supervisory agent requirement acknowledges this: one of the program’s three technical components is a “supervisory agent that ensures clinical AI agents’ consistent safety and effectiveness.” But FDA has never evaluated a meta-governance architecture where one AI system is responsible for monitoring the safety of another AI system. This is precisely the structure the Financial Stability Board just recommended for banking AI — and the FSB notes that banking regulators have no validated methodology for it either.

The intended use boundary is ambiguous. A clinical AI agent that operates 24/7 across a patient population will encounter clinical scenarios outside its training distribution. The model risk question is not just whether the agent performs well on average; it is whether the agent knows when it is out of distribution and whether it reliably escalates to human oversight in those cases. FDA’s current framework has no mechanism for validating this capability in an autonomous agent.
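To make the out-of-distribution question concrete, here is a minimal Python sketch of a hand-off gate. It is not ADVOCATE's design or anything FDA has specified. The feature names, the reference cohort, the z-score test, and the 3.0 threshold are all illustrative assumptions; a real system would need a far richer notion of distribution shift.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class FeatureStats:
    mean: float
    std: float


def fit_stats(reference_rows: list[dict[str, float]]) -> dict[str, FeatureStats]:
    """Summarize each feature of a (hypothetical) validation cohort."""
    names = list(reference_rows[0].keys())
    return {
        name: FeatureStats(
            mean([row[name] for row in reference_rows]),
            stdev([row[name] for row in reference_rows]),
        )
        for name in names
    }


def max_z_score(stats: dict[str, FeatureStats], patient: dict[str, float]) -> float:
    """Largest per-feature distance from the cohort mean, in standard deviations."""
    worst = 0.0
    for name, s in stats.items():
        deviation = abs(patient[name] - s.mean)
        if s.std == 0:
            # A constant feature in the cohort: any deviation is unseen territory.
            z = 0.0 if deviation == 0 else float("inf")
        else:
            z = deviation / s.std
        worst = max(worst, z)
    return worst


def route(stats: dict[str, FeatureStats], patient: dict[str, float],
          z_limit: float = 3.0) -> str:
    """Decide whether the agent may act or must hand off to a clinician.

    z_limit is an assumed threshold, not a validated clinical cut-off.
    """
    if any(name not in patient for name in stats):
        return "escalate_to_clinician"  # missing data is treated as out of scope
    if max_z_score(stats, patient) > z_limit:
        return "escalate_to_clinician"
    return "agent_may_proceed"
```

The point of the sketch is that the hand-off decision is a separate, testable function. That is the capability a validation package would have to demonstrate, independently of how well the agent performs on average.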

Healthcare AI teams building clinical decision support today should read ADVOCATE’s program requirements carefully. Many of the governance problems ADVOCATE is trying to solve exist in scaled-down form in the CDS tools already in production. An AI system that generates a care plan recommendation touching multiple clinical domains is not structurally different from an agent in the way that matters for governance; its output simply passes through more steps, typically a clinician’s sign-off, before it becomes the same clinical action.

The Governance Gap

ARPA-H’s program description is admirably direct about what ADVOCATE requires: an agent that can perform clinical tasks “a cardiologist could do over the phone.” The parallel is instructive. A cardiologist operating over the phone is still a licensed physician, governed by professional standards, medical board oversight, malpractice liability, and the ability to be held accountable as an individual. None of those accountability mechanisms exist for an AI agent.

The FDA’s pathway for ADVOCATE — whatever form it ultimately takes — will have to create accountability structures that don’t currently exist in law, regulation, or clinical governance frameworks. The most likely approach is to treat the health system deploying the agent as the responsible party, following the accountability model used for laboratory-developed tests. Under this model, the health system that deploys the ADVOCATE agent becomes the entity responsible for its clinical performance, its safety monitoring, and its compliance with state prescribing laws. The AI developer provides a platform; the health system assumes the clinical liability.

This is architecturally significant. Health systems deploying clinical AI agents under this accountability model will need clinical AI governance infrastructure that most of them do not have: real-time behavioral monitoring of agent actions, incident reporting workflows, autonomous decision audit trails, and mechanisms to identify and investigate patient harm that may have originated from an AI action in a care chain. This is not a technology problem — it is a governance architecture problem, and it is not solved by the model’s validation package.
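One way to picture an autonomous decision audit trail is an append-only log in which each entry commits to the one before it, so that later tampering is detectable during a harm investigation. The Python sketch below is an illustration under assumptions, not a prescribed or regulator-endorsed format. The field names (agent_id, action, patient_ref, rationale) and the hash-chaining approach are hypothetical choices.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """Stable SHA-256 over the entry body (keys sorted for determinism)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass
class AuditLog:
    entries: list[dict] = field(default_factory=list)

    def append(self, agent_id: str, action: str, patient_ref: str,
               rationale: str, timestamp: str) -> dict:
        """Record one agent action, chained to the previous entry's hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {
            "agent_id": agent_id,
            "action": action,
            "patient_ref": patient_ref,  # an opaque reference, not patient data
            "rationale": rationale,
            "timestamp": timestamp,
            "prev_hash": prev_hash,
        }
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return True only if no entry was altered, removed, or reordered."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

A health system could run verify() as part of incident review. A failed check would not explain what happened, but it would show that the record of agent actions can no longer be trusted as written.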

The vendor side has its own governance gap. ADVOCATE seeks teams that will build an agent system and co-deploy it with partner health systems. The development timeline assumes that the governance architecture — the supervisory agent, the clinical validation framework, the FDA authorization pathway — all get designed and built within 39 months. The history of complex regulatory submissions in medtech suggests this is extraordinarily optimistic. The most sophisticated medical device submissions take four to seven years even when the regulatory pathway is well-established. ADVOCATE is proposing to establish the pathway at the same time as it builds the device.

What typically happens when a new device category encounters this kind of regulatory vacuum is that a wave of early market entrants deploy systems with ambiguous regulatory status, accumulate real-world data, and either succeed commercially before FDA acts or face enforcement action. The difference with an autonomous clinical AI is that “deploying before the framework exists” means deploying a prescription-writing agent on real patients with no regulatory backstop if it gets the clinical decision wrong.

The SuperML Take

ARPA-H’s ADVOCATE is not just a research program. It is a forcing function for the entire healthcare AI governance stack — and for everyone in this space, the most important part of the story is not what ARPA-H is building but what it reveals about what the industry is already deploying without a framework.

Every health system running a clinical AI system that touches diagnosis, treatment planning, medication management, or care coordination is operating in the same regulatory ambiguity that ADVOCATE is trying to resolve. The January 2026 CDS guidance gave the market more room to operate in that ambiguity — you can deploy AI-assisted clinical recommendations as long as a clinician can review the logic. In practice, many deployed systems are closer to the autonomous end of the spectrum than the enforcement discretion guidance intended. The healthcare AI market has been running a governance experiment on real patients, and ADVOCATE is the first formal acknowledgment that the experiment needs a framework.

The 39-month ADVOCATE timeline is almost certainly insufficient to generate the regulatory clarity healthcare AI teams actually need. FDA will need to publish a draft guidance document on agentic clinical AI — one that addresses autonomous action chains, supervisory AI governance, accountability models for prescription-writing agents, and post-market behavioral monitoring requirements. That guidance will take time to develop, time to comment on, and time to finalize. The path from “ARPA-H selects teams in June 2026” to “FDA publishes final guidance on agentic clinical AI” realistically runs to 2029 or 2030, even under optimistic assumptions about regulatory bandwidth and agency prioritization.

What this means practically for healthcare AI architects: the systems you are building today will face the governance framework that ADVOCATE forces FDA to create, and they should be designed with that in mind. Build behavioral audit trails for every autonomous or semi-autonomous clinical action. Document the action boundary between recommendation and execution explicitly. Build supervisory monitoring architecture — not because ADVOCATE requires it, but because the accountability model that emerges from ADVOCATE will almost certainly require it retrospectively. Treat your clinical AI systems as if the governance framework that does not yet exist will apply to them, because it will.
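The boundary between recommendation and execution can be made explicit in code rather than left implicit in workflow. The following Python sketch shows one possible shape for that, assuming a hypothetical allow-list policy in which only named action kinds may be executed without clinician sign-off. The action kinds and the policy itself are invented for illustration and are not drawn from ADVOCATE or any FDA document.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    RECOMMEND = "recommend"  # a clinician must review and act
    EXECUTE = "execute"      # the system acts on its own


@dataclass(frozen=True)
class ProposedAction:
    kind: str    # e.g. "adjust_dose", "schedule_appointment" (hypothetical names)
    mode: Mode
    detail: str


# Assumed policy: the only action kinds the agent may execute unreviewed.
EXECUTABLE_KINDS = frozenset({"schedule_appointment"})


def supervise(action: ProposedAction) -> ProposedAction:
    """Downgrade any execution outside the allow-list to a recommendation."""
    if action.mode is Mode.EXECUTE and action.kind not in EXECUTABLE_KINDS:
        return ProposedAction(action.kind, Mode.RECOMMEND, action.detail)
    return action
```

Keeping the allow-list as data, separate from the model, means the boundary can be documented, reviewed, and tightened without retraining anything.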

For teams considering clinical agentic AI deployments in the next 24 months: the ADVOCATE program announcement is not a green light. It is a very clear signal that FDA is not ready to authorize this class of system, and that the regulatory vacuum preceding authorization will be filled by enforcement actions against early movers who misread FDA’s permissive January 2026 CDS stance as covering autonomous clinical agents. The gap between “recommendation AI with enforcement discretion” and “prescription-writing agent with FDA authorization” is not a narrow one. ARPA-H is going to spend 39 months trying to cross it with federal funding and full agency coordination. Plan accordingly.
