AI Agents in Fraud Operations: Closing the Governance Gap

Most fraud teams are actively piloting AI agents but have not fully deployed them. Three practitioners explain the governance framework that separates the two stages.

A poll at a recent FraudNet practitioner roundtable asked attendees how their organizations are deploying AI agents in fraud and risk operations. Seventy-five percent said they were actively piloting or evaluating them. The remainder were planning to implement within the next 12 to 24 months. Nobody reported a full deployment.

This audience had already answered the question of whether AI agents belong in fraud operations. The question is what separates piloting from deploying, and why almost no one in the room had crossed that line.

Whitney Anderson, CEO and Co-Founder of FraudNet; Martin Naor, CEO and Founder of Bankingly; and Mayra De La Garza, Global Head of Compliance at epay/Skylight, addressed it directly. The gap, in their assessment, is governance.

What agents are built for

Mayra opened with the urgency case:

"You can't afford not to be looking at [AI agents] as a strategy. The fraudsters are. And the bad actors are going to be exploiting this." — Mayra De La Garza

Her argument on what makes agents deployable starts with their scope.

Agents take on the work that consumes analyst attention in fraud and AML operations but does not require investigator judgment: data gathering, processing, initial triage, and organizing inputs into a form that supports decision-making. This work is substantial. In most investigation workflows, it occupies the majority of an analyst's time, yet none of its output needs a human to produce it. Assigning it to agents focuses the investigator's role rather than replacing it.

"[Over time, the people] in the analyst [and] investigator role — they'll become sharper and sharper at making [the] important judgment calls and being more objective, because they'll have more energy and time to focus on [judgment] rather than the data processing [and gathering]." — Mayra De La Garza

The decisions that require cultural context, non-quantifiable factors, and accumulated experience remain with investigators. Those are also the cases that matter most. As agents absorb the processing work, investigators become more focused on judgment, which is where the program's actual quality lives.

The accountability principle

Martin's starting point on governance:

"The one fundamental thing that doesn't change: there is no automated tool that is accountable for anything — and definitely not in a regulated business like ours." — Martin Naor

Keeping accountability with humans doesn't restrict how deeply automation can go. Martin's framing: use agents to surface patterns and information quickly to a human who can answer for them. The deeper automation runs, the more humans need to understand what the agent surfaced and why.

Governance infrastructure enables the expansion of agent scope. Regulators, examiners, and risk committees need a human responsible for explaining why a decision was made. That person needs to understand the agent's output well enough to defend it. Building the infrastructure that makes this possible is what moves a program from pilot to deployment.

The governance model in practice

Mayra's framework for deploying agents maps onto how you'd onboard a new analyst: define the role precisely, keep a reviewer in the loop, give the agent structured feedback, and shift toward performance-based evaluation as accuracy builds.

"You should continue to keep [the] human in the loop to review and give feedback to the agents... You should have a human [who] can give it inputs — [saying,] you did well here, but you need to refine this, or you made the wrong decision because of X, Y, and Z." — Mayra De La Garza

The feedback loop serves two purposes: it improves the agent's accuracy, and it builds the governance record that makes the program auditable. Over time, the nature of oversight shifts from close review toward periodic performance evaluation — the same progression a new analyst follows as their judgment is demonstrated. The governance investment isn't a scaffold to be removed once the technology matures. It's the ongoing mechanism by which accountability is maintained.
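One way to picture the feedback loop as a governance record is the minimal Python sketch below. It is an illustration only, not something the panelists described: the field names, the `review_mode` policy, and the thresholds (200 cases, 95% agreement) are all hypothetical assumptions. Each reviewer verdict is stored alongside the agent's decision, and the share of agreed cases informs whether oversight stays in close review or moves to periodic sampling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ReviewRecord:
    """One reviewer verdict on one agent output (all field names are hypothetical)."""
    case_id: str
    agent_decision: str
    reviewer_decision: str
    reviewer: str
    feedback: str = ""  # free-text note, e.g. why the agent's decision was wrong
    reviewed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def agreed(self) -> bool:
        return self.agent_decision == self.reviewer_decision


def agreement_rate(records: list[ReviewRecord]) -> float:
    """Share of reviewed cases where the reviewer upheld the agent's decision."""
    if not records:
        return 0.0
    return sum(r.agreed for r in records) / len(records)


def review_mode(records: list[ReviewRecord], min_cases: int = 200,
                min_agreement: float = 0.95) -> str:
    """Illustrative policy with assumed thresholds: stay in close review
    until enough reviewed cases show high agreement."""
    if len(records) >= min_cases and agreement_rate(records) >= min_agreement:
        return "periodic_sampling"
    return "close_review"
```

The log itself never goes away in this sketch. Even in periodic sampling, every sampled case still produces a record with a named reviewer, which is the point about governance being an ongoing mechanism rather than a scaffold.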

Martin's illustration of the human-agent handoff comes from a failure his team had to fix in real time. His organization had Russia blacklisted for inbound transaction requests. When a large number of customers traveled to Russia for the World Cup, legitimate transactions began triggering blocks, and customer service calls spiked. The pattern was visible in the data: a sudden cluster of flagged activity from a geography tied to a rule that hadn't anticipated a major sporting event. In a well-governed agent program, that pattern surfaces as an alert to the right person within minutes. A human recognizes what it means, adjusts the rule, and the false positives stop.
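The pattern in that example, a sudden cluster of blocks tied to one rule and one geography, can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not Martin's system: the function name `find_block_spikes`, the one-hour window, the rule identifiers, and the thresholds are all hypothetical. The code only surfaces the spike; deciding what it means and changing the rule stays with a person.

```python
from collections import Counter


def find_block_spikes(recent_blocks, baseline_per_hour, min_count=20, ratio=5.0):
    """Return rule/country pairs whose blocks in the last hour far exceed baseline.

    recent_blocks: iterable of (rule_id, country) tuples, one per blocked
        transaction in the last hour (assumed input shape).
    baseline_per_hour: dict mapping (rule_id, country) to the typical hourly
        block count (assumed to be precomputed elsewhere).
    """
    counts = Counter(recent_blocks)
    spikes = []
    for (rule_id, country), count in counts.items():
        baseline = baseline_per_hour.get((rule_id, country), 0.0)
        # Require both a meaningful volume and a large jump over baseline.
        if count >= min_count and count > ratio * max(baseline, 1.0):
            spikes.append({"rule_id": rule_id, "country": country,
                           "count": count, "baseline": baseline})
    # Largest spikes first, so the reviewer sees the most urgent one on top.
    return sorted(spikes, key=lambda s: s["count"], reverse=True)


# Hypothetical data shaped like the World Cup scenario.
blocks = [("geo_blacklist", "RU")] * 120 + [("velocity", "US")] * 8
baseline = {("geo_blacklist", "RU"): 3.0, ("velocity", "US"): 10.0}
alerts = find_block_spikes(blocks, baseline)
```

Here `alerts` holds a single entry for the `geo_blacklist` rule and `RU`, which would be routed to whoever owns that rule.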

The agent surfaces the pattern. A human reads what it means and acts on it. Neither does the other's job. Martin's summary of how these programs actually function:

"It's not automation versus [human oversight] — it's both."

The compliance frameworks designed for human customers are already being extended to cover autonomous software actors. Mastercard and Santander completed Europe's first end-to-end AI agent payment in 2026. The fraud teams building agent oversight infrastructure are now also positioning for what follows: as AI agents begin transacting on people's behalf at scale, the same governance questions that apply to human customers will need to apply to them as well.

Learn more in Webinar: Why Fraud Rules Fall Short Against AI-Generated Fraud, including how Martin and Mayra have structured oversight in their own programs and the specific questions to resolve before moving from pilot to deployment.
