AI Agent Governance for Financial Services: The Complete Framework
Banks deploying AI agents face a governance challenge: how to capture the efficiency gains while satisfying regulators, protecting customers, and maintaining accountability. The good news is that existing UK regulatory frameworks—SS1/23 for model risk, Consumer Duty for customer outcomes, SM&CR for personal accountability—already provide comprehensive coverage. There is no regulatory vacuum. The challenge is applying these frameworks correctly.
This guide presents a governance framework for AI agents in regulated financial services, built on two core models: the Tiered Governance Model for proportionate controls and the Three Lines of Defence adapted for AI systems.
Central thesis: Governance designed in from day one enables innovation. Governance retrofitted after deployment fails.
Why Existing Regulation Applies
Some firms wait for AI-specific regulation. This is a mistake. UK regulators have been clear: they regulate firms, not technologies. Existing frameworks apply.
SS1/23: Model Risk Management
The PRA’s Supervisory Statement 1/23 applies to AI agents as models. It requires:
- Governance: Board-level oversight, clear ownership, defined risk appetite
- Validation: Independent testing before deployment
- Monitoring: Ongoing performance and drift detection
- Documentation: Model specification, limitations, intended use
AI agents are models. SS1/23 applies.
Consumer Duty
The FCA’s Consumer Duty requires firms to deliver good outcomes for retail customers. For AI agents, this means:
- Avoid foreseeable harm: Test for bias, errors, and edge cases before deployment
- Support customer understanding: Explain AI involvement when relevant
- Price and value: AI efficiency gains should benefit customers, not just margins
- Consumer support: Escalation paths when AI can’t help
An AI agent that harms customers violates Consumer Duty regardless of whether the harm was intended.
SM&CR: Personal Accountability
The Senior Managers and Certification Regime requires that named individuals be accountable for a firm’s activities. For AI agents:
- Named SMF: Every AI agent must have a named Senior Manager accountable for its outcomes
- Reasonable steps: The SMF must demonstrate reasonable steps to prevent harm
- Evidence: Documentation proving oversight, not just assertions
AI cannot be accountable. A human must be. SM&CR makes this explicit and personal.
The Tiered Governance Model
Not all AI agents are equal. A chatbot answering FAQs poses different risks than an agent making credit decisions. Governance should be proportionate.
Tier 1: High Autonomy, High Impact
Characteristics:
- Autonomous decisions affecting customers or finances
- Significant potential for harm if wrong
- Regulatory or reputational consequences
Examples: Credit decisioning, fraud detection with automatic blocks, investment recommendations
Governance Requirements:
- Board-level reporting (quarterly minimum)
- Independent validation before deployment
- Real-time monitoring with human oversight
- Kill switch with <60 second activation
- Named SMF with documented accountability
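The kill-switch requirement above can be made concrete with a minimal sketch. This is an illustrative assumption, not a prescribed design: the `KillSwitch` class, field names, and the idea of an in-process flag are hypothetical. In production the flag would live in shared, fast storage (a feature-flag service, for example) so activation propagates to every instance within the 60-second target.

```python
import threading
import time

class KillSwitch:
    """Process-wide halt flag checked before every agent action (illustrative)."""

    def __init__(self):
        self._halted = threading.Event()
        self.activated_at = None
        self.activated_by = None
        self.reason = None

    def activate(self, operator: str, reason: str) -> None:
        # Record who pulled the switch and why -- this record is audit evidence.
        self.activated_at = time.time()
        self.activated_by = operator
        self.reason = reason
        self._halted.set()

    def check(self) -> None:
        # Call at the top of every agent action; fail closed once halted.
        if self._halted.is_set():
            raise RuntimeError(f"Agent halted: {self.reason}")

switch = KillSwitch()
switch.check()  # passes while the agent is live
switch.activate("smf-ops", "Drift alert on credit model")
# switch.check() would now raise RuntimeError
```

Note the design choice: `check()` raises rather than returning a boolean, so a caller cannot forget to act on the result.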
Tier 2: Moderate Autonomy or Impact
Characteristics:
- Decisions require human approval, OR
- Impact is moderate, OR
- Established use case with known risks
Examples: Customer service with human escalation, document processing with review, risk scoring for human decision
Governance Requirements:
- Senior management oversight
- Periodic validation (annual minimum)
- Standard monitoring and alerting
- Documented escalation procedures
- Business owner accountability
Tier 3: Low Autonomy, Low Impact
Characteristics:
- Informational only, no decisions
- Minimal customer or financial impact
- Well-understood, stable use case
Examples: Internal knowledge search, document summarisation, meeting notes
Governance Requirements:
- Business owner sign-off
- Self-assessment against standards
- Standard IT controls
- Proportionate monitoring
Tier Assignment Process
Tier assignment should happen early—Week 2 of a project, not Week 12. Criteria:
| Factor | Tier 1 | Tier 2 | Tier 3 |
|---|---|---|---|
| Autonomy | Fully autonomous | Human approval required | Informational only |
| Customer impact | Direct, significant | Indirect or moderate | Minimal |
| Financial impact | >£X per decision | £Y–£X per decision | <£Y per decision |
| Regulatory exposure | High (credit, AML) | Moderate | Low |
| Reversibility | Difficult to reverse | Reversible with effort | Easily reversible |
Early tier assignment drives the right governance intensity from the start.
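The table above can be encoded as a first-pass classification function, so tier assignment happens mechanically in Week 2 rather than by debate in Week 12. This is a sketch under stated assumptions: the function name, the categorical inputs, and the decision order are illustrative, and a real implementation would also encode the firm’s £X/£Y financial thresholds.

```python
def assign_tier(autonomous: bool, customer_impact: str,
                regulatory_exposure: str, reversible: bool) -> int:
    """Return a governance tier (1 = most stringent) from the factors
    in the tier-assignment table. Inputs are illustrative categories:
    customer_impact in {"significant", "moderate", "minimal"},
    regulatory_exposure in {"high", "moderate", "low"}.
    """
    # Tier 1: fully autonomous with significant impact, high regulatory
    # exposure, or decisions that are difficult to reverse.
    if autonomous and (customer_impact == "significant"
                       or regulatory_exposure == "high"
                       or not reversible):
        return 1
    # Tier 3: informational only with minimal impact.
    if not autonomous and customer_impact == "minimal":
        return 3
    # Everything else lands in Tier 2 and gets human review of the call.
    return 2

assign_tier(True, "significant", "high", False)   # credit decisioning -> 1
assign_tier(False, "minimal", "low", True)        # knowledge search -> 3
```

A borderline result should always be escalated to the AI Governance Forum rather than trusted blindly; the function forces the conversation, it doesn’t replace it.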
Three Lines of Defence for AI
The Three Lines model is standard in financial services. Applied to AI:
First Line: Business and Technology
Responsibilities:
- Owns the AI agent and its outcomes
- Implements controls and monitoring
- Provides frontline risk management
- Escalates issues promptly
For AI specifically:
- Defines use case and constraints
- Implements input/output controls
- Monitors performance metrics
- Manages day-to-day operations
Second Line: Risk and Compliance
Responsibilities:
- Sets standards and frameworks
- Provides independent challenge
- Monitors first line effectiveness
- Reports to governance forums
For AI specifically:
- Defines AI risk appetite and policy
- Reviews tier assignments
- Validates control effectiveness
- Monitors regulatory developments
Third Line: Internal Audit
Responsibilities:
- Independent assurance
- Tests control design and operation
- Reports to Audit Committee
For AI specifically:
- Audits AI governance framework effectiveness
- Tests AI-specific controls (kill switches, bias detection)
- Validates documentation completeness
- Assesses regulatory compliance
AI Governance Forum
A cross-functional forum provides oversight across all three lines:
Composition:
- CRO (Chair)
- Business representatives
- Technology leadership
- Risk and Compliance
- Legal and Data Protection
Responsibilities:
- Approves Tier 1 deployments
- Reviews Tier 2 deployments
- Sets AI risk appetite
- Monitors aggregate AI risk
- Escalates to Board
Cadence: Monthly, with emergency convening capability
Evidence by Design
Regulators want evidence, not assertions. “We review AI outputs carefully” is an assertion. Logs showing every review, reviewer, and outcome are evidence.
Documentation Requirements
For SS1/23 compliance, each AI agent needs:
Model Specification:
- Purpose and intended use
- Architecture and LLM providers
- Data inputs and outputs
- Known limitations
Risk Assessment:
- Tier classification with rationale
- Risk and control mapping
- Residual risk acceptance
Validation Report:
- Testing methodology
- Results and findings
- Limitations identified
Monitoring Specification:
- Metrics tracked
- Thresholds and alerts
- Escalation procedures
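A monitoring specification of metrics, thresholds, and alerts can be reduced to a small comparison routine. The metric names and threshold values below are hypothetical examples, not recommended values; the point is that thresholds live in reviewable configuration, not in someone’s head.

```python
def check_thresholds(observed: dict, thresholds: dict) -> list:
    """Compare observed metrics against alert thresholds and return
    the breaches that should trigger the escalation procedure."""
    breaches = []
    for metric, limit in thresholds.items():
        value = observed.get(metric)
        if value is not None and value > limit:
            breaches.append({"metric": metric, "value": value, "limit": limit})
    return breaches

# Illustrative specification -- every number here is an assumption.
thresholds = {"error_rate": 0.02, "p95_latency_ms": 2000, "drift_score": 0.3}
observed = {"error_rate": 0.05, "p95_latency_ms": 1800, "drift_score": 0.1}
check_thresholds(observed, thresholds)  # flags error_rate only
```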
Audit Trail Architecture
Build evidence generation into the system:
Every interaction logged:
- Timestamp
- Customer identifier (pseudonymised)
- Input received
- Processing steps
- LLM calls and responses
- Output delivered
- Latency and cost
This isn’t just compliance—it’s debugging, performance management, and customer service. But compliance requires it.
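The fields above can be captured as one structured, append-only record per interaction. This is a minimal sketch, assuming JSON-lines output and a keyed hash for pseudonymisation; the function name, field names, and `PSEUDONYM_KEY` are all illustrative, and a real deployment would manage the key in a secrets store and write to tamper-evident storage.

```python
import hashlib
import hmac
import json
import time
import uuid

PSEUDONYM_KEY = b"rotate-me"  # hypothetical secret; hold in a KMS in production

def log_interaction(customer_id: str, user_input: str, steps: list,
                    llm_calls: list, output: str,
                    latency_ms: float, cost_usd: float) -> str:
    """Build one audit record covering every field in the list above."""
    record = {
        "event_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        # Keyed hash, never the raw identifier: pseudonymised but linkable.
        "customer": hmac.new(PSEUDONYM_KEY, customer_id.encode(),
                             hashlib.sha256).hexdigest()[:16],
        "input": user_input,
        "processing_steps": steps,
        "llm_calls": llm_calls,
        "output": output,
        "latency_ms": latency_ms,
        "cost_usd": cost_usd,
    }
    # In production, append this line to write-once storage so records
    # cannot be silently altered after the fact.
    return json.dumps(record, sort_keys=True)
```

A keyed hash (HMAC) rather than a plain hash matters here: it lets the firm re-identify a customer for complaints handling while keeping the log itself pseudonymised.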
Control Testing Evidence
Controls must be:
- Defined: Clear specification exists
- Implemented: Actually built and deployed
- Enabled: Switched on in production
- Tested: Verified to work
- Monitored: Continuous operation confirmed
- Auditable: Evidence available on request
Every control needs evidence for each criterion.
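The six criteria above lend themselves to a simple control register: one record per control, one evidence reference per criterion, and an explicit list of gaps. This is a sketch; the `ControlEvidence` class, criterion strings, and evidence references are hypothetical illustrations of the idea, not a real register schema.

```python
from dataclasses import dataclass, field

# The six evidence criteria, in lifecycle order.
CRITERIA = ("defined", "implemented", "enabled",
            "tested", "monitored", "auditable")

@dataclass
class ControlEvidence:
    control_id: str
    evidence: dict = field(default_factory=dict)  # criterion -> evidence ref

    def attach(self, criterion: str, reference: str) -> None:
        if criterion not in CRITERIA:
            raise ValueError(f"Unknown criterion: {criterion}")
        self.evidence[criterion] = reference

    def gaps(self) -> list:
        # Criteria with no evidence attached -- these fail an audit request.
        return [c for c in CRITERIA if c not in self.evidence]

ctrl = ControlEvidence("CTRL-001-kill-switch")
ctrl.attach("defined", "specs/kill-switch-v2.md")   # illustrative reference
ctrl.attach("implemented", "change-ticket-482")      # illustrative reference
ctrl.gaps()  # -> ["enabled", "tested", "monitored", "auditable"]
```

The value of the register is the `gaps()` view: it turns “is this control auditable?” from an opinion into a lookup.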
Human Accountability Is Non-Negotiable
AI cannot be accountable under any current framework. When an AI agent causes harm:
- The firm is liable
- A named individual is accountable
- “The AI did it” is not a defence
Meaningful Human Oversight
Human-in-the-loop must be meaningful, not rubber-stamping:
- Meaningful: Human reviews transaction details, applies judgment, can reject or modify
- Rubber-stamp: Human clicks “approve” on a queue they can’t practically review
Regulators distinguish between these. So should you.
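The distinction can even be enforced in the approval tooling. A hedged sketch, with hypothetical names and an illustrative dwell-time threshold: approval is rejected unless the reviewer opened the full case detail and a minimum review time elapsed. A dwell timer alone doesn’t prove judgment was applied, but it makes pure rubber-stamping detectable in the audit trail.

```python
import time

class ReviewGate:
    """Reject approvals that cannot have been a meaningful review
    (illustrative: names and thresholds are assumptions)."""

    def __init__(self, min_review_seconds: float = 10.0):
        self.min_review_seconds = min_review_seconds
        self._opened = {}  # case_id -> monotonic time detail view was opened

    def open_detail(self, case_id: str) -> None:
        self._opened[case_id] = time.monotonic()

    def approve(self, case_id: str, reviewer: str) -> dict:
        opened = self._opened.get(case_id)
        if opened is None:
            raise PermissionError("Approval without opening case detail")
        dwell = time.monotonic() - opened
        if dwell < self.min_review_seconds:
            raise PermissionError("Review too fast to be meaningful")
        # The returned record is the evidence of oversight, per reviewer.
        return {"case": case_id, "reviewer": reviewer, "dwell_seconds": dwell}
```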
The Reasonable Steps Defence
Under SM&CR, the “reasonable steps” defence requires demonstrating:
- Appropriate governance was in place
- Controls were designed and operated effectively
- Issues were escalated and addressed
- Documentation supports the narrative
This defence requires evidence. Build the evidence generation into the system.
Engage Governance Early
The biggest governance mistake is engaging late. Week 12 discovery that a Tier 1 deployment needs Board approval delays launch by months.
Week 2, Not Week 12
Early engagement means:
- Tier assignment in project planning
- Governance requirements in project scope
- Compliance resource allocation from start
- No surprises at deployment
Governance as Enabler
Counterintuitively, early governance engagement accelerates deployment:
- Requirements are clear from the start
- Documentation is built as you go
- Validation is planned into timeline
- Approval is fast because reviewers are prepared
Governance surprises slow you down. Planned governance speeds you up.
Common Governance Failures
Failure 1: Treating AI as Exempt
“It’s just a chatbot” doesn’t exempt it from Consumer Duty. If it interacts with customers, Consumer Duty applies.
Failure 2: Governance Theatre
Forms without substance. Review meetings that don’t review. Documentation that no one reads. Regulators see through this.
Failure 3: Late Engagement
Discovering governance requirements at deployment. Building evidence after the fact. Explaining to the Board why launch is delayed.
Failure 4: Unclear Accountability
Multiple owners means no owner. “The team” isn’t accountable under SM&CR. A named individual is.
Failure 5: Static Governance
Governance set at deployment and never revisited. LLM providers change. Use cases evolve. Governance must adapt.
When to Seek Expert Help
AI governance in financial services requires regulatory expertise, technical depth, and practical experience. External expertise helps when:
- Starting AI agent deployment: Getting governance right from the start prevents expensive rework
- Preparing for regulatory scrutiny: Supervisors are increasingly focused on AI—be ready
- Scaling AI initiatives: Governance that works for one agent may not work for twenty
- Responding to incidents: AI failures require rapid, appropriate response
I help regulated firms implement AI governance frameworks that satisfy regulators while enabling innovation.
Related Reading
- Defence in Depth for AI Agents - Kill switches, circuit breakers, and control layers
- LLM Provider Risk Management - Third-party risk for AI systems
- AI Agent Safety: The Substrate Pattern - Architectural patterns for AI safety
Dipankar Sarkar is a technology advisor specialising in AI governance for regulated industries. With experience building AI systems at scale and deep knowledge of UK financial services regulation, he helps banks and insurers deploy AI agents that satisfy regulators while delivering business value. Learn more →