AI Agent Governance for Financial Services: The Complete Framework
Banks deploying AI agents face a governance challenge: how to capture the efficiency gains while satisfying regulators, protecting customers, and maintaining accountability. The good news is that existing UK regulatory frameworks—SS1/23 for model risk, Consumer Duty for customer outcomes, SM&CR for personal accountability—already provide comprehensive coverage. There is no regulatory vacuum. The challenge is applying these frameworks correctly.
This guide presents a governance framework for AI agents in regulated financial services, built on two core models: the Tiered Governance Model for proportionate controls and the Three Lines of Defence adapted for AI systems.
Central thesis: Governance designed in from day one enables innovation. Governance retrofitted after deployment fails.
Why Existing Regulation Applies
Some firms wait for AI-specific regulation. This is a mistake. UK regulators have been clear: they regulate firms, not technologies. Existing frameworks apply.
SS1/23: Model Risk Management
The PRA’s Supervisory Statement 1/23 applies to AI agents as models. It requires:
- Governance: Board-level oversight, clear ownership, defined risk appetite
- Validation: Independent testing before deployment
- Monitoring: Ongoing performance and drift detection
- Documentation: Model specification, limitations, intended use
AI agents are models. SS1/23 applies.
Consumer Duty
The FCA’s Consumer Duty requires firms to deliver good outcomes for retail customers. For AI agents, this means:
- Avoid foreseeable harm: Test for bias, errors, and edge cases before deployment
- Support customer understanding: Explain AI involvement when relevant
- Price and value: AI efficiency gains should benefit customers, not just margins
- Consumer support: Escalation paths when AI can’t help
An AI agent that harms customers violates Consumer Duty regardless of whether the harm was intended.
SM&CR: Personal Accountability
The Senior Managers and Certification Regime requires that named individuals be accountable for a firm’s activities. For AI agents:
- Named SMF: Every AI agent must have a named Senior Manager accountable for its outcomes
- Reasonable steps: The SMF must demonstrate reasonable steps to prevent harm
- Evidence: Documentation proving oversight, not just assertions
AI cannot be accountable. A human must be. SM&CR makes this explicit and personal.
The Tiered Governance Model
Not all AI agents are equal. A chatbot answering FAQs poses different risks than an agent making credit decisions. Governance should be proportionate.
Tier 1: High Autonomy, High Impact
Characteristics:
- Autonomous decisions affecting customers or finances
- Significant potential for harm if wrong
- Regulatory or reputational consequences
Examples: Credit decisioning, fraud detection with automatic blocks, investment recommendations
Governance Requirements:
- Board-level reporting (quarterly minimum)
- Independent validation before deployment
- Real-time monitoring with human oversight
- Kill switch with <60 second activation
- Named SMF with documented accountability
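The kill-switch requirement above can be made concrete with a minimal sketch. This is an illustrative assumption, not a prescribed design: the `KillSwitch` class, field names, and the idea of an in-process flag are hypothetical. In production the flag would live in shared, fast storage (a feature-flag service, for example) so activation propagates to every instance within the 60-second target.

```python
import threading
import time

class KillSwitch:
    """Process-wide halt flag checked before every agent action (illustrative)."""

    def __init__(self):
        self._halted = threading.Event()
        self.activated_at = None
        self.activated_by = None
        self.reason = None

    def activate(self, operator: str, reason: str) -> None:
        # Record who pulled the switch and why -- this record is audit evidence.
        self.activated_at = time.time()
        self.activated_by = operator
        self.reason = reason
        self._halted.set()

    def check(self) -> None:
        # Call at the top of every agent action; fail closed once halted.
        if self._halted.is_set():
            raise RuntimeError(f"Agent halted: {self.reason}")

switch = KillSwitch()
switch.check()  # passes while the agent is live
switch.activate("smf-ops", "Drift alert on credit model")
# switch.check() would now raise RuntimeError
```

Note the design choice: `check()` raises rather than returning a boolean, so a caller cannot forget to act on the result.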
Tier 2: Moderate Autonomy or Impact
Characteristics:
- Decisions require human approval, OR
- Impact is moderate, OR
- Established use case with known risks
Examples: Customer service with human escalation, document processing with review, risk scoring for human decision
Governance Requirements:
- Senior management oversight
- Periodic validation (annual minimum)
- Standard monitoring and alerting
- Documented escalation procedures
- Business owner accountability
Tier 3: Low Autonomy, Low Impact
Characteristics:
- Informational only, no decisions
- Minimal customer or financial impact
- Well-understood, stable use case
Examples: Internal knowledge search, document summarisation, meeting notes
Governance Requirements:
- Business owner sign-off
- Self-assessment against standards
- Standard IT controls
- Proportionate monitoring
Tier Assignment Process
Tier assignment should happen early—Week 2 of a project, not Week 12. Criteria:
| Factor | Tier 1 | Tier 2 | Tier 3 |
|---|---|---|---|
| Autonomy | Fully autonomous | Human approval required | Informational only |
| Customer impact | Direct, significant | Indirect or moderate | Minimal |
| Financial impact | >£X per decision | £Y–£X per decision | <£Y per decision |
| Regulatory exposure | High (credit, AML) | Moderate | Low |
| Reversibility | Difficult to reverse | Reversible with effort | Easily reversible |
Early tier assignment drives the right governance intensity from the start.
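The table above can be encoded as a first-pass classification function, so tier assignment happens mechanically in Week 2 rather than by debate in Week 12. This is a sketch under stated assumptions: the function name, the categorical inputs, and the decision order are illustrative, and a real implementation would also encode the firm’s £X/£Y financial thresholds.

```python
def assign_tier(autonomous: bool, customer_impact: str,
                regulatory_exposure: str, reversible: bool) -> int:
    """Return a governance tier (1 = most stringent) from the factors
    in the tier-assignment table. Inputs are illustrative categories:
    customer_impact in {"significant", "moderate", "minimal"},
    regulatory_exposure in {"high", "moderate", "low"}.
    """
    # Tier 1: fully autonomous with significant impact, high regulatory
    # exposure, or decisions that are difficult to reverse.
    if autonomous and (customer_impact == "significant"
                       or regulatory_exposure == "high"
                       or not reversible):
        return 1
    # Tier 3: informational only with minimal impact.
    if not autonomous and customer_impact == "minimal":
        return 3
    # Everything else lands in Tier 2 and gets human review of the call.
    return 2

assign_tier(True, "significant", "high", False)   # credit decisioning -> 1
assign_tier(False, "minimal", "low", True)        # knowledge search -> 3
```

A borderline result should always be escalated to the AI Governance Forum rather than trusted blindly; the function forces the conversation, it doesn’t replace it.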
Three Lines of Defence for AI
The Three Lines model is standard in financial services. Applied to AI:
First Line: Business and Technology
Responsibilities:
- Owns the AI agent and its outcomes
- Implements controls and monitoring
- Provides frontline risk management
- Escalates issues promptly
For AI specifically:
- Defines use case and constraints
- Implements input/output controls
- Monitors performance metrics
- Manages day-to-day operations
Second Line: Risk and Compliance
Responsibilities:
- Sets standards and frameworks
- Provides independent challenge
- Monitors first line effectiveness
- Reports to governance forums
For AI specifically:
- Defines AI risk appetite and policy
- Reviews tier assignments
- Validates control effectiveness
- Monitors regulatory developments
Third Line: Internal Audit
Responsibilities:
- Independent assurance
- Tests control design and operation
- Reports to Audit Committee
For AI specifically:
- Audits AI governance framework effectiveness
- Tests AI-specific controls (kill switches, bias detection)
- Validates documentation completeness
- Assesses regulatory compliance
AI Governance Forum
A cross-functional forum provides oversight across all three lines:
Composition:
- CRO (Chair)
- Business representatives
- Technology leadership
- Risk and Compliance
- Legal and Data Protection
Responsibilities:
- Approves Tier 1 deployments
- Reviews Tier 2 deployments
- Sets AI risk appetite
- Monitors aggregate AI risk
- Escalates to Board
Cadence: Monthly, with emergency convening capability
Evidence by Design
Regulators want evidence, not assertions. “We review AI outputs carefully” is an assertion. Logs showing every review, reviewer, and outcome are evidence.
Documentation Requirements
For SS1/23 compliance, each AI agent needs:
Model Specification:
- Purpose and intended use
- Architecture and LLM providers
- Data inputs and outputs
- Known limitations
Risk Assessment:
- Tier classification with rationale
- Risk and control mapping
- Residual risk acceptance
Validation Report:
- Testing methodology
- Results and findings
- Limitations identified
Monitoring Specification:
- Metrics tracked
- Thresholds and alerts
- Escalation procedures
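A monitoring specification of metrics, thresholds, and alerts can be reduced to a small comparison routine. The metric names and threshold values below are hypothetical examples, not recommended values; the point is that thresholds live in reviewable configuration, not in someone’s head.

```python
def check_thresholds(observed: dict, thresholds: dict) -> list:
    """Compare observed metrics against alert thresholds and return
    the breaches that should trigger the escalation procedure."""
    breaches = []
    for metric, limit in thresholds.items():
        value = observed.get(metric)
        if value is not None and value > limit:
            breaches.append({"metric": metric, "value": value, "limit": limit})
    return breaches

# Illustrative specification -- every number here is an assumption.
thresholds = {"error_rate": 0.02, "p95_latency_ms": 2000, "drift_score": 0.3}
observed = {"error_rate": 0.05, "p95_latency_ms": 1800, "drift_score": 0.1}
check_thresholds(observed, thresholds)  # flags error_rate only
```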
Audit Trail Architecture
Build evidence generation into the system:
Every interaction logged:
- Timestamp
- Customer identifier (pseudonymised)
- Input received
- Processing steps
- LLM calls and responses
- Output delivered
- Latency and cost
This isn’t just compliance—it’s debugging, performance management, and customer service. But compliance requires it.
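The fields above can be captured as one structured, append-only record per interaction. This is a minimal sketch, assuming JSON-lines output and a keyed hash for pseudonymisation; the function name, field names, and `PSEUDONYM_KEY` are all illustrative, and a real deployment would manage the key in a secrets store and write to tamper-evident storage.

```python
import hashlib
import hmac
import json
import time
import uuid

PSEUDONYM_KEY = b"rotate-me"  # hypothetical secret; hold in a KMS in production

def log_interaction(customer_id: str, user_input: str, steps: list,
                    llm_calls: list, output: str,
                    latency_ms: float, cost_usd: float) -> str:
    """Build one audit record covering every field in the list above."""
    record = {
        "event_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        # Keyed hash, never the raw identifier: pseudonymised but linkable.
        "customer": hmac.new(PSEUDONYM_KEY, customer_id.encode(),
                             hashlib.sha256).hexdigest()[:16],
        "input": user_input,
        "processing_steps": steps,
        "llm_calls": llm_calls,
        "output": output,
        "latency_ms": latency_ms,
        "cost_usd": cost_usd,
    }
    # In production, append this line to write-once storage so records
    # cannot be silently altered after the fact.
    return json.dumps(record, sort_keys=True)
```

A keyed hash (HMAC) rather than a plain hash matters here: it lets the firm re-identify a customer for complaints handling while keeping the log itself pseudonymised.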
Control Testing Evidence
Controls must be:
- Defined: Clear specification exists
- Implemented: Actually built and deployed
- Enabled: Switched on in production
- Tested: Verified to work
- Monitored: Continuous operation confirmed
- Auditable: Evidence available on request
Every control needs evidence for each criterion.
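The six criteria above lend themselves to a simple control register: one record per control, one evidence reference per criterion, and an explicit list of gaps. This is a sketch; the `ControlEvidence` class, criterion strings, and evidence references are hypothetical illustrations of the idea, not a real register schema.

```python
from dataclasses import dataclass, field

# The six evidence criteria, in lifecycle order.
CRITERIA = ("defined", "implemented", "enabled",
            "tested", "monitored", "auditable")

@dataclass
class ControlEvidence:
    control_id: str
    evidence: dict = field(default_factory=dict)  # criterion -> evidence ref

    def attach(self, criterion: str, reference: str) -> None:
        if criterion not in CRITERIA:
            raise ValueError(f"Unknown criterion: {criterion}")
        self.evidence[criterion] = reference

    def gaps(self) -> list:
        # Criteria with no evidence attached -- these fail an audit request.
        return [c for c in CRITERIA if c not in self.evidence]

ctrl = ControlEvidence("CTRL-001-kill-switch")
ctrl.attach("defined", "specs/kill-switch-v2.md")   # illustrative reference
ctrl.attach("implemented", "change-ticket-482")      # illustrative reference
ctrl.gaps()  # -> ["enabled", "tested", "monitored", "auditable"]
```

The value of the register is the `gaps()` view: it turns “is this control auditable?” from an opinion into a lookup.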
Human Accountability Is Non-Negotiable
AI cannot be accountable under any current framework. When an AI agent causes harm:
- The firm is liable
- A named individual is accountable
- “The AI did it” is not a defence
Meaningful Human Oversight
Human-in-the-loop must be meaningful, not rubber-stamping:
- Meaningful: Human reviews transaction details, applies judgment, can reject or modify
- Rubber-stamp: Human clicks “approve” on a queue they can’t practically review
Regulators distinguish between these. So should you.
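The distinction can even be enforced in the approval tooling. A hedged sketch, with hypothetical names and an illustrative dwell-time threshold: approval is rejected unless the reviewer opened the full case detail and a minimum review time elapsed. A dwell timer alone doesn’t prove judgment was applied, but it makes pure rubber-stamping detectable in the audit trail.

```python
import time

class ReviewGate:
    """Reject approvals that cannot have been a meaningful review
    (illustrative: names and thresholds are assumptions)."""

    def __init__(self, min_review_seconds: float = 10.0):
        self.min_review_seconds = min_review_seconds
        self._opened = {}  # case_id -> monotonic time detail view was opened

    def open_detail(self, case_id: str) -> None:
        self._opened[case_id] = time.monotonic()

    def approve(self, case_id: str, reviewer: str) -> dict:
        opened = self._opened.get(case_id)
        if opened is None:
            raise PermissionError("Approval without opening case detail")
        dwell = time.monotonic() - opened
        if dwell < self.min_review_seconds:
            raise PermissionError("Review too fast to be meaningful")
        # The returned record is the evidence of oversight, per reviewer.
        return {"case": case_id, "reviewer": reviewer, "dwell_seconds": dwell}
```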
The Reasonable Steps Defence
Under SM&CR, the “reasonable steps” defence requires demonstrating:
- Appropriate governance was in place
- Controls were designed and operated effectively
- Issues were escalated and addressed
- Documentation supports the narrative
This defence requires evidence. Build the evidence generation into the system.
Engage Governance Early
The biggest governance mistake is engaging late. Week 12 discovery that a Tier 1 deployment needs Board approval delays launch by months.
Week 2, Not Week 12
Early engagement means:
- Tier assignment in project planning
- Governance requirements in project scope
- Compliance resource allocation from start
- No surprises at deployment
Governance as Enabler
Counterintuitively, early governance engagement accelerates deployment:
- Requirements are clear from the start
- Documentation is built as you go
- Validation is planned into timeline
- Approval is fast because reviewers are prepared
Governance surprises slow you down. Planned governance speeds you up.
Common Governance Failures
Failure 1: Treating AI as Exempt
“It’s just a chatbot” doesn’t exempt it from Consumer Duty. If it interacts with customers, Consumer Duty applies.
Failure 2: Governance Theatre
Forms without substance. Review meetings that don’t review. Documentation that no one reads. Regulators see through this.
Failure 3: Late Engagement
Discovering governance requirements at deployment. Building evidence after the fact. Explaining to the Board why launch is delayed.
Failure 4: Unclear Accountability
Multiple owners means no owner. “The team” isn’t accountable under SM&CR. A named individual is.
Failure 5: Static Governance
Governance set at deployment and never revisited. LLM providers change. Use cases evolve. Governance must adapt.
When to Seek Expert Help
AI governance in financial services requires regulatory expertise, technical depth, and practical experience. External expertise helps when:
- Starting AI agent deployment: Getting governance right from the start prevents expensive rework
- Preparing for regulatory scrutiny: Supervisors are increasingly focused on AI—be ready
- Scaling AI initiatives: Governance that works for one agent may not work for twenty
- Responding to incidents: AI failures require rapid, appropriate response
I help regulated firms implement AI governance frameworks that satisfy regulators while enabling innovation.
Related Reading
- Defence in Depth for AI Agents - Kill switches, circuit breakers, and control layers
- LLM Provider Risk Management - Third-party risk for AI systems
- AI Agent Safety: The Substrate Pattern - Architectural patterns for AI safety
Dipankar Sarkar is a technology advisor specialising in AI governance for regulated industries. With experience building AI systems at scale and deep knowledge of UK financial services regulation, he helps banks and insurers deploy AI agents that satisfy regulators while delivering business value. Learn more →