SR 11-7 was published in April 2011. The iPhone 4S was released that October. GPT didn't exist. The "models" regulators were thinking about were logistic regression scorecards, credit risk PD/LGD models, and stress testing frameworks — not systems that generate text, explain their own reasoning, or produce probabilistic outputs with no closed-form derivation.
Fifteen years later, banks are deploying AI for loan origination, transaction monitoring, fraud detection, and customer risk scoring. Regulators are watching. The OCC, Fed, and FDIC have made clear that SR 11-7 applies to AI — and that the absence of AI-specific guidance doesn't exempt banks from model risk management obligations.
The Fed's 2023 AI in Financial Services report explicitly states that AI models used in credit, fraud, and compliance decisions fall within SR 11-7's scope. Examiners are asking for model inventories that include ML systems.
The Three Pillars of SR 11-7
SR 11-7 organises model risk management around three interdependent pillars. Each one creates specific obligations — and each one breaks differently when applied to modern AI systems.
Conceptual Soundness
Validation of the model's design, assumptions, and theoretical basis. For AI: understanding what the model is optimising, and why.
Ongoing Validation
Continuous performance monitoring, benchmarking, and back-testing. For AI: detecting distribution shift, concept drift, and degradation.
Governance & Controls
Ownership, change management, inventory, and accountability. For AI: version control, access controls, and decision attribution.
Pillar 1: Conceptual Soundness for AI
For a regression model, "conceptual soundness" means understanding the mathematical relationship between inputs and outputs — checking coefficients, testing assumptions, stress-testing edge cases. For an AI system, the question is harder: what is this model actually optimising, and does that objective align with the business decision it's supporting?
What examiners will ask
- Can you explain the model's prediction for a specific decision? (Explainability)
- What training data was used, and is there selection bias? (Provenance)
- Is the model's loss function appropriate for the use case? (Objective alignment)
- What are the known failure modes, and what thresholds trigger human review? (Failure modes)
The AI-specific gap
Many AI models — particularly deep learning and large language models — are not fully interpretable. You can explain feature importance using SHAP values or attention weights, but you often cannot write a closed-form explanation of why a specific prediction was produced.
SR 11-7 doesn't require perfect interpretability. It requires that you understand the model well enough to defend it. For AI, that means: documenting what the model does, under what conditions it performs reliably, and where its boundaries are.
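One model-agnostic way to produce that defensible evidence is permutation importance: shuffle one input feature at a time and measure how much a performance metric degrades. It tells you which features the model actually relies on without needing a closed-form explanation. A minimal sketch, using a toy scoring function (the `toy_model`, data, and metric here are illustrative, not from any real system):

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Model-agnostic importance: how much does the metric degrade
    when one feature's values are shuffled?"""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, j])  # break the link between feature j and the target
            drops.append(baseline - metric(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances

# Toy "risk model": the score depends only on feature 0
def toy_model(X):
    return (X[:, 0] > 0.5).astype(float)

rng = np.random.default_rng(1)
X = rng.random((500, 3))
y = (X[:, 0] > 0.5).astype(float)
accuracy = lambda y_true, y_pred: np.mean(y_true == y_pred)

imp = permutation_importance(toy_model, X, y, accuracy)
# imp[0] is large; imp[1] and imp[2] are near zero, matching the model's design
```

The output is exactly the kind of artefact a validator can review: a ranked, reproducible statement of which inputs drive decisions, computed without access to the model's internals.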
For each AI model in your inventory, document: (1) training data lineage, (2) objective function, (3) known limitations and edge cases, (4) explainability method used, and (5) confidence thresholds for automated vs. human-reviewed decisions.
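The five documentation items above can be captured as a structured record rather than free-text in a wiki, which makes the inventory queryable and the decision-routing thresholds enforceable in code. A sketch, with illustrative field names (this is not a regulatory schema):

```python
from dataclasses import dataclass

@dataclass
class ModelRiskRecord:
    """One inventory entry covering the five documentation items.
    Field names and thresholds are illustrative."""
    model_id: str
    training_data_lineage: str      # (1) source datasets, date ranges, known selection biases
    objective_function: str         # (2) what the model optimises, and why that fits the use case
    known_limitations: list[str]    # (3) edge cases and failure modes
    explainability_method: str      # (4) e.g. SHAP, permutation importance
    auto_decision_threshold: float  # (5) scores at or above this are auto-actioned...
    human_review_threshold: float   #     ...scores in between route to an analyst

    def routing(self, score: float) -> str:
        """Apply the documented confidence thresholds to a model score."""
        if score >= self.auto_decision_threshold:
            return "automated"
        if score >= self.human_review_threshold:
            return "human_review"
        return "no_action"

record = ModelRiskRecord(
    model_id="txn-risk-v3",
    training_data_lineage="2021-2024 cleared transactions; underrepresents DeFi flows",
    objective_function="binary cross-entropy on confirmed-SAR labels",
    known_limitations=["novel typologies", "sparse history for new customers"],
    explainability_method="permutation importance",
    auto_decision_threshold=0.9,
    human_review_threshold=0.6,
)
print(record.routing(0.95))  # "automated"
```

Encoding the thresholds this way means the documented control and the running control are the same artefact, which is precisely what an examiner wants to see.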
Pillar 2: Ongoing Validation for AI
SR 11-7 requires validation that is "ongoing" — not a one-time exercise at model launch. For traditional models, this meant periodic back-testing and performance reviews. For AI, it means continuous monitoring, because AI models can drift silently in ways that regression models don't.
Types of drift that matter for compliance AI
| Drift type | What happens | AML / fraud implication |
|---|---|---|
| Data drift | Input distribution changes (e.g. new transaction types post-DeFi expansion) | Model scores become miscalibrated — high-risk entities score normal |
| Concept drift | The relationship between inputs and labels changes (e.g. new laundering typologies) | Model misses novel patterns it was never trained on |
| Label drift | Historical SAR outcomes no longer reflect current regulatory expectations | Model optimised on stale ground truth produces biased outputs |
| Upstream drift | A third-party data feed or embedding model changes behaviour | Silent degradation: your model behaves differently, but you don't know why |
What a defensible monitoring programme looks like
- Population stability index (PSI) on key input features — alert if PSI > 0.2
- Alert rate monitoring — statistically significant changes in alert volume or SAR conversion rates
- Outcome analysis — compare model predictions against confirmed SAR outcomes quarterly
- Shadow model — run a challenger model in parallel to detect performance divergence
- Inference logging — retain input/output pairs with timestamps to enable retrospective investigation
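The PSI check above is straightforward to implement directly. A minimal sketch, binning the baseline (training-time) distribution into deciles and scoring a production window against it; the synthetic distributions here are illustrative:

```python
import numpy as np

def psi(expected, actual, n_bins=10):
    """Population Stability Index between a baseline (training-time)
    feature distribution and a current (production) one."""
    # Bin edges come from the baseline, so both windows use the same grid
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip empty bins to avoid log(0)
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)   # feature at training time
stable   = rng.normal(0.0, 1.0, 10_000)   # same distribution in production
shifted  = rng.normal(0.8, 1.3, 10_000)   # post-deployment distribution shift

print(psi(baseline, stable))   # well under 0.1: no action
print(psi(baseline, shifted))  # over 0.2: breach, escalate per policy
```

The common rule of thumb (PSI under 0.1 stable, 0.1–0.2 watch, over 0.2 investigate) maps directly onto the alert threshold in the bullet above; the escalation path for a breach is what turns this from a dashboard into a control.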
Examiners increasingly expect to see not just that monitoring is happening, but that monitoring results are acted on. A dashboard that shows drift but doesn't trigger remediation is not a control — it's a record of inaction.
Pillar 3: Governance & Controls for AI
SR 11-7's governance requirements — model inventory, ownership, change management, and independent validation — are the clearest place to start when extending MRM to AI. They're also where most banks have the largest gaps.
Model inventory: what AI changes
A traditional model inventory might list 50–100 statistical models with stable versions and infrequent updates. An AI model inventory at a mid-size bank in 2026 might cover:
- Vendor-supplied ML models embedded in core banking platforms
- Fine-tuned LLMs used for document analysis or case summarisation
- Python scripts with embedded ML logic that were never formally registered
- Third-party API calls to AI services (credit risk, sanctions, fraud scoring)
SR 11-7 requires all of these to be inventoried. The operative question is not "did we build it?" but "does it make, inform, or adjust a material decision?"
Version control and change management
AI models change more frequently than statistical models — through retraining, fine-tuning, prompt updates, and upstream model updates. Each change that materially affects decision-making should trigger a change management event: approval, validation, and documentation.
Many banks track model versions in spreadsheets or informal Confluence pages. This is not sufficient for AI. You need: (1) immutable version identifiers for each deployed model, (2) a log of when each version was deployed and by whom, and (3) the ability to reproduce any past decision with the model version that made it.
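Requirement (1) is simplest to satisfy by deriving the version identifier from the model itself: a digest of the serialised artifact plus its training configuration, so any retrain or config change yields a new identifier by construction. A minimal sketch (the field names and in-memory log are illustrative; in production the log would be an append-only store):

```python
import datetime
import hashlib
import json

def model_version_id(artifact_bytes: bytes, training_config: dict) -> str:
    """Immutable version identifier: a digest of the serialised model
    artifact plus its training configuration."""
    h = hashlib.sha256()
    h.update(artifact_bytes)
    h.update(json.dumps(training_config, sort_keys=True).encode())
    return h.hexdigest()[:16]

deployment_log = []  # illustrative; in practice an append-only store

def record_deployment(version_id: str, deployed_by: str) -> None:
    """Requirement (2): who deployed which version, and when."""
    deployment_log.append({
        "version_id": version_id,
        "deployed_by": deployed_by,
        "deployed_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

v1 = model_version_id(b"model-weights-bytes", {"lr": 0.01, "epochs": 20})
v2 = model_version_id(b"model-weights-bytes", {"lr": 0.01, "epochs": 30})
record_deployment(v1, "jsmith")
# A changed training config produces a different identifier,
# which is exactly what should trigger a change management event
print(v1 != v2)  # True
```

Requirement (3) then follows from pairing this registry with inference logging: every logged decision carries the version identifier of the model that made it.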
Independent validation
SR 11-7 requires that model validation be independent from model development. For AI, this creates a skills gap: independent validators need enough ML expertise to evaluate training data quality, objective function appropriateness, and monitoring methodology — not just to run back-tests.
Banks are solving this through three approaches: hiring ML-proficient model validators, engaging third-party model risk firms, or using structured documentation frameworks that force developers to produce validator-ready evidence packages.
Practical Starting Points for 2026
If you're trying to get AI governance examination-ready, here's where to focus first:
- Run an AI model discovery exercise. Identify every AI/ML system that informs a material decision — including vendor systems. Shadow IT and informal Python scripts count.
- Assign ownership. Every model in the inventory needs a named owner and a validation owner. These should be different people.
- Implement inference logging. Before you can validate ongoing performance, you need a tamper-evident record of what the model decided and why. Start here.
- Define drift thresholds. For each model, document what level of performance degradation requires escalation. Make this quantitative.
- Document the governance layer. For each model: who approved it, when it was last validated, what the next review date is, and what change events have occurred since deployment.
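The "tamper-evident" property of an inference log can be achieved with a hash chain: each entry's digest covers the previous entry's digest, so any retroactive edit breaks verification from that point forward. A minimal in-memory sketch (a production system would persist entries and anchor the chain externally):

```python
import hashlib
import json
import time

class InferenceLog:
    """Append-only log where each entry's hash covers the previous
    entry's hash, so retroactive edits are detectable."""

    def __init__(self):
        self.entries = []
        self._prev_hash = "0" * 64  # genesis value

    def append(self, model_version: str, inputs: dict, output) -> None:
        record = {
            "ts": time.time(),
            "model_version": model_version,
            "inputs": inputs,
            "output": output,
            "prev_hash": self._prev_hash,
        }
        record["hash"] = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self._prev_hash = record["hash"]
        self.entries.append(record)

    def verify(self) -> bool:
        """Re-derive every hash; any edited entry breaks the chain."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if e["prev_hash"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = InferenceLog()
log.append("txn-risk-v3", {"amount": 9800, "country": "XX"}, 0.92)
log.append("txn-risk-v3", {"amount": 120, "country": "GB"}, 0.04)
print(log.verify())               # True: chain intact
log.entries[0]["output"] = 0.01   # tampering with a past decision...
print(log.verify())               # False: ...is detectable
```

Note that each entry carries the model version identifier, which is what ties the inference log back to the model registry and makes past decisions reproducible.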
When examiners review AI model risk management, they're looking for evidence that you know what models you're running, you understand their limitations, you monitor them continuously, and you have a clear chain of accountability for every material AI decision. The documentation is the control.
How QLabs Addresses SR 11-7 for AI
The QLabs AI Governance module is designed specifically to fill the gap between what SR 11-7 requires and what most banks have today. It provides:
- Model Registry — version-controlled record of every AI model in production, with deployment history and ownership assignment
- Inference Logging — tamper-evident, queryable logs of every prediction, with inputs, outputs, timestamps, and model version identifiers
- Drift Monitoring — automated alerts on PSI, alert rate, and outcome metric deviation
- Compliance Exports — pre-formatted reports mapping governance evidence to SR 11-7 and OSFI E-23 requirements, ready for examiner review
The module is currently available for design partnerships. If your team is preparing for a Model Risk examination or building out an AI governance programme, we'd welcome a technical conversation.