Many AI initiatives fail because organisations underestimate what it takes to make AI safe, scalable, and accountable.
Sam Bridges-Sparkes is Head of BI Analytics & Strategy at Shawbrook Bank. His work spans business intelligence, analytics, and the responsible adoption of AI within a highly regulated banking environment.
This article looks at what it takes to move AI from experimentation into production inside a regulated bank, and why governance, culture, and people matter as much as models and infrastructure.
Why Proofs of Concept Rarely Survive Contact With Reality
Building an AI proof of concept has never been easier.
Large language models make it possible to produce something impressive in minutes, often without writing a single line of code.
That ease can be misleading. UK financial services firms are still working out how to scale AI responsibly, as the Bank of England and FCA joint report on AI in UK financial services shows. At Shawbrook, early AI work started with a deliberately low-risk internal use case.
The team focused on competitor analysis using public data. This allowed them to explore value without immediate customer impact.
However, even this relatively contained use case exposed how difficult it was to move from experimentation into production.
AI systems did not fit neatly into established delivery pathways designed for deterministic technology.
Testing, governance, and operational ownership all required rethinking.
"You can knock something up very quickly now, but turning that into a productionised, governed, reusable piece of technology is really quite difficult."
Deterministic Expectations vs Probabilistic Systems
Traditional banking systems assume repeatability. The same input should always produce the same output.
Large language models do not always behave this way. Reassuring colleagues was harder when identical prompts could generate different responses.
Binary testing models based on exact matches were no longer sufficient. Instead, Shawbrook had to define acceptable ranges of output quality, aligned with model risk management expectations for banks.
In practice, this involved combining AI-assisted evaluation with human review.
Governance and risk teams needed confidence that this approach was defensible.
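One way to picture the move from exact-match tests to acceptable ranges is the minimal Python sketch below. It is an illustration only, not Shawbrook's implementation: the thresholds, the `evaluate_outputs` function, and the idea of scoring each sampled response between 0.0 and 1.0 with an AI-assisted grader are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List

# Hypothetical thresholds; in practice these would be agreed with risk teams.
PASS_THRESHOLD = 0.8
REVIEW_THRESHOLD = 0.5


@dataclass
class EvalResult:
    mean_score: float
    min_score: float
    verdict: str  # "pass", "human_review", or "fail"


def evaluate_outputs(scores: List[float]) -> EvalResult:
    """Classify quality scores for several responses to the same prompt.

    Each score (0.0 to 1.0) is assumed to come from an AI-assisted grader.
    The verdict is driven by the worst response, so one poor answer is not
    hidden by a good average. The mean is kept for reporting only.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must be between 0.0 and 1.0")

    worst = min(scores)
    if worst >= PASS_THRESHOLD:
        verdict = "pass"
    elif worst >= REVIEW_THRESHOLD:
        verdict = "human_review"  # borderline: route to a person
    else:
        verdict = "fail"
    return EvalResult(mean_score=mean(scores), min_score=worst, verdict=verdict)


if __name__ == "__main__":
    # Three responses to an identical prompt, one of them borderline.
    print(evaluate_outputs([0.9, 0.6, 0.95]).verdict)
```

The middle band is the point of the design: instead of a binary pass or fail, borderline outputs are escalated to human review, which mirrors the combination of AI-assisted evaluation and human judgement described above.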
Human-in-the-Loop Is a Design Choice
Augmentation, Not Automation
AI at Shawbrook is positioned as an accelerator, not a decision-maker.
Internally, it is framed as getting teams "eight rungs up the ladder". The remaining steps require human judgement.
Roles shift from doing work to evaluating outcomes. This framing helped internal stakeholders become more comfortable with adoption.
It also preserves accountability, which remains essential in regulated environments.
"Your role becomes much more of an evaluator rather than a doer."
That evaluator role matters most when AI moves closer to customer-facing work, where audit trails and approval evidence become non-negotiable.
"When AI starts touching customer-facing workflows, auditability stops being an IT detail. Teams need evidence of what was sent, who approved it, and what happened if a decision is challenged later."
Customer-impacting workflows require deeper controls and evaluation.
Guardrails reduce risk, but they do not eliminate the need for ongoing monitoring and reassessment.
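The audit evidence mentioned in this section (what was sent, who approved it, and what happened) could be captured in a record along the lines of the sketch below. The `AuditRecord` fields and helper names are hypothetical, chosen for illustration; a real system would also need secure, tamper-evident storage, which is out of scope here.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    content_sha256: str  # fingerprint of exactly what was sent
    approved_by: str     # identifier of the human approver
    approved_at: str     # ISO 8601 timestamp (UTC)
    outcome: str         # e.g. "sent" or "rejected"


def build_audit_record(
    content: str,
    approved_by: str,
    outcome: str,
    approved_at: Optional[datetime] = None,
) -> AuditRecord:
    """Create an immutable record linking content, approver, and outcome."""
    when = approved_at or datetime.now(timezone.utc)
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return AuditRecord(digest, approved_by, when.isoformat(), outcome)


def content_matches(record: AuditRecord, content: str) -> bool:
    """Check whether the given content is what the record says was approved."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return digest == record.content_sha256


if __name__ == "__main__":
    record = build_audit_record("Dear customer, ...", "reviewer_01", "sent")
    print(record)
    print(content_matches(record, "Dear customer, ..."))
```

Storing a hash rather than the full message is one possible design choice: it lets a team later prove that a disputed message is the one that was approved without duplicating customer data in the audit log. Whether that trade-off is acceptable is a policy decision, not a technical one.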
Why One Model Will Never Be Enough
The Case for Hybrid AI
LLMs excel at reading and writing. They still struggle with complex financial analysis at scale.
Sam has previously described AI as a "shiny new intern": ambitious, and always attempting an answer, even when it is wrong.
Such confident errors pose serious risks in financial contexts.
As a result, Shawbrook expects to rely on a hybrid approach.
Traditional machine learning and deterministic systems remain essential for some workloads. Flexibility also reduces dependency on any single vendor or model.
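As a rough sketch of what a hybrid setup can look like in code, the example below routes each task type to either a deterministic handler or a language model. Everything here is assumed for illustration: the task names, the `sum_amounts` handler, and the `llm` callable, which stands in for whichever model or vendor is used.

```python
from decimal import Decimal
from typing import Callable, Dict

Handler = Callable[[str], str]


def sum_amounts(payload: str) -> str:
    """Deterministic handler: add comma-separated amounts exactly."""
    total = sum(Decimal(part.strip()) for part in payload.split(","))
    return str(total)


def build_router(llm: Handler) -> Dict[str, Handler]:
    """Map task types to handlers. The LLM is injected, so it can be swapped."""
    return {
        "calculate_total": sum_amounts,  # numbers stay with deterministic code
        "summarise": llm,                # reading and writing go to the LLM
    }


def route(task_type: str, payload: str, router: Dict[str, Handler]) -> str:
    """Dispatch a task, refusing task types nobody has registered."""
    handler = router.get(task_type)
    if handler is None:
        raise ValueError(f"no handler registered for task type {task_type!r}")
    return handler(payload)


if __name__ == "__main__":
    # A stand-in for a real model call, used only for this demonstration.
    def fake_llm(text: str) -> str:
        return "summary: " + text[:5]

    router = build_router(fake_llm)
    print(route("calculate_total", "10.10, 0.20", router))
    print(route("summarise", "hello world", router))
```

Passing the model in as a plain callable keeps the routing logic independent of any single vendor, which is one simple way to preserve the flexibility described above.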
Where the Real Near-Term Value Sits
Productivity Without Losing Trust
The strongest near-term gain, for Sam, sits with individuals.
AI helps remove low-value tasks and accelerate everyday work. The aim is automating the mundane to empower the brain.
These efficiency gains could be large enough that some have suggested they will reshape organisational structures.
Rather than reducing teams, AI changes where effort can be applied. Customer-facing roles can remain human-led, supported by AI.
Sam cautions against the hype. In regulated sectors, he suggests adopting a "fast follower" mindset: waiting to see which firms successfully deploy more comprehensive automated decisioning that regulators accept.
FAQs
Why Do AI Proofs of Concept Fail to Scale in Banking?
Because production systems must meet governance, testing, and accountability standards that proofs of concept rarely address.
Can Large Language Models Be Fully Automated in Regulated Environments?
Not safely today, as explainability, consistency, and accountability still require human oversight.
Is AI Replacing Roles in Banking?
The near-term impact is augmentation, improving productivity rather than removing people.
Why Is Hybrid AI Becoming Necessary?
Because different tasks require different approaches, and LLMs are not suited to all financial workloads.
What Is the Biggest Risk of Unmanaged AI Adoption?
Shadow AI usage without visibility, governance, or data controls.
Sam Kendall works on digital marketing at Beyond Encryption, helping build B2B marketing activity around research, first principles, and sustainable growth. He writes about marketing effectiveness, positioning, customer communications, and digital culture, with longer-form work published at ATNL.