Lead SRE - Chase UK

LONDON, United Kingdom | Full-time | Posted Jul 1, 2026

At JPMorganChase, we understand that customers seek exceptional value and a seamless experience from a trusted financial institution. That's why we launched Chase UK to transform digital banking with intuitive and enjoyable customer journeys. With a strong foundation of trust established by millions of customers in the US, we have been rapidly expanding our presence in the UK and soon across Europe. We have been building the bank of the future from the ground up, offering you the chance to join us and make a significant impact.

As a Site Reliability Engineer at JPMorganChase within the International Consumer Bank, you will play a crucial role in this initiative, dedicated to delivering an outstanding banking experience to our customers. You will work in a collaborative environment as part of a diverse, inclusive, and geographically distributed team. We are seeking individuals with a curious mindset and a keen interest in new technology. Our engineers are naturally solution-oriented, interested in the financial sector, and focused on addressing our customers' needs. We work in teams focused on improving the reliability, resilience, observability, and operability of customer-facing digital banking services. We build automation, define measurable reliability practices, reduce operational friction, and partner with engineering teams to ensure services are designed, delivered, and operated with reliability in mind.

Job responsibilities

Drive continuous improvement of reliability, monitoring, and alerting for mission-critical microservices.
Reduce operational toil through automation by building reliable infrastructure and tooling that expedites feature development.
Develop meaningful service metrics, user journeys, service-level indicators, service-level objectives, error budgets, dashboards, and actionable alerts.
Engage with development teams throughout the software lifecycle to design for reliability and scale.
Design and implement self-healing and resiliency patterns, including graceful degradation, rate limiting, circuit breakers, and failover strategies.
Partner across engineering, product, and platform teams to promote reliability standards and adoption.
Execute performance testing and capacity planning to proactively identify and remove bottlenecks.
Participate in feature planning to ensure metrics, alerting, logging, automation, resiliency, capacity, and performance needs are built in from the start.
Use approved AI tools to accelerate root-cause analysis, log and trace investigation, runbook drafting, post-incident analysis, test scaffolding, and documentation.
Continuously develop AI skills relevant to the role, including effective prompting, output validation, automation workflows, and safe usage patterns.

Required qualifications, capabilities and skills

Formal training or certification in software engineering concepts and advanced applied experience.
Proven experience as a software engineer, including proficiency in at least one programming language such as Python, Go, or Java.
Demonstrated experience designing, coding, testing, and delivering software in at least one technology stack.
Strong debugging and troubleshooting skills across distributed systems.
Demonstrated experience as a Site Reliability Engineer supporting production services.
Working knowledge of microservice infrastructure components, including service discovery, ingress, networking, and load balancing.
Experience with Kubernetes.
Experience with cloud computing services.
Familiarity with common observability and reliability toolchains such as Grafana, Prometheus, Elasticsearch, Kibana, or Jaeger.
Ability to use AI-assisted engineering tools responsibly, including validating outputs, understanding failure modes, and applying secure handling of sensitive information.

Preferred qualifications, capabilities and skills

Experience with AWS.
Experience building internal tooling for reliability, including command-line tools, automation pipelines, operators, or controllers.
Experience improving developer experience through golden paths, paved roads, templates, or reusable engineering patterns.
Experience applying AI to operational workflows such as alert enrichment, summarisation, runbook generation, or anomaly triage using approved tools and patterns.
