AI Job Summary

8+ years in DevOps, Platform Engineering, SRE, or Infrastructure Engineering.
Hands-on experience operating cloud-native platforms primarily within AWS.
Experience leading reliability engineering: HA/DR, capacity planning, observability, and incident management.

Role Type

Permanent • Full-time • Mid-level • Senior

Pay Rate

£70,000 GBP – £75,000 GBP (Annum)

Description

Platform Operations Manager (DevOps & Site Reliability Engineering)

Location: Hybrid – working from our Canary Wharf office 2-3 times per week

Reports to: Director of Engineering

Salary: Up to £75,000

Manages: Platform Operations & DevOps Team (UK & India)

✨Join Us at The Centre for ADHD Research and Excellence: Shaping the Future of Accessible Healthcare✨

At Care ADHD, our mission is to transform ADHD care through innovation, data, and technology — delivering accessible, patient-centred services that improve outcomes for individuals, clinicians, and healthcare providers.

We believe that high-quality data and meaningful insight are essential to improving clinical services, understanding patient journeys, and ensuring that care is delivered efficiently and effectively.

💫The Role

We are looking for an experienced, hands-on Platform Operations Manager to own the reliability, availability, performance, and operational stability of Care ADHD’s technology platforms. This role combines DevOps, Site Reliability Engineering (SRE), cloud infrastructure, platform operations, and technical leadership, ensuring that our systems are securely deployed, highly available, scalable, and operational 24/7/365.

You will lead platform operations across both the UK and India, working closely with engineering, QA, security, and product teams to ensure our infrastructure and deployment capabilities support a fast-moving, high-quality engineering organisation.

This is a highly technical leadership role requiring someone who is equally comfortable defining operational strategy, improving engineering practices, and being hands-on with cloud infrastructure, automation, monitoring, incident response, and reliability engineering.

😎What You’ll Be Doing

Key Responsibilities:

Platform Reliability & Operations

Own the operational health, availability, and reliability of all production and non-production environments
Ensure platforms are monitored, maintained, and operational 24/7/365
Lead platform incident management, root cause analysis, and service recovery processes
Establish and improve operational readiness, resilience, and disaster recovery capabilities
Define and manage SLAs, SLOs, and operational performance metrics
Ensure high levels of platform uptime, stability, scalability, and security

DevOps & Infrastructure Engineering

Design, build, and maintain cloud infrastructure primarily within AWS
Lead infrastructure automation and Infrastructure as Code initiatives using Terraform or AWS CDK
Design and optimise CI/CD pipelines to support efficient, secure, and reliable software delivery
Improve deployment automation, release management, and environment consistency
Support engineering teams with platform tooling, deployment strategies, and operational best practices
Drive improvements in deployment reliability, infrastructure scalability, platform security, cost optimisation, and operational efficiency

Site Reliability Engineering (SRE)

Implement and maintain observability solutions, including monitoring, logging, alerting, and tracing
Develop proactive approaches to incident prevention and operational resilience
Lead reliability engineering practices, including capacity planning, performance monitoring, fault tolerance, and high availability design
Reduce operational toil through automation and self-service tooling
Establish strong incident response and post-incident review processes

Leadership & Team Management

Lead and mentor platform operations and DevOps engineers across the UK and India
Build a collaborative, accountable, and high-performing operational culture
Allocate and coordinate operational resources across projects and platform priorities
Work closely with the Director of Engineering to align platform strategy with product and engineering delivery goals
Collaborate with engineering leads, QA, security, and product teams to support platform and release readiness

Security, Compliance & Governance

Ensure infrastructure and operational processes follow security best practices
Support compliance with GDPR and healthcare-related operational standards
Help implement operational governance, access controls, and infrastructure security policies
Work closely with security and engineering teams to manage vulnerabilities and operational risk

Technology Environment

AWS cloud infrastructure
Kubernetes and containerised services
Serverless platforms (AWS Lambda, API Gateway)
Node.js / TypeScript applications
PostgreSQL and cloud-native databases
Terraform / AWS CDK
CI/CD pipelines and deployment automation
Monitoring and observability tooling
Microservices and event-driven architectures

🚀What We’re Looking For

Experience

8+ years of experience in DevOps, Platform Engineering, SRE, or Infrastructure Engineering
Proven experience leading operational or platform engineering teams
Strong experience managing distributed or offshore technical teams
Experience supporting business-critical production systems with high availability requirements
Experience operating cloud-native platforms in AWS environments

Technical Skills

Strong hands-on experience with:

AWS cloud infrastructure and services
CI/CD pipeline design and automation
Infrastructure as Code (Terraform or AWS CDK)
Kubernetes and container orchestration
Monitoring, logging, and observability platforms
Incident management and operational support
Linux systems administration and networking fundamentals

Strong understanding of:

Site Reliability Engineering principles
High availability and disaster recovery design
Platform scalability and resilience
Security and operational governance
Performance optimisation and capacity planning

Experience with tools such as:

Terraform
GitHub Actions / GitLab CI / Jenkins
CloudWatch
Datadog / Grafana / Prometheus
Docker / Kubernetes
PagerDuty or similar incident management tooling

Leadership Competencies

Operational Leadership

Strong ownership mindset with the ability to lead operational stability and platform reliability across the organisation.

Communication

Excellent communication and stakeholder management skills, particularly across distributed engineering teams.

Problem Solving

Calm and effective under pressure with strong incident management and troubleshooting capabilities.

Collaboration

Works effectively across engineering, product, QA, and security teams to support reliable platform delivery.

🏆What Success Looks Like

Stable, secure, and highly available platforms operating 24/7/365
Reliable and efficient deployment and release processes
Strong monitoring, observability, and incident management practices
Reduced downtime, operational risk, and deployment failures
High-performing platform operations teams across the UK and India
Engineering teams enabled through strong platform tooling and operational support

🙏🏻Why Join Care ADHD

This is an opportunity to play a critical role in building and operating the platform infrastructure behind a growing digital healthcare organisation focused on improving ADHD care and patient outcomes. You’ll have significant influence over platform reliability, operational strategy, engineering enablement, and cloud infrastructure, helping ensure the technology powering our services is secure, scalable, and always available.

🙏🏻What You Can Expect From Us

Competitive salary
Hybrid working – work from our Canary Wharf office 2-3 times per week
25 days annual leave (plus UK public holidays)
Team get-togethers
A paid day off on your birthday
Office equipment when you join
£500 stipend to set up your home office*
Pension contribution
Be part of one of the UK’s most ambitious HealthTech start-ups

🗓️Our Hiring Process

We aim to make our hiring process as streamlined as possible. Successful applicants will have:

Stage 1 – Screening Call with our Talent Acquisition Specialist

Stage 2 – Interview with the Director of Engineering

Stage 3 – Panel Interview

Stage 4 – Offer

🩵Apply with Confidence

Studies show that men apply for roles when they meet around 60% of the qualifications, whereas women and other marginalised groups often apply only if they meet every requirement. If you believe you’re a great fit but don’t meet every single requirement, we encourage you to apply!

At Care ADHD, we’re committed to building a diverse and inclusive environment. We encourage applications from candidates of all backgrounds, especially those from historically marginalised communities, as we work together to create a more equitable future.

Applications will close on the 9th July, or before if high volumes of applications are received.

Company Overview

At Care ADHD, we’re revolutionising private healthcare by making ADHD assessments and treatment more affordable and accessible to those who need them. Our client-centred approach, combined with lean methodology and a focus on continuous improvement, drives our commitment to excellence. We embrace an innovative mindset, encouraging rapid learning and adaptation through our ‘fail fast’ ethos. With ambitious plans to become the largest ADHD service provider outside the NHS within the next five years, we are committed to pushing boundaries and fostering innovation.
