At Sanity.io, we’re building the future of AI-powered Content Operations. Our AI Content Operating System gives teams the freedom to model, create, and automate content the way their business works, accelerating digital development and supercharging content operations efficiency.
It's always peak hour somewhere: with infrastructure and customers spanning every continent, a Sanity SRE makes sure the platform we build is scalable and fast, safe to deploy, and inspiring to use. The scale is real: Content Lake alone handles around 75,000 requests a second, about 4.5m a minute, and companies like SKIMS, Figma, Riot Games, Anthropic, COMPLEX, Nordstrom, Arc’teryx, and Morning Brew run their content operations on it.
Our stack is built on a mix of the tried and tested and the bleeding edge. The core technologies we currently use include Kubernetes, Prometheus, Elasticsearch, PostgreSQL, NATS, Kong, Fastly, and Google Cloud Platform.
The SRE role involves close partnership with our development teams to design and build infrastructure that supports our goal: to be the best platform for authoring, processing, and distributing content worldwide in real time. You will work close to the metal on the security, stability, and performance our customers have come to expect, and help raise the reliability bar as we grow.
What you would do:
Design, build, and operate the shared platform foundations engineers ship on every day: GCP infrastructure, Kubernetes, networking, routing, CI/CD, and observability.
Diagnose and troubleshoot complex distributed systems running at high request volume.
Ensure observability and analyze the behavior of our stack.
Contribute to in-flight work like modernizing our edge, caching, and gateway layers onto Fastly and tightening observability across the platform.
Raise the reliability bar through better dashboards, alert severity, paging standards, on-call readiness, and incident response.
Make deployment boring in the best way: build golden paths, production readiness checks, safe rollouts, and useful automation so engineers have fewer places to look before they ship.
Mentor engineers and raise the technical bar through code review, design review, and pairing.
Participate in our on-call rotation and help our developer on-call rollout land well.
About you:
Based in the United States, with reasonable overlap with European engineering hours.
Experience with SRE/DevOps tools, processes, and culture.
5+ years of experience as part of an SRE on-call rotation.
Analytical approach to designing, diagnosing, and optimizing infrastructure.
Experience with managing scalable, highly available, cloud-based applications, ideally with high request volume and customer-facing uptime expectations.
Experience with Kubernetes for orchestrating, scaling, and managing containerized applications in cloud-based environments.
Experience building CI/CD pipelines.
Experience with an observability stack (Prometheus or similar).
Comfortable working across CDNs, edge, gateways, and caching layers, or eager to go deep there.
You improve on-call and reliability by building systems, standards, and feedback loops that make production healthier over time.
You are comfortable dealing with incidents and outages and have built a practical, thoughtful communication style for handling high-pressure situations.
An open but considered approach to new technologies.
There are many roads leading up to being an SRE. Our team is already a mix of self-taught and formally educated people. Don't self-select out!
What we can offer:
A highly skilled, inspiring, and supportive team
Real infrastructure scale and meaningful, hands-on work changing how it runs
Positive, flexible, and trust-based work environment that encourages long-term professional and personal growth
A global, culturally diverse group of colleagues and customers
Comprehensive health plans and perks
A healthy work-life balance that accommodates individual and family needs
Competitive stock options program and location-based salary
Who we are:
Sanity.io is a modern content operating system that replaces rigid legacy content management systems. We treat content as data, so teams can keep one governed source of truth and adapt it across websites, apps, workflows, and AI agents with less duplicated content work.
Sanity recently raised an $85m Series C led by GP Bullhound and is backed by ICONIQ Growth, Threshold Ventures, Heavybit, Shopify, and founders from Vercel, WP Engine, Twitter, Mux, Netlify, and Heroku.
Sanity is a 200+ person company with committed, ambitious people. We are pioneers, we exist for our customers, we are hel ved, and we love type 2 fun. Read more about our values here.
Sanity.io pledges to be an organization that reflects the globally diverse audience our product serves. We believe that hiring the best talent and bringing together a diversity of perspectives, ideas, and cultures leads to better products and services. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, or gender identity.