Reliability Engineer, Supercomputing
San Francisco
Network Engineer, Supercomputing
San Francisco
Network Engineer role at an AI/ML research lab, focused on large-scale GPU cluster networking, RDMA/RoCE fabric debugging, NVLink management, and infrastructure instrumentation. Full-time, on-site position in San Francisco requiring backend engineering expertise in Python or Rust.
Reception & Workplace Experience Coordinator
San Francisco, CA
Assistant Controller
San Francisco, CA
Associate General Counsel, Corporate & Commercial
San Francisco
Site Reliability Engineer (SRE)
San Francisco
Thinking Machines Lab is hiring a Site Reliability Engineer for Tinker, its fine-tuning API platform that lets researchers customize frontier AI models. You'll own end-to-end reliability across the distributed training infrastructure, including CI/CD pipelines, production observability, incident response, and multi-tenant resource isolation for GPU workloads. The role requires a CS degree or equivalent, proven experience with distributed systems and cloud infrastructure, strong software engineering skills for building reliability tooling, and a track record of production incident management. Preferred qualifications include operating large-scale cloud services, experience with distributed training frameworks, and Kubernetes expertise at scale. Based in San Francisco, the position offers $350k–$475k annual compensation plus equity, generous benefits, unlimited PTO, and visa sponsorship. This is a hands-on infrastructure role supporting a rapidly scaling platform with novel use cases in AI model fine-tuning.
Research Engineer, Tinker, Developer Experience
San Francisco
Software Engineer, Platform, Tinker
San Francisco
Thinking Machines Lab (whose team includes people who helped build ChatGPT, Character.ai, Mistral, and PyTorch) is hiring a Platform Software Engineer to own core infrastructure systems for Tinker, its fine-tuning API that lets researchers customize frontier AI models. You'll design the authorization layer, build billing and usage metering end-to-end, manage organizations, teams, and SSO, implement compliance pipelines, and own audit logging. This work touches nearly every new feature and enterprise deal. Candidates must have a bachelor's in CS or equivalent, backend proficiency in Python or Rust, and demonstrated experience in at least one of: billing/payments, identity/access control, or multi-tenant systems. 4+ years building production backends is preferred, especially with billing at scale, enterprise-readiness patterns, or event-driven metering. Based in San Francisco or New York, with a $350k–$475k salary and visa sponsorship. The deal-breaker is billing expertise: this role is heavily focused on payments and financial systems and requires strong opinions on idempotency and reconciliation.
Engineering Manager
San Francisco
Compensation Partner
San Francisco, CA
Executive Business Partner
San Francisco, CA
HR Business Partner
San Francisco, CA
Infrastructure Engineer, Security
San Francisco
Research Product Manager
San Francisco
Research Engineer, Infrastructure, Numerics
San Francisco
Research Engineer, Infrastructure, Kernels
San Francisco
Research Engineer, Infrastructure, Training Systems
San Francisco
Research Engineer, Infrastructure, RL Systems
San Francisco
Research Engineer, Infrastructure, Inference
San Francisco
Software Engineer, Data Infrastructure
San Francisco