Software Development Engineer I, ML Infra Services, Annapurna Labs
AWS Neuron is the complete software stack for AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them.
We're looking for a Software Development Engineer to help build and evolve machine learning tools that run, optimize, and analyze ML workloads on custom AI accelerators. You'll work across the stack, from infrastructure orchestration to developer-facing tooling, alongside hardware engineers, system architects, and ML researchers both within and outside Amazon.
Key job responsibilities
- Design and implement tooling for profiling, optimization, and resource management of ML workloads on custom accelerators.
- Build high-impact solutions that ship to a large and growing customer base.
- Participate in design discussions, code reviews, and cross-functional collaboration with hardware, software, and customer-facing teams.
- Create metrics, implement automation, and resolve root causes of software defects.
- Work in a startup-like environment where you're always focused on the most important problems.
About the team
This is a high-impact, high-visibility team where your work directly accelerates every Neuron team's ability to ship, effectively multiplying the output of 100+ engineers. We're a small, senior group actively building greenfield capabilities, which means significant design ownership for SDEs and the opportunity to own major components and drive architectural decisions. You'll work at the cutting edge of AI infrastructure, at the intersection of Kubernetes, custom silicon, and large-scale ML workloads.
Basic qualifications
- Bachelor's degree or above in computer science or equivalent
- Experience in Kubernetes, Docker, or the container ecosystem, or experience in deploying identity and access management systems, and experience demonstrating software engineering skills in a previous internship, work experience, coding competitions, or publications
- Knowledge of software development lifecycle, including design, development, test, build, deployment processes and timelines
- Proficiency in Java and at least one of Go, Python, or TypeScript.
- Familiarity with Git and CI/CD pipelines.
Preferred qualifications
- Internship or project experience with AWS services (EKS, EC2, Lambda, S3, DynamoDB, or SQS).
- Familiarity with distributed systems or big data architectures.
- Experience with Linux systems and performance profiling.
- Exposure to compiler toolchains, code generation, or instruction set architectures (CPU, NPU, GPU).
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Seattle - 110,500.00 - 160,000.00 USD annually