Inference ML API SDET

Headquarters · Toronto Office · Full Time · Posted Jun 29, 2026

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. This architecture allows Cerebras to deliver industry-leading training and inference speeds, over 10 times faster than GPU-based hyperscale cloud inference services.

This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

Cerebras works with the leading model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra high-speed inference.

About The Team

The Cloud Quality team is responsible for the confidence behind every production release shipped to Cerebras Inference Cloud. We work closely with platform, infrastructure, ML systems, and product engineering teams to ensure that rapid iteration never comes at the expense of customer trust. Our environment spans distributed cloud systems, multi-region deployments, APIs, orchestration layers, and hardware-backed inference services.

We are scaling quickly. The systems are growing in complexity, traffic is increasing rapidly, and release velocity remains high. We need engineers who can build quality systems that scale with the business.

About The Role

As a Senior Software Engineer in Test for the ML API features team, you will lead testing strategy and execution for AI/ML models, evaluating accuracy, fairness, and performance at scale. You will serve as a key technical leader in delivering and validating all software and hardware components for Cerebras API Features. You will own feature integration quality for software components and drive pre-deployment and production validation for Cerebras inference solutions. In this role, you will define and champion testing best practices, establish robust debugging methodologies, and mentor junior engineers while advocating for world-class product quality.

Responsibilities

Architect and own end-to-end test strategies for new features, developing scalable tests, frameworks, and tooling to ensure quality.
Lead contributions to industry-standard benchmarks and drive adoption of rigorous evaluation methodologies.
Define and drive automation initiatives to significantly improve internal engineering efficiency and test coverage.
Make strategic decisions around coverage trade-offs, resource requirements, and risk-based testing priorities.
Serve as a technical anchor in a highly agile environment, adapting quickly to shifting priorities while maintaining quality standards.
Mentor and guide junior SDETs on testing methodology, debugging practices, and automation development.
Proactively identify systemic quality gaps and drive cross-functional initiatives to address them.
Lead and facilitate effective technical communication across teams and time zones.

Skills & Qualifications

5+ years of relevant industry experience in software integration, development, or quality engineering.
Deep expertise in automation and programming using one or more languages such as Python, C++, or Go; ability to design and build reusable test frameworks from the ground up.
Proven experience testing compute, machine learning, networking, or storage systems within large-scale enterprise environments.
Strong track record of debugging complex issues across distributed, scaled-out deployments.
Demonstrated ability to lead cross-functional quality initiatives involving product development, product management, customer operations, and field teams.
Excellent verbal and written communication skills, with experience presenting technical findings to both engineering and leadership audiences.
Strong organizational skills, ownership mindset, and ability to drive projects to completion independently.
Experience leading and mentoring engineers across geographically dispersed teams and time zones.

Preferred Skills & Qualifications

Hands-on experience with ML workloads including LLM and/or multimodal training or inference.
Deep familiarity with hardware architecture, performance optimizations, compilers, and ML frameworks.
Experience designing test strategies for distributed systems, cloud infrastructure, and security validation.
Experience with microservices deployment, debugging, and orchestration at scale.
Prior experience owning or significantly contributing to a team's quality engineering culture or test infrastructure.

Location

This role follows a hybrid schedule, requiring in-office presence 3 days per week. Fully remote work is not an option.

Office locations: Sunnyvale, CA | Toronto, Canada

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras, we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

Build a breakthrough AI platform beyond the constraints of the GPU.
Publish and open source their cutting-edge AI research.
Work on one of the fastest AI supercomputers in the world.
Enjoy job stability with startup vitality.
Enjoy a simple, non-corporate work culture that respects individual beliefs.

Find out more about what it's like to work at Cerebras here!

Apply today and join the forefront of groundbreaking advancements in AI!

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.
