Part-Time Student Worker – AI Validation and Benchmarking Engineer
Zoox is an autonomous ride-hailing company building the world's first purpose-built robotaxi — fully electric, bidirectional, with no steering wheel or driver's seat. Backed by Amazon and founded to make transportation safer, cleaner, and more accessible, Zoox designs its vehicles entirely around the rider. We're currently operating in Las Vegas and San Francisco, with Austin and Miami on the horizon, and testing underway across seven U.S. markets.
About Our Part-Time Student Worker Program
Zoox's part-time student worker program puts you at the center of one of the most ambitious challenges in transportation. You'll contribute to real projects, work alongside engineers and researchers pushing the boundaries of autonomous technology, and gain experience that goes well beyond the classroom. We're looking for students who bring strong academic foundations, curiosity that doesn't stop at coursework, and a drive to be part of something that matters.
Role Overview
In this role, you will support the end-to-end validation pipeline for AI tools: maintaining test datasets, running benchmarks, and measuring agent accuracy across routing decisions, classification labels, and structured output fields.
Responsibilities
- Run and maintain the benchmark pipeline, analyzing results to identify routing errors and regressions across agent variants
- Build and expand ground truth datasets used to evaluate agent outputs against known-correct answers
- Identify and address gaps in benchmark validation, and help build a more comprehensive evaluation infrastructure that strengthens validation prior to release
- Develop new evaluation dimensions, such as label accuracy and structured output correctness, that go beyond the existing team classification benchmarks
- Investigate failure modes in agent outputs and work with engineers to surface actionable improvements
- Write scripts and tooling to automate data collection, result parsing, and metric reporting
- Document findings, track benchmark trends over time, and present results to the team
Program Requirements
- Currently enrolled in a B.S. or M.S. program in Computer Science, Data Science, Engineering, or a related field
- Available to commit to a minimum three-month assignment
- Able to commit to a minimum of 20 hours per week
- Able to work on-site at one of our office locations
- Must adhere to Zoox confidentiality requirements, including refraining from using or sharing proprietary company information outside of Zoox, such as in academic research, theses, publications, or presentations
Qualifications
- Familiar with Cursor or Claude
- Familiar with Python
- Familiar with evaluation concepts: precision, recall, F1 score, and confusion matrices
- Comfortable working with structured data (CSV, JSON)
- Experience modifying or writing reproducible analysis scripts
Bonus Qualifications
- Prior exposure to LLM-based systems, prompt engineering, or AI agent evaluation
- Experience with ticketing systems and messaging apps (e.g., Jira, Slack)
Closing
We want to be transparent: this is not an internship. The Part-Time Student Worker Program is designed to complement your academic experience by providing meaningful, ongoing work alongside your studies. Rather than participating in a cohort-based program, you'll join a team directly and contribute to real projects with real impact. While the program does not include structured intern programming or a pathway to full-time employment, it offers valuable opportunities to learn, develop new skills, and gain hands-on experience in a professional environment.
Compensation for this role is $30/hour.
This is a contract position; employment will be through a vendor contracted with Zoox. The hourly rate is as posted, and benefits eligibility is determined by the vendor.
About Zoox
Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.
Accommodations
If you need an accommodation to participate in the application or interview process, please reach out to accommodations@zoox.com or your assigned recruiter.
A Final Note
You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.