Senior Software Engineer — AI Evaluation & Benchmarks

Alignerr · LinkedIn

United States · Contractor · Posted Jun 29, 2026

Role responsibilities

Design and implement coding benchmarks to evaluate AI models and build scalable data pipelines for AI evaluation workflows. Analyze AI-generated code and create structured evaluation scenarios to rigorously test reasoning and debugging capabilities.

Requirements

Candidates must have at least 4 years of professional software engineering experience and expert proficiency in Python. Experience with LLM coding benchmarks and version control systems, along with strong written communication skills, is also required.

Key skills

Python, Software Engineering, Data Pipelines, AI Evaluation, Debugging, Code Quality, Version Control, CI/CD, Unit Testing, Machine Learning, Security Engineering, Open Source Contributions, JavaScript, Go, C++

Keywords

AI, Software Engineering, Python, Data Pipelines, Benchmarks, Machine Learning, Debugging, Code Quality, Version Control, CI/CD, Unit Testing, Security Engineering, Open Source, JavaScript, Go, C++