Speech LLM Engineer, Voice-First Agentic AI

NVIDIA·Workday
Vietnam · Full-time · Posted Jul 2, 2026

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. 

Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

At NVIDIA, we advance innovation and help customers build the next generation of Sovereign AI platforms. Our Speech team seeks a Senior Speech LLM Engineer. This role involves developing spoken language and multimodal AI systems that power secure, scalable, and sovereign voice-first AI solutions. The role supports enterprises, governments, and regional ecosystems in deploying and operating AI systems while they retain control over data, models, and infrastructure and meet regulatory requirements.

What you'll be doing:

  • Develop and advance speech, audio, and multimodal LLMs that power next-generation voice-first agentic AI experiences.

  • Leverage AI-native and agentic workflows to accelerate research, development, evaluation, and deployment of AI systems.

  • Design and deploy AI models and platforms with strong consideration for AI safety, security, privacy, and Sovereign AI requirements.

  • Establish and drive benchmarking frameworks for voice agents, including speech quality, reasoning, tool use, latency, reliability, and user experience.

  • Lead technical initiatives, mentor engineers, and foster a One Team culture through close collaboration across research, engineering, product, and customer teams.

What we need to see:

  • Master's degree in Computer Science, AI, Electrical Engineering, or related field (or equivalent experience).

  • 2+ years of experience building and deploying speech, multimodal AI, or LLM systems.

  • Strong Python skills and experience with PyTorch or TensorFlow.

  • Hands-on experience with ASR, TTS, speech understanding, audio-language models, or multimodal LLMs.

  • Experience building production ML pipelines and MLOps infrastructure.

  • Proven technical leadership and mentoring experience.

  • Strong problem-solving, communication, and teamwork skills.

Ways to stand out from the crowd:

  • Hands-on experience with NVIDIA AI technologies such as NeMo, NeMo Agent Toolkit, Nemotron, Riva, NIM, and Voice Chat.

  • Experience building voice-first agentic AI systems with reasoning, tool use, and multimodal capabilities.

  • Strong expertise in speech AI, including ASR, TTS, speech-to-speech, and conversational AI.

  • Experience benchmarking AI agents for quality, latency, reliability, safety, and user experience.

  • Familiarity with Sovereign AI, enterprise AI deployment, and data governance requirements.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com/
