**Summary:**
Meta is seeking a Research Scientist to advance the field of multi-modal understanding. This role focuses on developing models and systems that can reason across multiple modalities including text, images, video, and audio. You will work on cutting-edge research to enable AI systems to perceive, interpret, and generate content across diverse data types, contributing to products that impact billions of users worldwide.
**AI Research Scientist Responsibilities:**
1. Conduct research on multi-modal learning, including vision-language models, audio-visual understanding, and cross-modal reasoning
2. Develop novel architectures and training methodologies for models that integrate and reason across multiple modalities
3. Design and execute experiments to evaluate multi-modal model capabilities and identify areas for improvement
4. Publish research findings at top-tier conferences and contribute to Meta's research community
5. Collaborate with cross-functional teams to translate research innovations into product applications
6. Mentor and guide other researchers on multi-modal AI projects
**Minimum Qualifications:**
1. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
2. PhD in Computer Science, Machine Learning, Artificial Intelligence, or a related field
3. Experience with multi-modal learning, vision-language models, or cross-modal representation learning demonstrated through publications or projects
4. Experience programming in Python and with deep learning frameworks such as PyTorch
5. Experience with large-scale model training and distributed computing
**Preferred Qualifications:**
1. Experience building end-to-end multi-modal systems from research to production
2. Experience with video understanding or audio-visual learning
3. Publications at venues such as NeurIPS, ICML, ICLR, CVPR, ACL, or EMNLP focused on multi-modal learning
4. Experience with large language models, vision transformers, or foundation models
**Public Compensation:**
$154,000/year to $217,000/year + bonus + equity + benefits
**Industry:** Internet
**Equal Opportunity:**
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@meta.com.