Research Scientist Intern, Multimodal Generative AI and Robotics (PhD)

Meta

Office

Redmond, WA

Internship

The Meta Reality Labs Research Team brings together a world-class team of researchers, developers, and engineers to create the future of contextual AI and robotics. The Surreal Vision group at RL Research is seeking exceptional Research Scientists to research and help build the egocentric machine perception functionalities that will underpin future contextual AI-enabled devices. The research intern will work on cutting-edge research problems and develop novel computer vision and machine learning techniques.

Work with researchers to advance frontier generative AI in the following areas:
-Develop unified predictive models that integrate language, vision, human motion, and actions.
-Investigate techniques to enable long-horizon, consistent, and physically grounded generation.
-Benchmark against state-of-the-art approaches in world modeling, video generation, and vision–language–action models.
-Leverage multimodal generation to accelerate robot learning and control.
Build contextual and embodied AI models using large-scale egocentric multimodal datasets.


Our internships are twelve (12) to twenty-four (24) weeks long and we have various start dates throughout the year. Some projects may require a minimum of 24 consecutive weeks.

Research Scientist Intern, Multimodal Generative AI and Robotics (PhD) Responsibilities
  • Plan and execute cutting-edge research and development to advance the state of the art in machine learning and large-scale training.
  • Collaborate with other researchers and engineers across machine perception teams at Meta to develop experiments, prototypes, and concepts that advance state-of-the-art contextual AI and robotic systems.
  • Work with the team to help design, set up, and run practical experiments and prototype systems related to large-scale high-quality sensing and machine reasoning.
Minimum Qualifications
  • Currently has, or is in the process of obtaining, a PhD degree in computer vision, computer graphics, 3D machine perception, or deep learning
  • Knowledge of deep learning, computer vision, graphics, generative modeling, LLMs, and VLMs
  • Hands-on experience implementing deep learning algorithms, large-scale training, benchmarking, and evaluation
  • Experience working with Python frameworks such as PyTorch
  • Experience working in a Unix environment
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Preferred Qualifications
  • Preference for a 24-week, full-time internship
  • Intent to return to a degree program after the completion of the internship
  • Proven track record of achieving significant results as demonstrated by grants, fellowships, patents, as well as first-authored publications at top-tier conferences such as CVPR, ECCV, ICCV, SIGGRAPH, ICLR, and NeurIPS
  • Strong track record of published research in the fields of LLMs, VLMs, video generation, world modeling, VLA, human motion modeling, policy learning, generative modeling, etc.
  • Strong programming experience using Python and PyTorch
  • Demonstrated software engineering experience via an internship, work experience, coding competitions, or widely used contributions to open source repositories (e.g. GitHub)
  • Experience working and communicating cross-functionally in a team environment
For those who live in or expect to work from California if hired for this position, please click here for additional information.

About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.
$7,650/month to $12,134/month + benefits

Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.



Equal Employment Opportunity
Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here. Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, fill out the Accommodations request form.


September 10, 2025