
Research Scientist, Public Sector

Posted 3 months ago

Remote (US)

Who we are

At Twelve Labs, we are pioneering the development of cutting-edge multimodal foundation models that comprehend videos the way humans do. Our models have redefined the standards in video-language modeling, enabling more intuitive and far-reaching capabilities and fundamentally transforming the way we interact with and analyze various forms of media.

With a remarkable $107 million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.

We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.

About the Role

As a Research Scientist on the Public Sector team, you will play a major role in bringing TwelveLabs' video AI capabilities, including our multimodal foundation models, to mission-critical government applications. This role focuses on applying our video intelligence technology to classified and government-specific use cases, including model training, fine-tuning, and evaluation grounded in operational requirements.

You will be the dedicated research scientist for the Public Sector team, bridging TwelveLabs' cutting-edge multimodal AI research and the unique requirements of U.S. federal, defense, and intelligence community customers. This is an opportunity to have direct operational impact and define major components of this applied science practice from the ground up.

In this role, you will:

  • Adapt TwelveLabs' video understanding and multimodal models for government-specific use cases (defense, intelligence analysis, federal records management)

  • Run training and fine-tuning experiments on cutting-edge GPU infrastructure

  • Develop supervised fine-tuning pipelines tailored to government-specific datasets and annotation workflows

  • Design rigorous evaluation frameworks, including domain-specific benchmarks and operational performance metrics, that are tailored to public sector requirements

  • Work closely with Solutions Engineering and the engineering team to translate customer requirements into technical implementations

You may be a good fit if you have:

  • Strong research experience in one or more of: deep learning, computer vision, multimodal representation learning, temporal video understanding, or neural networks

  • Hands-on experience with model fine-tuning, supervised fine-tuning (SFT), or domain adaptation at scale

  • Experience leading or contributing to research projects for government, DoD, or Intelligence Community programs - including applied research, model development, or technical delivery in mission-driven environments

  • Proficiency in Python and PyTorch

  • Comfort working within constrained or regulated compute environments

  • Active Top Secret clearance or ability to obtain

  • PhD or Master's in Computer Science, Mathematics, or related field

Strong candidates may also have:

  • Active TS/SCI clearance

  • Experience with government deployment requirements (FedRAMP, FIPS, air-gapped networks)

  • Background in video understanding or video-language models

  • Publications in top conferences (CVPR, NeurIPS, etc.)

Candidates must be able to travel up to 10% of the time annually to attend conferences, off-site meetings, and other business-related events as required by the role. This role may require participation in on-site interviews and/or completion of in-person onboarding processes.

Benefits and Perks

🤝 An open and inclusive culture and work environment.

🚀 Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

🏥 Full health, dental, and vision benefits

✈️ Extremely flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.

🛂 Visa support where applicable

Job details
Workplace: Remote
Location: Remote US

TwelveLabs is a video intelligence API platform that enables developers to build applications with semantic video search, multimodal video analysis, and video embeddings using AI models trained natively for video. Its foundation models process visual, audio, speech, and on-screen text together to support search, analysis, and understanding of video content.

Employees: 192
Industry: Software Development
Headquarters: San Francisco, California
Founded: 2021
Company location: San Francisco, California

Key team members

James Murphy

Dan Germain

Kelly Hackenburg

Anirudh Vemprala
