
Senior Backend Software Engineer, Video Engineering

Posted 3 months ago

Remote | Remote US - CA | 150k - 210k USD

Who we are

At Twelve Labs, we are pioneering multimodal foundation models that comprehend videos the way humans do. Our models have redefined the standards in video-language modeling, enabling more intuitive and far-reaching capabilities and fundamentally transforming the way we interact with and analyze media.

With a remarkable $107 million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.

We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.

About the Role

As a Senior Backend Software Engineer at TwelveLabs, you’ll build the server-side infrastructure powering our new agentic application layer. You’ll join a small, high-impact team and own the critical transition from prototype to production-ready platform.

Must be based in the Pacific Time Zone for consideration.

Candidates must be able to travel up to 10% of the time annually to attend conferences, off-site meetings, and other business-related events as required by the role. This role may require participation in on-site interviews and/or completion of in-person onboarding processes.

In this role, you will

Backend

  • Design and build backend services for video processing workflows — ingestion, transcoding, 4K export, metadata extraction, and timeline operations

  • Architect scalable, high-availability systems to support enterprise-grade video workloads across cloud-native infrastructure (AWS, GCP)

  • Build and optimize APIs that power real-time and async frontend workflows, including streaming data delivery and long-running job orchestration

  • Own performance and reliability for distributed video processing pipelines with low latency and high throughput requirements

  • Collaborate closely with frontend engineers on API design, data models, and streaming strategies

You may be a good fit if you have

  • 5+ years building production backend systems and scalable APIs, based in the Pacific Time Zone

  • Strong Python proficiency for backend services and tooling; comfort making pragmatic tradeoffs in a fast-moving product environment

  • Cloud-native development experience on AWS or GCP, including containerization (Docker, Kubernetes) and serverless patterns

  • Deep experience with distributed systems, microservices, and distributed job orchestration (queues, workers, retries, prioritization)

  • Hands-on experience with FFmpeg for transcoding, muxing, stream manipulation, and segment-based processing

  • Strong understanding of video codecs (H.264, H.265/HEVC, AV1), containers (MP4, MKV, WebM), and adaptive bitrate formats (HLS, DASH)

  • Experience building and operating scalable video transcoding and processing pipelines, including proxy generation and media normalization

  • Experience with backend-driven video editing workflows including timeline-based rendering, clip segmentation, and incremental re-processing

  • Experience with large-scale media storage and delivery (S3, blob storage, pre-signed URLs, range requests) and performance optimization techniques (parallel encoding, chunked processing, intermediate artifact caching)

  • Background in media, entertainment, or video streaming platforms; exposure to cloud-based media services (AWS MediaConvert, GCP Transcoder API, or similar) is a plus

Preferred Qualifications

  • Advanced API design skills (RESTful, streaming, async patterns)

  • Familiarity with model serving platforms (TorchServe, Triton, SageMaker endpoints, or similar)

  • Experience with MLOps practices — model deployment, monitoring, versioning

  • Exposure to CI/CD pipelines and observability tools (Prometheus, Grafana) for production systems

  • Experience with AI-powered product features or agentic application architectures

  • Hands-on experience running inference on ML/CV models in production — not research, but engineering models into reliable services

Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply! If you are a 0-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at TwelveLabs.

Benefits and Perks

🤝 An open and inclusive culture and work environment.

🚀 Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

🏥 Full health, dental, and vision benefits

✈️ Extremely flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.

🛂 Visa support where applicable

Job details
Workplace: Remote
Location: Remote US - CA
Salary: 150k - 210k USD per year

TwelveLabs is a video intelligence API platform that enables developers to build applications with semantic video search, multimodal video analysis, and video embeddings using AI models trained natively for video. Its foundation models process visual, audio, speech, and on-screen text together to support search, analysis, and understanding of video content.

Employees: 192
Industry: Software Development
Headquarters: San Francisco, California
Founded: 2021
Company location: San Francisco, California

Key team members

James Murphy

Dan Germain

Kelly Hackenburg

Anirudh Vemprala
