Infrastructure, Speech

Posted about 1 month ago

Office | San Jose | 180k - 450k USD per year

About Hark

Hark is an artificial intelligence company building advanced, personalized intelligence: one that is proactive, multimodal, and capable of interacting with the world through speech, text, vision, and persistent memory.

We're pairing that intelligence with next-generation hardware to create a universal interface between humans and machines. While today's AI largely operates through chat boxes and decade-old devices, Hark is focused on what comes next: agentic systems that interact naturally with people and the real world.

To get there, we're developing multimodal models and next-generation AI hardware together - designed from the ground up as a single, unified interface for a new era of intelligent systems.

We are seeking a Member of Technical Staff, Infrastructure, Speech to lead and scale the backbone of Hark's real-time speech-to-speech engine. Working at the intersection of systems engineering and speech AI, you will be accountable for the reliability, latency, and performance of the infrastructure supporting our live speech models. This is a high-impact technical role for someone who excels in low-latency distributed environments and approaches infrastructure with a product-driven mindset.

Responsibilities

Facilitate the repeatable, auditable, and scalable provisioning of our speech inference stack.
Harden CI/CD pipelines to ensure the secure deployment of ultra-low-latency, real-time speech services across all production environments.
Lead the evolution of the end-to-end infrastructure powering Hark's speech-to-speech models, including streaming pipelines, session management, and fault tolerance.
Collaborate with speech ML researchers to identify latency bottlenecks and translate complex requirements into robust infrastructure enhancements.
Oversee system health and incident response, defining critical SLOs for real-time speech workloads where performance and uptime are paramount.
Manage capacity planning, cost efficiency, and the hardware lifecycle for the global speech inference fleet.
Build internal tooling and platform abstractions to streamline the developer experience for teams operating on speech infrastructure.

Requirements

5+ years of experience in infrastructure, systems, or platform engineering, including at least 2 years in real-time or low-latency environments.
Demonstrated success deploying and managing large-scale inference frameworks or streaming infrastructure.
Advanced proficiency in at least one systems-level or infrastructure-centric programming language.
Deep understanding of the networking fundamentals critical to real-time audio and low-latency inference, including protocols such as WebRTC or gRPC.
Proven experience with container orchestration, sophisticated job scheduling, and multi-tenant resource management.
Track record of technical ownership over production systems with high reliability requirements and strict latency constraints.
Strong debugging and observability skills across the entire infrastructure stack.

Bonus Qualifications

Deep expertise in Kubernetes (K8s), with a focus on GPU-aware orchestration and the management of latency-sensitive workloads.
Proficiency in Pulumi or comparable modern Infrastructure as Code (IaC) frameworks.
Advanced command of Rust or Go for developing systems-level tooling and performance-critical services.
Technical familiarity with speech model architectures—including ASR, TTS, and end-to-end speech-to-speech—and their unique inference characteristics.
Hands-on experience with streaming data pipelines and transport layers, such as Kafka, WebSockets, or custom audio protocols.

Compensation

The US base salary range for this full-time position is $180,000 to $450,000 annually.

The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
