Senior Platform Engineer - USDS

TikTok.com

Office

San Jose, California, United States

Full Time

About the Team
The Cyber Defense & Engineering team's mission is to run and operate security infrastructure, platforms, and technologies, and to support cross-functional teams in protecting our users, products, and infrastructure. The team is responsible for enhancing security tools and identifying vulnerabilities, with a specific focus on content assurance and the application of large language models (LLMs). You'll collaborate cross-functionally with partners inside and outside TikTok to fortify the security of our products and users, helping to establish TikTok as the most trusted platform.

To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.

About the Role
We are seeking a highly skilled, hands-on engineer to design, build, and operate the on-premise platforms and systems that power our core technology. You will focus on creating highly available, reliable, scalable, and efficient infrastructure and tools. This role is ideal for someone with a strong background in systems engineering, distributed infrastructure, and backend development who enjoys solving complex technical problems and writing high-quality code.

A key part of this role is building a greenfield, AI-native development initiative focused on solving complex internal infrastructure and productivity challenges at scale. You will be responsible for the entire on-premise stack, leveraging technologies such as Apache Kafka, Apache Flink, Elasticsearch, PostgreSQL, Redis, and Kubernetes. This is a highly cross-functional role that fosters a culture of innovation, collaboration, and continuous improvement.

Responsibilities
- Lead and perform hands-on technical work, including architecture design and code development for an on-premise, highly scalable, and parallelized infrastructure. The role includes developing internal tools to manage the entire lifecycle of a large-scale retrieval-augmented generation (RAG) pipeline.
- Architect, implement, and manage a high-performance compute cluster for LLM workloads. This involves the selection and configuration of specialized hardware like GPUs, as well as the design of a robust network fabric to facilitate efficient inter-node communication for parallel processing.
- Oversee the end-to-end project lifecycle, from planning and requirements gathering to execution and delivery. You'll ensure that the infrastructure design aligns with our business goals for deploying LLM-powered applications. This includes developing internal tools and automation to support infrastructure operations.
- Develop and maintain automation scripts and configuration management to automate the deployment and management of the on-premise hardware and software stack. This ensures consistency and reproducibility across the entire environment.
- Implement security best practices for a private data center environment. This includes configuring network firewalls, managing access controls, and encrypting data at rest and in transit.
- Establish comprehensive monitoring and alerting systems to track the health and performance of the compute cluster and LLM workloads. This involves analyzing metrics related to GPU utilization, memory usage, network throughput, and model inference latency. You will proactively resolve performance issues to enhance platform reliability and operational support for internal teams.
- Collaborate with internal stakeholders to optimize resource utilization and improve the platform's efficiency. You'll work closely with data scientists and machine learning engineers to understand their compute needs and ensure the infrastructure is optimized for their specific workloads.

September 17, 2025
