
HPC & Cloud Engineer

Posted about 5 hours ago

Office · Bengaluru, KA, India

Job Description

Cloud Architecture & Operations

  • Build and operate HPC environments on cloud platforms such as:
    • Amazon Web Services (AWS)
    • Microsoft Azure
    • Google Cloud Platform
  • Design hybrid-cloud and multi-cloud architectures for HPC workloads.
  • Implement cloud-native storage, networking, security, and disaster recovery solutions.

Infrastructure Automation & DevOps

  • Develop Infrastructure as Code (IaC) using:
    • Terraform
    • CloudFormation
    • Ansible
    • Python

  • Build CI/CD pipelines for infrastructure and platform deployments.
  • Automate cluster provisioning, configuration management, monitoring, and patch management.
  • Develop self-service provisioning frameworks for research and engineering teams.

AI & Data Engineering

  • Design and implement scalable AI/ML data pipelines.
  • Build data ingestion, transformation, and orchestration frameworks.
  • Support distributed AI training and inference workloads.
  • Optimize GPU utilization for deep learning applications.
  • Collaborate with Data Scientists and ML Engineers to deploy production AI solutions.

Platform Monitoring & Reliability

  • Implement observability solutions using Prometheus, Grafana, the ELK Stack, and OpenTelemetry.
  • Monitor system performance and SLA compliance, and carry out capacity planning.
  • Troubleshoot performance bottlenecks across compute, storage, network, and AI frameworks.

HPC Infrastructure Engineering

  • Design, deploy, and manage large-scale HPC clusters across on-premises and cloud environments.
  • Administer compute, storage, networking, and GPU resources for AI/ML and data-intensive workloads.
  • Optimize cluster performance, scheduling, and resource utilization using workload managers and orchestrators such as Slurm, LSF, PBS Pro, and Kubernetes.

Security & Governance

  • Implement security best practices for HPC and cloud environments.
  • Manage IAM, secrets, encryption, and compliance controls.
  • Support regulatory requirements and enterprise governance standards.

Qualifications

5+ years of experience in DevOps and cloud infrastructure management.

Technical Skills

  • Bachelor's or Master's degree in Computer Science, Engineering, Information Systems, or a related field.
  • Strong experience with Linux system administration (RHEL, Rocky Linux, Ubuntu).
  • Experience managing HPC clusters and distributed computing environments.
  • Proficiency in Python, Bash, or Go.
  • Hands-on experience with Terraform, Ansible, Git, and Jenkins or GitHub Actions.
  • Experience with container technologies: Docker, Kubernetes, and Singularity/Apptainer.
  • Knowledge of AI/ML frameworks: TensorFlow, PyTorch, Ray, and Spark.
  • Experience with GPU technologies and accelerator platforms.

Cloud Skills

  • AWS, Azure, or GCP architecture and operations.
  • Cloud networking, storage, and security services.
  • Hybrid cloud and HPC workload migration experience.

Additional Information

Sandisk thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.

Sandisk is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at [email protected] to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.

Job details
Workplace: Office
Location: Bengaluru, KA, India

High-performance SSDs, memory cards, and USB flash drives designed to prioritize speed, reliability, and energy efficiency for gamers, digital photographers, and everyday users.

Key team members

James Hong
