Senior Software Engineer - Data Platform

Cognite.com

Office

Bengaluru

Full Time

About Cognite
Embark on a transformative journey with Cognite, a global SaaS forerunner in leveraging AI and data to unravel complex business challenges through our cutting-edge offerings, including Cognite Atlas AI, an industrial agent workbench, and the Cognite Data Fusion (CDF) platform. We were named the 2022 Technology Innovation Leader for Global Digital Industrial Platforms, and Cognite was recognized as the 2024 Microsoft Energy and Resources Partner of the Year. In the realm of industrial digital transformation, we stand at the forefront, reshaping the future of Oil & Gas, Chemicals, Pharma, and other Manufacturing and Energy sectors. Join us in this venture where AI and data meet ingenuity, and together we forge the path to a smarter, more connected industrial future.
Learn more about Cognite here:
Cognite Product Tour 2024
Cognite Product Tour 2023
Data Contextualization Masterclass 2023
Our values
Impact: Cogniters strive to make an impact in all that they do. We are result-oriented, always asking ourselves.
Ownership: Cogniters embrace a culture of ownership. We go beyond our comfort zones to contribute to the greater good, fostering inclusivity and sharing responsibilities for challenges and success.
Relentless: Cogniters are relentless in their pursuit of innovation. We are determined and deliverable (never ruthless or reckless), facing challenges head-on and viewing setbacks as opportunities for growth.

About Cognite & This Role
Cognite is revolutionizing industrial data management through our flagship product, Cognite Data Fusion, a state-of-the-art SaaS platform that transforms how industrial companies leverage their data. We're seeking a Senior Data Platform Engineer who excels at building high-performance distributed systems and thrives in a fast-paced startup environment. You'll work on cutting-edge data infrastructure challenges that directly affect how Fortune 500 industrial companies manage their most critical operational data.

What You'll Build & Own

High-Performance Data Systems

Design and implement robust data processing pipelines using Apache Spark, Flink, and Kafka for terabyte-scale industrial datasets
Build efficient APIs and services that serve thousands of concurrent users with sub-second response times
Optimize data storage and retrieval patterns for time-series, sensor, and operational data
Implement advanced caching strategies using Redis and in-memory data structures
Engineer Spark applications with deep understanding of Catalyst optimizer, partitioning strategies, and performance tuning
Develop real-time streaming solutions processing millions of events per second with Kafka and Flink
Design efficient data lake architectures using S3/GCS with optimized partitioning and file formats (Parquet, ORC)
Implement query optimization techniques for OLAP data stores like ClickHouse, Pinot, or Druid

Scalability & Performance

Scale systems to 10K+ QPS while maintaining high availability and data consistency
Optimize JVM performance through garbage collection tuning and memory management
Implement comprehensive monitoring using Prometheus, Grafana, and distributed tracing
Design fault-tolerant architectures with proper circuit breakers and retry mechanisms

Technical Innovation

Contribute to open-source projects in the big data ecosystem (Spark, Kafka, Airflow)
Research and prototype new technologies for industrial data challenges
Collaborate with product teams to translate complex requirements into scalable technical solutions
Participate in architectural reviews and technical design discussions

What We're Looking For

Core Technical Requirements

Distributed Systems Experience (4-6 years)

Production Spark experience - built and optimised large-scale Spark applications with understanding of internals.
Streaming systems proficiency - implemented real-time data processing using Kafka, Flink, or Spark Streaming.
JVM language expertise - strong programming skills in Java, Scala, or Kotlin with performance optimisation experience.

Data Platform Foundations (3+ years)

Big data storage systems - hands-on experience with data lakes, columnar formats, and table formats (Iceberg, Delta Lake).
OLAP query engines - worked with Presto/Trino, ClickHouse, Pinot, or similar high-performance analytical databases.
ETL/ELT pipeline development - built robust data transformation pipelines using tools like DBT, Airflow, or custom frameworks.

Infrastructure & Operations

Kubernetes production experience - deployed and operated containerised applications in production environments.
Cloud platform proficiency - hands-on experience with AWS, Azure, or GCP data services.
Monitoring & observability - implemented comprehensive logging, metrics, and alerting for data systems.

Performance Engineering

System optimisation experience - delivered measurable performance improvements (2x+ throughput gains).
Resource efficiency - optimised systems for cost while maintaining performance requirements.
Concurrency expertise - designed thread-safe, high-concurrency data processing systems.

Data Engineering Best Practices

Data quality frameworks - implemented validation, testing, and monitoring for data pipelines.
Schema evolution - managed backwards-compatible schema changes in production systems.
Data modelling expertise - designed efficient schemas for analytical workloads.

Technical Collaboration

Cross-functional partnership - worked effectively with product managers, ML engineers, and data scientists.
Code review excellence - provided thoughtful technical feedback and maintained high code quality standards.
Documentation & knowledge sharing - created technical documentation and participated in knowledge transfer.

Continuous Learning

Technology adoption - quickly learned and applied new technologies to solve business problems.
Industry awareness - stayed current with big data ecosystem developments and best practices.
Problem-solving approach - demonstrated a systematic approach to debugging complex distributed system issues.

Execution Excellence

Rapid delivery - consistently shipped high-quality features within aggressive timelines.
Technical pragmatism - made smart trade-offs between technical debt, velocity, and system reliability.
End-to-end ownership - took responsibility for features from design through production deployment and monitoring.
Ambiguity comfort - thrived in environments with evolving requirements and unclear specifications.
Technology flexibility - adapted to new tools and frameworks based on project needs.
Customer focus - understood how technical decisions impact user experience and business metrics.
Open-source contributions to major Apache projects in the data space (e.g. Apache Spark or Kafka) are a big plus.
Conference speaking or technical blog writing experience.
Industrial domain knowledge - previous experience with IoT, manufacturing, or operational technology systems.

Technical Stack

Primary Technologies:

Languages: Kotlin, Scala, Python, Java.
Big Data: Apache Spark, Apache Flink, Apache Kafka.
Storage: PostgreSQL, ClickHouse, Elasticsearch, S3-compatible systems.
Infrastructure: Kubernetes, Docker, Terraform.
Table Formats: Apache Iceberg, Delta Lake, Apache Hudi.
Query Engines: Trino/Presto, Apache Pinot, DuckDB.

Join the global Cognite community! 🌐
- Join an organization of 70 different nationalities 🌐 with Diversity, Equality and Inclusion (DEI) in focus 🤝
- Office location: Rathi Legacy (Rohan Tech Park), Hoodi (Bengaluru)
- A highly modern and fun working environment with sublime culture across the organization; follow us on Instagram @cognitedata 📷 to know more
- Flat structure with direct access to decision-makers and minimal bureaucracy
- Opportunity to work with and learn from some of the best people on some of the most ambitious projects found anywhere, across industries
- Join our HUB 🗣️ to be part of the conversation directly with Cogniters and our partners
- Hybrid work environment globally
Why choose Cognite? 🏆 🚀
Join us in making a real and lasting impact in one of the most exciting and fastest-growing new software companies in the world. We have repeatedly demonstrated that digital transformation, when anchored on strong DataOps, drives business value and sustainability for clients and allows front-line workers, as well as domain experts, to make better decisions every single day. We were recognized as one of CNBC's top global enterprise technology startups powering digital transformation! And just recently, Frost & Sullivan named Cognite a Technology Innovation Leader! 🥇 Most recently, Cognite Data Fusion® achieved industry-first DNV compliance for digital twins 🥇
Apply today!
If you're excited about the opportunity to work at Cognite and make a difference in the tech industry, we encourage you to apply today! We welcome candidates of all backgrounds and identities to join our team.
We encourage you to follow us on Cognite LinkedIn; we post all our openings there.