Senior Observability Engineer – Team Lead

DataCrunch.com

Hybrid

Maria01 (Helsinki), Remote (EU)

Full Time

We’re an ambitious, mission-driven group focused on making the world a better place by delivering affordable, environmentally sustainable AI compute for training and deploying machine learning models at scale.
Responsibilities:
  • Lead the design, deployment, and scaling of a 360-degree unified observability stack across infrastructure assets (network, storage, cloud, power, servers, VMs, services, security, compliance, customer-facing dashboards, Kubernetes, etc.).
  • Use Grafana, Loki, and ELK stacks to build advanced monitoring, logging, and alerting solutions.
  • Identify and resolve critical flaws/issues, with a proven track record of saving organisations significant time or cost.
  • Orchestrate observability data to detect trends, forecast issues, and move from reactive to proactive monitoring.
  • Partner with engineering, SRE, and operations teams to create dashboards, alerts, and visualisations that enable actionable insights.
  • Manage end-to-end workflows at a senior level, ensuring observability practices are embedded across projects and aligned with business goals.
  • Define best practices, set standards for log/metadata organisation, and maintain clear documentation.
Requirements:
  • Deep experience with the Grafana stack (Grafana, Loki, Mimir, Alloy).
  • Strong familiarity with the ELK/OpenSearch stack (Elasticsearch, Logstash, Kibana, Fluentd, Filebeat, Metricbeat).
  • Solid understanding of Prometheus and related tooling (Prometheus, Thanos, Cortex, exporters).
  • Strong background working across Linux environments at scale.
  • Knowledge of network observability tools such as NetFlow and syslog.
  • Experience with automation/configuration management (e.g., Ansible or similar).
  • Excellent written and spoken English communication skills, with the ability to influence both technical and non-technical stakeholders.
Nice-to-haves:
  • Leadership experience in observability or infrastructure teams.
  • Experience monitoring Kubernetes environments.
  • Exposure to the Influx stack (Telegraf, InfluxDB).
  • Familiarity with OpenStack environments.
What we offer:
  • Company equity - a true stake in our journey.
  • Competitive salary and benefits, including health insurance, lunch benefit, and an annual personal budget (for sport, transport, wellness, or culture).
  • Flexible working environment.
  • Opportunity to work with cutting-edge AI technologies.
  • Career growth within a mission-driven company.
Interview process:
1. Introductory chat (45 mins) - Meet with our Talent Partner to learn more about DataCrunch and share your career goals.
2. Technical interview (60 mins) - A deeper discussion of your expertise and technical experience with future colleagues.
3. Final interview (60 mins) - Meet with our CEO, CTO, and wider team.

September 1, 2025