Lead Observability Platform Engineer
United IT
Location: Remote
As a Lead Observability Platform Engineer, you will design, build, and operate large-scale observability services that process billions of logs, metrics, and traces daily. You will develop high-performance backend services using Go, Java, and Node.js, and lead the adoption of OpenTelemetry-based instrumentation and standards across the enterprise.
In this role, you will partner closely with SRE, Cloud Engineering, CI/CD, Infrastructure, Security, and application teams to shape platform strategy, enhance developer experience, and ensure reliable, secure, and cost-efficient observability at scale. You will provide senior technical leadership, influence architectural direction, and help deliver a world-class, self-service observability ecosystem that accelerates engineering productivity and operational excellence.
Key Responsibilities
- Design, build, and operate core observability platform services using Go, Java (Spring Boot), and Node.js.
- Lead enterprise-wide adoption of OpenTelemetry, including client libraries, semantic conventions, instrumentation patterns, and Collector/agent strategy.
- Architect and scale high-throughput, fault-tolerant telemetry pipelines (logs, metrics, traces) with a focus on performance, reliability, and cost efficiency.
- Develop self-service observability capabilities that simplify onboarding, troubleshooting, and adoption for application teams.
- Implement end-to-end monitoring of the observability platform itself, defining SLOs, health checks, and alerting.
- Collaborate with SRE, Platform, and Cloud teams to establish reliability standards, error budgets, and incident response practices.
- Participate in on-call rotations and lead incident mitigation, root-cause analysis, and post-incident reviews.
- Automate operational workflows and eliminate manual toil through tooling, CI/CD enhancements, and platform automation.
- Ensure secure telemetry pipelines through mTLS, secrets management, and zero-trust design patterns.
- Produce and maintain high-quality technical documentation, standards, and best practices.
- Engage with internal engineering teams to gather requirements, influence roadmap prioritization, and deliver platform improvements.
- Provide technical leadership through mentorship, design reviews, architectural guidance, and cross-team collaboration with principal engineers and engineering leadership.
Required Qualifications
- 7+ years of experience in Software Engineering, Platform Engineering, or SRE.
- 5+ years of experience with observability practices, including SLIs/SLOs/SLAs, alerting, and incident management.
- 5+ years building production-grade backend services in Go and/or Java.
- 5+ years implementing and operating OpenTelemetry, including OTLP, semantic conventions, and instrumentation patterns.
- 5+ years with cloud-native and containerized platforms (Docker, Kubernetes, Argo CD).
- 5+ years working with public cloud platforms (AWS, GCP, or Azure).
- 3+ years designing and scaling distributed, high-volume data pipelines.
- 3+ years working with Grafana OSS or comparable observability backends (e.g., Grafana, Loki, Tempo, Mimir).
- 3+ years with relational databases (PostgreSQL, MySQL).
Preferred Qualifications
- Experience with service meshes and networking technologies such as Envoy and Istio.
- Experience integrating or operating commercial observability platforms (Datadog, New Relic, AppDynamics, etc.).
- Experience with streaming and data platforms such as Kafka, Pulsar, or similar technologies.
- Familiarity with time-series, NoSQL, or analytical databases (ClickHouse, Bigtable, Cassandra, etc.).
- Experience with Infrastructure as Code tools such as Terraform or CloudFormation.
- Experience with cost optimization and capacity planning for large-scale telemetry systems.
- Experience with chaos engineering, resiliency testing, or fault injection.
- Background in security-aware platform design, including secure service-to-service communication.
- Experience mentoring senior engineers and influencing platform standards across organizations.
- Strong operational experience supporting 24x7 production systems, including on-call responsibilities.
- Strong technical communication and cross-team collaboration skills.
- Experience operating in regulated or compliance-heavy environments (e.g., healthcare, finance).
Education: Bachelor's degree from an accredited university or equivalent work experience (HS diploma + 4 years of relevant experience).

