Principal Observability Platform Engineer
$150k - $215k
Nscale
Location: US

About Nscale

Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale simplifies AI development while enabling superior results, supporting strategic business outcomes such as cost management, rapid innovation, and environmental responsibility.

We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency while contributing to the technology that powers the future.

About the Role

As a Principal/Staff Observability Platform Engineer, you'll own the technical direction of Nscale's observability platform: the systems that give us deep visibility into GPU clusters, AI workloads, and the infrastructure running them. You treat observability as a product and a discipline, not a tooling exercise. You'll set the architectural roadmap, raise the engineering bar across teams, and ensure our platform scales ahead of the business, not behind it.

You understand that complexity is a cost. Solutions that require constant babysitting don't scale, and neither does operational burden. The platforms you build should be simple to operate, easy to understand, and self-evidently correct when something goes wrong.

This isn't a "maintain and operate" role. It's a "define, build, and lead" role.

What You'll Do

- Own the technical strategy and architecture for observability across metrics, logs, traces, and alerting at scale.
- Drive platform decisions that have multi-year impact: tooling, data models, ingestion patterns, retention, cardinality management.
- Identify systemic gaps before they become incidents; design platforms that make failure visible and fast to diagnose.
- Partner with SRE, infrastructure, and AI/ML teams to embed observability natively into how Nscale builds and operates.
- Define standards and patterns that other engineers adopt, not by mandate, but because they're clearly better.
- Mentor and technically grow the observability team; raise the ceiling on what the team can build and own.
- Lead incident postmortems and use them to drive durable platform improvements.
- Evaluate and introduce tooling that meaningfully improves signal quality, operational efficiency, or scalability, and retire what doesn't.

About You

- 8+ years in SRE, infrastructure engineering, platform engineering, or observability-focused roles.
- You've operated observability infrastructure at serious scale. You know what breaks at 10x and you design for it.
- You have a strong bias toward simplicity. You've seen over-engineered observability stacks collapse under their own weight and you build accordingly.
- Deep hands-on experience with a significant subset of: Prometheus, Thanos, VictoriaMetrics, Grafana, Loki, Tempo, OpenTelemetry, ClickHouse, Elastic.
- Strong engineering fundamentals, proficient in Python, Go, or similar; comfortable owning complex systems end to end.
- Experience with Kubernetes at scale; familiarity with GPU infrastructure or HPC environments (Slurm) is a strong plus.
- You can architect systems, write the code, review others' work, and explain the tradeoffs clearly, all in the same week.
- Infrastructure-as-Code is default, not optional (Terraform, Ansible, or equivalent).
- You influence without authority. Teams want your opinion because it makes their work better.

Preferred

- Experience with high-volume streaming pipelines for observability data (Kafka, Vector, Fluent Bit, etc.).
- Background in AI/ML infrastructure observability: GPU utilisation, training job visibility, inference latency.
- Prior experience defining observability strategy at an organisation level.

We strongly encourage applications from people of color, the LGBTQ+ community, people with disabilities, neurodivergent individuals, parents, carers, and people from lower socio-economic backgrounds. If there’s anything we can do to accommodate your specific situation, please let us know.

Note: Responsibilities outlined are not exhaustive and may evolve as business needs change.

The range below reflects the base salary for the position. Actual compensation may vary based on job-related factors such as skill set, experience, education, and location. In addition to base salary, this role may be eligible for bonus, equity, and/or commission programs. Nscale may offer a competitive benefits package including medical, dental, vision, flexible paid time off, parental leave, and retirement plan participation.

Salary Range: $150,000 - $215,000 USD

