Staff Observability Engineer: Scale Metrics & Reliability

Pantera Capital

A technology company is seeking engineers to join their observability team in Palo Alto. This role involves designing and implementing scalable observability infrastructure, developing high-performance telemetry pipelines, and ensuring the reliability and performance of the observability stack. Ideal candidates should have production-level proficiency in programming languages like Go, Rust, or Scala, and a strong understanding of distributed systems. Competitive salary range and various benefits are included.

Vacancy posted 3 days ago
Similar jobs that may interest you
Based on the Staff Observability Engineer: Scale Metrics & Reliability vacancy in Palo Alto, CA
  • $235k - $295k

     ...data and AI infrastructure company is seeking a Sr. Staff Software Engineer to join their Observability team in Mountain View, California. In this role,...  ...develop key observability solutions and ensure product reliability across cloud regions. Candidates should have 15+... 
    Suggested

    Databricks Inc.

    Mountain View, CA
    5 days ago
  • A leading technology company is seeking a Staff Software Engineer focusing on fault management to enhance server reliability and influence team designs. The ideal candidate...  ...extensive experience in C++ programming and large-scale systems development. Key responsibilities... 
    Suggested

    Google Inc.

    Sunnyvale, CA
    2 days ago
  • $126k - $204.5k

     ...maintaining a large‑scale GCP environment, including...  ...of our comprehensive observability systems. To meet the...  ...high cardinality metrics, implemented tracing,...  ...collaborate closely with our engineering teams to develop...  ...and ensure the reliability and availability of our... 
    Suggested

    Palo Alto Networks, Inc.

    Santa Clara, CA
    2 days ago
  • $150k - $180k

     ...onsite energy infrastructure (i.e. large scale BESS) through our proprietary...  ...serve as software-focused Senior Site Reliability Engineer at Verrus. This is a full‑time position...  ...allocation for workloads. Reliability & Observability : design and implement comprehensive monitoring... 
    Suggested
    Full time
    Work at office
    Local area
    Flexible hours

    Verrus, LLC

    Mountain View, CA
    3 days ago
  • $135k - $179k

     ...organization of scientists, engineers, and physicians...  ...(NGS), population‑scale clinical studies,...  ...companies. As a Staff Network Engineer...  ...to ensure reliable, predictable network...  ...Logs, CloudWatch metrics/logs, and Route 53...  ...scalability, reliability, observability, and security.... 
    Suggested
    Full time
    Local area
    Flexible hours

    Initial Therapeutics, Inc.

    Menlo Park, CA
    4 days ago
  •  ...Staff Network Engineer...  ..., security, and reliability · Implement and maintain observability for the network platform, including metrics, alerts, and dashboards...  ...large-scale IP fabrics, including... 
    Hourly pay
    Work experience placement
    Local area
    Flexible hours

    GEICO

    Palo Alto, CA
    4 days ago
  • $190k - $300k

     ...with their user base. AI Engineers, Data Science, and...  ...applications at production scale across industries have...  ...in the field of AI Observability and has received...  ...challenges in AI safety and reliability. Working on exciting...  ...agentic observability metrics (e.g., response relevancy... 
    Work at office
    3 days per week

    Fiddler AI

    Palo Alto, CA
    3 days ago
  • $169k - $224k

     ...organization of scientists, engineers, and physicians and we...  ...(NGS), population-scale clinical studies, and state...  ...com GRAIL is seeking a Staff Site Reliability / DevOps Engineer to...  ...Establish and evolve observability platforms (metrics, logs, traces) and define... 
    Full time
    Work at office
    Local area
    Flexible hours
    Shift work

    GRAIL

    Menlo Park, CA
    2 days ago
  • $227k - $290k

     ...industry-leading Reasoning Engine that uses a combination...  ...Will Do As a site reliability engineer, you will be an...  ...teams to rapidly deploy and scale Moveworks infrastructure...  ...maintain monitoring, metrics, and reporting systems for observability and actionable alerting.... 
    Full time
    Immediate start

    Moveworks

    Mountain View, CA
    more than 2 months ago
  • $214k - $289.5k

    Senior Staff Machine Learning Engineer Category: Software Engineering...  ...business value at Intuit scale. In this role, you...  ...for adaptability, observability, and secure...  ...continuous improvement of reliability, fairness, and...  ...hypotheses, success metrics, and iterative validation... 
    Worldwide

    ATX Venture Partners

    Mountain View, CA
    3 days ago
  • $220k - $240k

     ...Staff Data Engineer We're ALSO, an electric mobility company...  ..., and large-scale data processing — ensuring...  ...telemetry flows are reliable, scalable, cost-efficient...  ...telemetry data (events, metrics, time-series) with...  ...Develop fault-tolerant, observable, and debuggable... 
    Local area
    Flexible hours

    ALSO

    Palo Alto, CA
    1 day ago
  • $220k - $255k

     ...enjoyable and 10-50x more efficient. ALSO is looking for a Reliability Engineer to play a key role in developing and leading the reliability...  ...What You Will Do Establish reliability targets and metrics for new product development that include actuators, batteries... 
    Work at office
    Local area
    Remote work
    Flexible hours
    1 day per week

    ALSO

    Palo Alto, CA
    27 days ago
  • $206.5k - $258.1k

     ...contributor to build and scale our AI solutions...  ...Technology teams. As a Staff AI Engineer, you will design,...  ...instrument offline and online metrics & telemetry to ensure...  ..., CI/CD, testing, observability); familiarity with...  ...and contributions to reliability/SLOs and operational... 
    Full time
    Contract work
    Temporary work
    Part time
    Local area
    Shift work

    Rivian

    Palo Alto, CA
    1 day ago
  •  ...company in California seeks a Member of Technical Staff — Training to design and optimize large-scale distributed training systems for frontier AI models...  ...involves collaborating with researchers and improving the reliability of long-running training jobs. Competitive... 

    RadixArk

    Palo Alto, CA
    3 days ago
  • $200k - $240k

     ...and 10-50x more efficient. ALSO is looking for a Field Reliability Engineer to play a key role in tracking and improving the reliability...  ...identify any gaps in reliability test plan. Develop novel damage metrics to more accurately model failure mechanisms. Work with... 
    Work at office
    Local area
    Remote work
    Flexible hours
    1 day per week

    ALSO

    Palo Alto, CA
    a month ago
  • $186k - $232.5k

     ...Summary Are you a Staff or Lead-level Platform Engineer passionate about developer...  ...teams to reliably and securely ship products...  ...loop. DevEx Metrics & Advocacy: Track...  ...standards for quality, observability, compliance,...  ..., supporting large-scale software engineering... 
    Full time
    Contract work
    Temporary work
    Part time
    Local area
    Shift work

    Rivian

    Palo Alto, CA
    2 days ago
  • $165k - $242k

     ...innovators to build and scale AI with confidence....  ...We're looking for a Staff Storage Engineer to play a key role...  ...systems by building reliable, scalable, and high-throughput...  ..., durability, and observability of our storage stack....  ...using telemetry, metrics, and dashboards to improve... 
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    1 day ago
  • $180k

    xAI in Palo Alto, California, is seeking a talented engineer for the X Search team. This team focuses on building the core search engine...  ...candidates have experience with vector databases and large-scale search systems, with a proven track record in production ML systems... 

    xAI

    Palo Alto, CA
    5 days ago
  • $190k - $240k

     ...weekly software builds Establish processes and metrics to measure software quality, performance, and...  ...efficiently Collaborate with software engineering teams on architecture, observability, infrastructure, and reliability needs Support production readiness reviews,... 
    Hourly pay
    Local area
    Flexible hours

    General Motors

    Mountain View, CA
    5 days ago
  • $181k - $262k

    Hardware Engineering Mountain View, California Staff Hardware Reliability Engineer - Sensors Who we are Aurora’s mission is to deliver the benefits of self-driving...  ...Aumovio (formerly Continental) to bring a robust, scaled product to market. In this role you will Lead... 
    Contract work
    Work at office
    Local area
    3 days per week

    Australian Competition and Consumer Commission

    Mountain View, CA
    4 days ago
  •  ...California in 2004 when a visionary engineer, Fred Luddy, saw the...  ...hybrid indexing technology at scale across large clusters,...  ...performance, scalability, and observability of search, including query latency...  .... ~ Drive reliability and operability across the platform... 
    Full time
    Work at office
    Remote work
    Flexible hours
    Shift work

    ServiceNow

    Mountain View, CA
    1 day ago
  • $180k

     ...xAI is seeking a Software Engineer in Palo Alto, California, to join their small, innovative...  ...design to ensure scalability and reliability for applications used by millions. The ideal...  ...least 2 years of experience with large scale applications, and strong collaboration skills... 

    xAI

    Palo Alto, CA
    4 days ago
  •  ...week) We are seeking a Staff Software Engineer to join the Wallet –...  ...most critical and high-scale engineering domains. You will...  ...engineering quality, security, and reliability. You’ll collaborate...  .... Lead initiatives around observability, alert hygiene, capacity planning... 

    Walter Services

    Mountain View, CA
    4 days ago
  • $180k - $260k

     ...integration into customers’ logistics operations. About the role We are seeking an experienced Senior/Staff Site Reliability Engineer to support the operation, monitoring, and scaling of our growing fleet of autonomous vehicles. In this role, you will work closely with our... 
    Odd job
    Work at office
    Remote work

    Booster

    Mountain View, CA
    4 days ago
  • $126k - $203.5k

     ...Summary The Production Engineering team is responsible for building, scaling, and operating the cloud...  .... As a Senior Staff Production Engineer, Platform...  ...infrastructure, and production reliability, you will develop...  ...Design and implement observability, monitoring, and telemetry... 

    Palo Alto Networks, Inc.

    Santa Clara, CA
    4 days ago
  •  ...About the Role As a Senior Staff Software Engineer at Hippocratic AI, you’ll define...  ...systems that power reliable, testable, and incrementally...  ...pipelines, feature flag strategy, observability, and developer tooling—...  ...who have built and scaled software systems across multiple... 
    Work at office
    Local area

    Hippocratic AI

    Palo Alto, CA
    6 days ago
  • $152k - $248k

     ...Job Description Position: Staff Network Engineer – Data Center & Core Network Engineering Location...  ...network performance, capacity, reliability, and observability. Responsibilities Review...  ...Design, deploy, and operate large-scale network infrastructure for multiple... 
    Work at office

    LinkedIn

    Mountain View, CA
    1 day ago
  • $152k - $248k

     ...Center & Core Network Engineering team is responsible for...  ..., security, and reliability within campus and across...  ...hypergrowth. As a Staff Network Engineer, you'...  ...capacity, reliability, and observability. Design, deploy, and operate a large-scale network for data... 
    For contractors
    Work at office
    Flexible hours

    LinkedIn

    Mountain View, CA
    12 hours ago
  • $198.9k - $304.8k

     ...of transportation on a global scale. Role As a Technical Lead you...  ...align multiple teams to ship reliable, scalable autonomy...  ...technical reviews and drive software engineering best practices across the team...  ...features and defining useful metrics for analyzing performance. Mentor... 
    Work experience placement
    Local area
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    3 days ago
  • $189k - $300k

     ...of transportation on a global scale. The Data Scaling team...  ...collaborative, high-impact team of AI/ML engineers, data scientists and...  ...Contribute to the safety, reliability, and scalability of next-generation...  ...autonomous vehicles. As a Staff AI/ML Engineer in the... 
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    3 days ago
