Site Reliability Engineer - 2

$86.33k - $191.9k

Traveltechessentialist

What You'll Do

  • Building a fast moving, high growth service. Navan is revolutionizing travel and expense services for the enterprise, and the product is evolving quickly. You are comfortable in a startup environment, enjoy seeing the product take shape, and have strong ownership of the success of your services.
  • Designing, implementing and operating cloud infrastructure. You’re a fit for us if you think in terms of infrastructure as code, deployment pipelines, and building the guardrails that make going fast also mean going safely.
  • Identifying reliability anti-patterns and solving them systemically. You dive deep into the data to evaluate the health of your systems, and you use it to improve visibility and reliability across the fleet of services.
  • Finding and automating the toil out of our processes. You’d prefer to automate it entirely, or build a tool to empower your users rather than be the gatekeeper to the tool.
  • Leveraging AI tools and platforms in your daily work to achieve autonomous operations, reduce toil, and improve system observability.
  • Contributing to the definition and adoption of system reliability standards, including formalizing SLO/SLI frameworks, observability standards, and blameless post‑mortem practices.
  • Assisting in the adoption of AI‑assisted developer tools and platforms to increase engineering productivity, enforce code quality standards, and enable real‑time architectural validation.

What We’re Looking For

  • 2+ years of progressive experience as an SRE or equivalent role.
  • Passionate about solving problems and learning new tools and technologies.
  • Excellent communication skills working with stakeholders and domain experts across the company to design solutions to user problems.
  • Thrive in a fast‑paced environment.
  • Demonstrated ability to contribute to and take ownership of technical infrastructure projects.
  • Operate with a strong sense of ownership demonstrated through shipping production‑quality code and infrastructure equipped with testing, monitoring and documentation.
  • Hands‑on operational experience with Java based applications and services including JVM profiling and performance tuning (Python, Node.js and Go are a plus).
  • Hands‑on experience building and operating distributed systems in a public cloud environment (preferably AWS), using CI/CD to deploy, manage and operate production systems, focusing on tooling and automation using tools such as Maven and Jenkins.
  • Hands‑on experience with microservice architecture and related reliability and resiliency patterns such as throttling, queueing, and retries.
  • Hands‑on experience with writing Infrastructure as Code in Terraform, CloudFormation, or similar tools.
  • A passion for automating away everything, using scripting languages such as Python, Bash, Groovy (we prefer lazy engineers).
  • Experience building, using, and automating monitoring systems such as New Relic, Datadog, SignalFx, Kibana.
  • Hands‑on experience deploying, operating, and monitoring production‑grade AI/ML microservices (e.g., RAG pipelines, agentic systems) on cloud platforms like AWS Fargate/ECS.
  • Experience leveraging AI/LLM platforms (e.g., Gemini, Braintrust) and managing their secrets and infrastructure using Infrastructure as Code (Terraform) and AWS SSM.
  • Demonstrated ability to integrate AI‑specific telemetry and advanced observability practices to enable predictive insights and systemic root‑cause analysis.

Pay Range

$86,325 – $191,900 USD

Benefits

Navan offers a comprehensive benefits program designed to support your well‑being, financial security, and life outside of work. Our benefits, thoughtfully tailored by country to meet local needs, include healthcare coverage, insurance offerings, and wellness resources for you and your family. We support long‑term financial growth through retirement savings programs and opportunities to participate in our equity plans, so you can share in Navan’s success. To promote balance, we offer flexible time off, country‑specific holidays, and paid parental leave for all new parents. Additional benefits include connectivity and commuting support, mental health resources, and exclusive travel‑related perks. Wherever you’re based, our benefits evolve with you.

Equal Opportunity

Navan is an equal opportunity employer. We make all employment decisions based solely on merit. We provide equal employment opportunity to all applicants and employees without discrimination on the basis of race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We prohibit any such discrimination or harassment. This policy applies to all terms and conditions of employment, including hiring.

Accommodations

Navan complies with the Americans with Disabilities Act (ADA), as amended by the ADA Amendments Act, and all applicable state or local law. Navan will reasonably accommodate qualified individuals with a disability in connection with applications for employment as required by law.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Site Reliability Engineer - 2 in Palo Alto, CA vacancy
  • $210k - $270k

    Your Impact on our Mission: Zocdoc is looking for a Senior Site Reliability Engineer to help develop, monitor, and maintain our distributed production...  ...Reliability Engineering or Production Engineering role 2+ years of on-call experience in a 24/7 cloud-based production... 
    Suggested
    Flexible hours

    GoTo Meeting

    Palo Alto, CA
    1 day ago
  • $250k

     ...the single source of truth—explainable, reliable, and maintainable—that serves as the...  ...Position Overview As Director of Site Reliability Engineering, you will ensure that eGain’s AI knowledge...  ...– this is a take-home test Step 2 Panel interview (in-person at eGain... 
    Suggested
    Work at office

    eGain Corporation

    Sunnyvale, CA
    4 days ago
  • $140k - $220k

    About the Job You’ll own reliability and operational excellence for Pylon’s production systems. This means designing and implementing...  ...scale as we grow. You’ll build tooling that makes the entire engineering team more effective, establish on‑call rotations and runbooks... 
    Suggested

    Pylon

    Palo Alto, CA
    3 days ago
  •  ...technologies. Our mission is to double America’s compute capacity without building new data centers. We are seeking a skilled Site Reliability Engineer to join our growing team. The ideal candidate will help ensure the reliability, scalability, and performance of our hybrid... 
    Suggested
    Work at office
    Weekend work

    FLUIX

    Palo Alto, CA
    1 day ago
  • $210k - $270k

    Zocdoc is seeking a Senior Site Reliability Engineer to develop and maintain distributed production systems. The ideal candidate will have over 5 years of experience in site reliability or production engineering, particularly in cloud environments like AWS. Responsibilities... 
    Suggested

    GoTo Meeting

    Palo Alto, CA
    1 day ago
  • The Role We're looking for a Senior Site Reliability Engineer to own the reliability, scalability, and operational excellence of the production systems that power Nectar's platform. We run high-volume data ingestion pipelines and real-time AI agents on top of a fast-growing... 

    Nectar

    Palo Alto, CA
    1 day ago
  • $174k - $252k

    Senior Software Engineer, Site Reliability Engineering X Applicants in San Francisco: Qualified applications with arrest or conviction records will...  ...analyzing, and troubleshooting large-scale distributed systems. 2 years of experience leading projects and providing... 
    Full time

    Google Inc.

    Sunnyvale, CA
    1 day ago
  • $180k - $360k

     ...knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who...  ...Cybersecurity / SRE team is focused on ensuring the security and reliability of X Money. This role will primarily focus on the X Money platform... 
    Temporary work
    Relocation

    Pantera Capital

    Palo Alto, CA
    4 days ago
  •  ...join our small team focused on growth and productivity. The role involves scaling our platform and infrastructure while enhancing reliability and the overall developer experience. Ideal candidates will have strong expertise in distributed systems, cloud-native... 
    Remote job

    BuildBuddy

    Palo Alto, CA
    1 day ago
  • $150k - $180k

     ...financial, environmental, and innovation outcomes. Role Verrus is looking for candidates to serve as software-focused Senior Site Reliability Engineer at Verrus. This is a full‑time position based out of the Mountain View, CA office. Verrus takes a very technology‑forward... 
    Full time
    Work at office
    Local area
    Flexible hours

    Verrus, LLC

    Mountain View, CA
    12 hours ago
  • $232.9k - $335.81k

    Principal Site Reliability Engineer. Locations: USA - CA - Palo Alto. Time type: Full time. Posted on: Yesterday. Job requisition id...  ...About the Role: We're looking for a Principal Site Reliability Engineer to join our Platform Engineering team — someone equally at... 
    Permanent employment

    Uniphore Technologies Inc.

    Palo Alto, CA
    4 days ago
  •  ...Infrastructure Footprint: Global production infrastructure across AWS, South America, and Europe Role Overview Seeking a Senior Site Reliability Engineer / DevOps Engineer to design, scale, and operate highly available global infrastructure supporting production systems... 

    Prophet Town

    Mountain View, CA
    3 days ago
  • $180k - $260k

     ...facilitating effortless integration into customers’ logistics operations. About the role We are seeking an experienced Senior/Staff Site Reliability Engineer to support the operation, monitoring, and scaling of our growing fleet of autonomous vehicles. In this role, you will work... 
    Odd job
    Work at office
    Remote work

    Booster

    Mountain View, CA
    1 day ago
  • $175k - $215k

     ...Software Reliability Engineer, Waymo Fleet Waymo is an autonomous driving technology company with the mission to be the world's most trusted...  ...retrospectives to drive continuous improvement You have: ~2+ years of experience writing clean, efficient code in C++,... 
    Full time
    Remote work

    Waymo

    Mountain View, CA
    12 hours ago
  •  ...environment. Required Qualifications: Bachelor's degree in System Engineering, Electrical Engineering, or Computer Science. 5+ years of...  .... Several years of working experience at automotive Tier 1/2 suppliers or OEMs, with a deep understanding of automotive processes... 
    Work experience placement

    Tranzeal

    Newark, CA
    12 hours ago
  • A leading tech recruiting firm is seeking a Site Reliability Engineer to manage and optimize cloud infrastructure primarily using GCP or AWS. The role involves maintaining high availability through Kubernetes clusters and improving CI/CD pipelines with Terraform. Ideal... 

    Amiri Recruiting

    Mountain View, CA
    1 day ago
  • $141.3k - $226k

     ...account before you apply for a job. (Click Sign In Create Account) 2. If you already have a Candidate Account, please Sign-In before...  ...Job Description: OS kernel and system software development engineer ESX CPU and Server Platform At VMware by Broadcom, we are building... 
    Local area

    Broadcom Corporation

    Palo Alto, CA
    3 days ago
  • $135k - $200k

     ...Forward Deployed Software Engineer - Edge Autonomous Systems Title...  ...operational settings, ensuring high reliability and performance. Work...  ...Ideal Candidate Background ~2–4 years of software...  ...environments, such as the field or on-site with customers, under... 
    Work at office

    Recruiting from Scratch

    Palo Alto, CA
    7 days ago
  • $140k - $200k

     ...planet. We are a team of mission-driven engineers with experience across aerospace, robotics...  ...the systems that enable a robust, highly reliable data link between the remote pilot and aircraft...  ..., or equivalent experience ~2+ years systems-level programming experience... 
    Permanent employment
    Remote work

    Reliable Robotics Corporation

    Mountain View, CA
    1 day ago
  •  ...real time. This is not an ML role. This is a distributed systems engineering role at the heart of the agentic AI wave. Our AI agents can...  ...interface design, error handling patterns Required: ~2+ years building production backend/infrastructure systems ~ Strong... 
    Work at office
    Remote work
    Flexible hours

    ServiceNow

    Mountain View, CA
    3 days ago
  • $141.3k - $226k

     ...a job. (Click Sign In Create Account) 2. If you already have a Candidate Account...  ...Broadcom is looking for a Software Systems Engineer (P5) to join VMware Cloud Foundation's (...  ...automated tests to ensure the quality and reliability of the feature set Participate in code... 
    Full time
    Work at office
    Local area

    Broadcom Corporation

    Palo Alto, CA
    5 days ago
  •  ...ServiceNow's leading workflow automation with Moveworks' Reasoning Engine and natural language capabilities, we deliver the AI platform for...  ...interface design, error handling patterns Required: ~2+ years building production backend/infrastructure systems ~ Strong... 
    Work at office
    Remote work
    Flexible hours

    ServiceNow

    Mountain View, CA
    8 days ago
  • $135k - $200k

     ...more. The Role We are seeking a Forward Deployed Software Engineer to join a newly-formed team focused on developing advanced...  ...and maintaining robust production systems What We Require ~2+ years of professional software development experience ~2+ years... 
    Work experience placement
    Work at office
    Remote work
    Work from home
    Relocation package

    Palantir Technologies

    Palo Alto, CA
    2 days ago
  •  ...Python, Eclipse RCP and other software components. The simulation engineer will be responsible for integrating simulator components...  ...Model a complex robotics system, including sensors, in Gazebo ~2+ years of experience with Python: Write Python scripts to analyze... 
    Permanent employment

    Qualified Technical Services

    Mountain View, CA
    20 days ago
  • $152k - $241.5k

     ...infrastructure platforms for automated host lifecycle management, fleet reliability/auto‑healing, E2E observability or data‑driven operations (...  ...languages such as Python, Go, Perl, or Ruby. Mentored other engineers and influenced technical direction through design reviews,... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $117k - $195k

     ..., from designing the high gain antennas for NASA's Voyager 1 and 2 which have raced beyond the heliosphere through interstellar space...  ...assets. Lanteris Space is seeking a Sr. Systems Engineer for our Palo Alto, CA offices who can apply their knowledge and experience... 
    Permanent employment

    Lanteris Space Systems

    Palo Alto, CA
    12 hours ago
  • Education Requirements, Ideal Experience: Associate’s degree in Industrial Engineering or IT related field Minimum of 0-3 years’ relevant experience Knowledge of the application of tools/techniques Experience in one coding language (Preferred) Experience in Database (Preferred... 

    FII

    Sunnyvale, CA
    4 days ago
  • Site Reliability Engineer Onsite- Bay Area, CA Skills Relevant Skills and Experience What You’ll Do (Day-to-Day) Own and manage our cloud infrastructure (GCP or AWS, on-prem). Build, maintain, and optimize Kubernetes clusters (including GPU-backed clusters). Implement... 

    Amiri Recruiting

    Santa Clara, CA
    2 days ago
  •  ...building an AI Data Center AIOps platform that turns raw, high‑volume telemetry into reliable, job‑centric insights and automation for GPU fleets. Join our team of innovative engineers who are building this platform and operating it (not the compute cluster): uptime, performance... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  •  ...that keep the world running. Location: 5 on-site days a week in Sunnyvale, CA Headquarters. Our Team's Vision: Our Engineering team is shaping the future of cybersecurity...  ...are looking for an experienced Senior Site Reliability Engineer (SRE) with a strong background in... 
    Work experience placement

    Illumio

    Sunnyvale, CA
    12 hours ago
