Senior Site Reliability Engineer

$179.2k - $268.8k
Full-time

Latitude AI

Latitude AI (lat.ai) develops automated driving technologies, including L3, for Ford vehicles at scale. We’re driven by the opportunity to reimagine what it’s like to drive and make travel safer, less stressful, and more enjoyable for everyone. When you join the Latitude team, you’ll work alongside leading experts across machine learning and robotics, cloud platforms, mapping, sensors and compute systems, test operations, systems and safety engineering – all dedicated to making a real, positive impact on the driving experience for millions of people. As a Ford Motor Company subsidiary, we operate independently to develop automated driving technology at the speed of a technology startup. Latitude is headquartered in Pittsburgh with engineering centers in Dearborn, Mich., and Palo Alto, Calif.

Meet the team:

As a Site Reliability Engineer on the team, you will be responsible for helping to build and run these mission critical systems. Through the implementation of monitoring and automation, you will constantly ensure the health, reliability, scalability, and performance of the platforms. The Site Reliability team interacts with engineering teams including ingest/data processing, mapping, labeling, triage, machine learning (detection, prediction, tracking), motion planning/control, offline simulation, and release/deployment teams to provide uniform service observability and incident response.

What you’ll do:
  • Build monitoring to ensure our platform is healthy and its reliability measurable
  • Build alerting and a set of runbooks to enable faster detection and remediation of platform issues
  • Debug complex issues that may combine multiple components of the stack and ensure proper fixes are implemented to prevent these issues from happening again
  • Participate in an on-call rotation and culture of continuous improvement through blameless postmortems
  • Design and implement components of the platform to enable features that make the work of our customers possible, simpler and more efficient
  • Build Kubernetes controllers to automate operations

What you'll need to succeed:
  • Bachelor's degree in Computer Engineering, Computer Science, Electrical Engineering, Robotics or a related field and 4+ years of relevant experience (or Master's degree and 2+ years of relevant experience, or PhD)
  • Fundamental understanding of Linux operating system internals, TCP/IP networking, and storage subsystems
  • Hands on development in Go or Python to create robust software that can run reliably in production
  • Strong experience scaling and securing services in the cloud (AWS, GCP) or cloud native environments
  • Experience using infrastructure-as-code principles to automate the creation of infrastructure resources (e.g. Terraform, CloudFormation)
  • Experience authoring and maintaining Kubernetes Controllers in Go
  • Experience running Kubernetes and related core components in a large-scale, production environment
  • Experience with metrics (e.g. Prometheus), logging (e.g. Elasticsearch, Loki) and tracing (e.g. Jaeger, Tempo) systems
  • Understanding of engineering design limitations and ability to provide guidance to teams to scale their services to achieve desired performance within budget
  • A focus on increasing service reliability through defining and adhering to SLOs
  • Strong communication skills and the ability to work effectively in a diverse and distributed team

What we offer you:
  • Competitive compensation packages
  • High-quality individual and family medical, dental, and vision insurance
  • Health savings account with available employer match
  • Employer-matched 401(k) retirement plan with immediate vesting
  • Employer-paid group term life insurance and the option to elect voluntary life insurance
  • Paid parental leave
  • Paid medical leave
  • Unlimited vacation
  • 15 paid holidays
  • Daily lunches, snacks, and beverages available in all office locations
  • Pre-tax spending accounts for healthcare and dependent care expenses
  • Pre-tax commuter benefits
  • Monthly wellness stipend
  • Adoption/Surrogacy support program
  • Backup child and elder care program
  • Professional development reimbursement
  • Employee assistance program
  • Discounted programs that include legal services, identity theft protection, pet insurance, and more
  • Company and team bonding outlets: employee resource groups, quarterly team activity stipend, and wellness initiatives

Learn more about Latitude’s team, mission and career opportunities at lat.ai.

The expected base salary range for this full-time position in California is $179,200 - $268,800 USD. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level. Latitude employees are also eligible to participate in Latitude’s annual bonus programs, equity compensation, and generous Company benefits program, subject to eligibility requirements.

Candidates for positions with Latitude AI must be legally authorized to work in the United States on a permanent basis. Verification of employment eligibility will be required at the time of hire. Visa sponsorship is available for this position.

We are an Equal Opportunity Employer committed to a culturally diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, disability status or protected veteran status.


Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Senior Site Reliability Engineer in Palo Alto, CA vacancy
  • $210k - $270k

    Zocdoc is seeking a Senior Site Reliability Engineer to develop and maintain distributed production systems. The ideal candidate will have over 5 years of experience in site reliability or production engineering, particularly in cloud environments like AWS. Responsibilities... 
    Senior

    GoTo Meeting

    Palo Alto, CA
    1 day ago
  •  ..., and the challenges of building in a high-growth startup, we’d love to talk. This is more than a job—it’s a journey. Site Reliability Engineers (SREs) are responsible for the overall performance and reliability of ASAPP's infrastructure and products. The team owns... 
    Senior
    Remote work

    ASAPP

    Mountain View, CA
    1 day ago
  •  ...join our small team focused on growth and productivity. The role involves scaling our platform and infrastructure while enhancing reliability and the overall developer experience. Ideal candidates will have strong expertise in distributed systems, cloud-native... 
    Senior
    Remote job

    BuildBuddy

    Palo Alto, CA
    1 day ago
  • The Role We're looking for a Senior Site Reliability Engineer to own the reliability, scalability, and operational excellence of the production systems that power Nectar's platform. We run high-volume data ingestion pipelines and real-time AI agents on top of a fast-growing... 
    Senior

    Nectar

    Palo Alto, CA
    1 day ago
  • $140k - $220k

    About the Job You’ll own reliability and operational excellence for Pylon’s production systems. This means designing and implementing...  ...scale as we grow. You’ll build tooling that makes the entire engineering team more effective, establish on‑call rotations and runbooks... 
    Senior

    Pylon

    Palo Alto, CA
    3 days ago
  • $210k - $270k

    Your Impact on our Mission: Zocdoc is looking for a Senior Site Reliability Engineer to help develop, monitor, and maintain our distributed production systems. You’ll be challenged with building frameworks and processes for ensuring uptime for our patients and providers... 
    Senior
    Flexible hours

    GoTo Meeting

    Palo Alto, CA
    1 day ago
  •  ...Infrastructure Footprint: Global production infrastructure across AWS, South America, and Europe Role Overview Seeking a Senior Site Reliability Engineer / DevOps Engineer to design, scale, and operate highly available global infrastructure supporting production systems... 
    Senior

    Prophet Town

    Mountain View, CA
    3 days ago
  • $180k - $260k

     ...facilitating effortless integration into customers’ logistics operations. About the role We are seeking an experienced Senior/Staff Site Reliability Engineer to support the operation, monitoring, and scaling of our growing fleet of autonomous vehicles. In this role, you... 
    Senior
    Odd job
    Work at office
    Remote work

    Booster

    Mountain View, CA
    1 day ago
  • Zocdoc, located in Silicon Valley, CA, is seeking a Senior Site Reliability Engineer to monitor and maintain cloud-based systems ensuring uptime for millions of patients. You'll work with cutting-edge technology in a diverse and collaborative environment. This role requires... 
    Senior

    Dormont Manufacturing Co

    Palo Alto, CA
    4 days ago
  • A leading tech recruiting firm is seeking a Site Reliability Engineer to manage and optimize cloud infrastructure primarily using GCP or AWS. The role involves maintaining high availability through Kubernetes clusters and improving CI/CD pipelines with Terraform. Ideal... 
    Senior

    Amiri Recruiting

    Mountain View, CA
    1 day ago
  • Poshmark, Inc. is seeking a talented Site Reliability Engineer to ensure the health and performance of our web-scale systems. You will collaborate with development teams and focus on automating and monitoring systems for high reliability. The ideal candidate has 5 years... 
    Senior

    Poshmark, Inc.

    Redwood City, CA
    4 days ago
  • $150k - $180k

    A technology-focused data center developer in Mountain View, CA is looking for a Senior Site Reliability Engineer to manage software infrastructure. This full-time position requires experience in Software Engineering or DevOps, with strong proficiency in Golang. The role... 
    Senior
    Full time

    Verrus, LLC

    Mountain View, CA
    3 days ago
  • $232k - $263k

     ...Join us as we define the future of SaaS security! Sr. Staff Site Reliability Engineer As a Sr. Staff SRE at Obsidian , you will define and...  ...Engineering, or related roles ~3+ years operating at a senior or technical leadership level (Staff or equivalent scope)... 
    Senior
    Work from home
    Flexible hours

    Obsidian Security

    Palo Alto, CA
    9 days ago
  •  ...building an AI Data Center AIOps platform that turns raw, high‑volume telemetry into reliable, job‑centric insights and automation for GPU fleets. Join our team of innovative engineers who are building this platform and operating it (not the compute cluster): uptime, performance... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $174k - $252k

    Senior Software Engineer, Site Reliability Engineering X Applicants in San Francisco: Qualified applications with arrest or conviction records will be considered for employment in accordance with the San Francisco Fair Chance Ordinance for Employers and the California... 
    Senior
    Full time

    Google Inc.

    Sunnyvale, CA
    1 day ago
  • $148k - $235.75k

    Senior Site Reliability Engineer | Locations: US, CA, Santa Clara | Time type: Full time | Posted on: Yesterday | Job requisition id: JR201...  ...Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew that... 
    Senior
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • $152k - $241.5k

     ...intelligence. Job Overview We’re looking for a Senior SRE to join our Compute Farm team and...  ...host lifecycle management, fleet reliability/auto‑healing, E2E observability or data‑...  ...Python, Go, Perl, or Ruby. Mentored other engineers and influenced technical direction through... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...vector database company for enterprise-grade AI. Founded by the engineers behind Milvus, the world’s most popular open-source vector...  ...you will do: Work at the intersection of development and site reliability. Creating SRE tools and systems, as well as supporting... 
    Senior

    Zilliz

    Redwood City, CA
    1 day ago
  • $200k - $322k

     ...supportive environment, where NVIDIANs are inspired to excel and make a profound global impact. NVIDIA is seeking a Senior Manager of Site Reliability Engineering to lead and reshape how IT operations function at scale. This role goes beyond traditional service management... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $126k - $204.5k

     ...As part of this role, you will collaborate closely with our engineering teams to develop innovative solutions that provide clear and...  ...team to influence the operability of the product and ensure the reliability and availability of our services. Qualifications Required... 
    Senior

    Palo Alto Networks, Inc.

    Santa Clara, CA
    4 days ago
  • $145k - $165k

    A technology solutions firm in Sunnyvale, CA is looking for a highly experienced Site Reliability Engineer (SRE). This role involves maintaining uptime and performance across systems. Exceptional Linux expertise and automation skills in Bash and Python are crucial. Key... 
    Senior

    Bolt Graphics, Inc.

    Sunnyvale, CA
    3 days ago
  • $175.8k - $264.2k

    Senior Site Reliability Engineer - Apple Services Engineering (ASE) / iCloud Cupertino, CA People at Apple don't just build products - they craft experiences our customers love and depend on. Apple Services Engineering (ASE) builds and supports the systems that make many... 
    Senior

    Hong Kong Study Skills Research Institute

    Cupertino, CA
    4 days ago
  • $176k - $276k

    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and availability using the combination of software and systems engineering practices. This is a highly specialized... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  •  ...Job Description Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally...  ...position yourself among the top echelon in site reliability and AI-powered infrastructure automation. As a Senior Lead Site Reliability Engineer at JPMorgan... 
    Senior
    Work at office

    JPMorgan Chase & Co.

    Palo Alto, CA
    9 days ago
  • $180k

     ...knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who...  ...Cybersecurity / SRE team is focused on ensuring the security and reliability of X Money. This role will primarily focus on the X Money platform... 
    Permanent employment
    Temporary work
    Relocation

    xAI

    Palo Alto, CA
    a month ago
  •  ...Job Description Site Reliability Engineer Onsite- Bay Area, CA Skills Relevant Skills and Experience What You’ll Do (Day-to-Day) Own and manage our cloud infrastructure (GCP or AWS, on-prem). Build, maintain, and optimize Kubernetes... 

    Amiri Recruiting

    Mountain View, CA
    26 days ago
  •  ...join one of America's most beloved eCommerce companies as a Senior Release Engineer. This role will work across all web based brands and you'll...  ...Skill Set Specific experience deploying large scale web sites/products Experience deploying cloud based apps Strong... 
    Senior

    Black Swan Search

    Mountain View, CA
    3 days ago
  • $217.57k - $260k

     ...Identity Left Behind" to enable all people to have a secure digital identity. To learn more, visit Role Overview The Staff Site Reliability Engineer, Infrastructure role is building a high-scale infrastructure team responsible for owning environments with thousands of... 
    Full time
    Temporary work
    Work at office
    Remote work
    Flexible hours
    Shift work

    ID.me

    Mountain View, CA
    10 days ago
  • ATX Venture Partners seeks a Principal Engineer to drive technology initiatives and create scalable solutions. You'll develop systems in a highly collaborative environment, utilizing both front-end and back-end technologies, particularly in AI domains. The ideal candidate... 
    Senior

    ATX Venture Partners

    Mountain View, CA
    9 hours ago
  • $169k - $224k

     ...disciplinary organization of scientists, engineers, and physicians and we are using the...  ...grail.com GRAIL is seeking a Staff Site Reliability / DevOps Engineer to lead the reliability...  ...collaboration with cross-functional and senior stakeholders Fast-paced, dynamic... 
    Full time
    Work at office
    Local area
    Flexible hours
    Shift work

    GRAIL

    Menlo Park, CA
    24 days ago
