Senior Site Reliability Engineer

$179.2k - $268.8k

Full-time

Latitude AI

Latitude AI (lat.ai) develops automated driving technologies, including L3, for Ford vehicles at scale. We’re driven by the opportunity to reimagine what it’s like to drive and make travel safer, less stressful, and more enjoyable for everyone. When you join the Latitude team, you’ll work alongside leading experts across machine learning and robotics, cloud platforms, mapping, sensors and compute systems, test operations, systems and safety engineering – all dedicated to making a real, positive impact on the driving experience for millions of people. As a Ford Motor Company subsidiary, we operate independently to develop automated driving technology at the speed of a technology startup. Latitude is headquartered in Pittsburgh with engineering centers in Dearborn, Mich., and Palo Alto, Calif.

Meet the team:

As a Site Reliability Engineer on the team, you will be responsible for helping to build and run these mission-critical systems. Through the implementation of monitoring and automation, you will constantly ensure the health, reliability, scalability, and performance of the platforms. The Site Reliability team interacts with engineering teams including ingest/data processing, mapping, labeling, triage, machine learning (detection, prediction, tracking), motion planning/control, offline simulation, and release/deployment teams to provide uniform service observability and incident response.

What you’ll do:

* Build monitoring to ensure our platform is healthy and its reliability measurable
* Build alerting and a set of runbooks to enable faster detection and remediation of platform issues
* Debug complex issues that may combine multiple components of the stack and ensure proper fixes are implemented to prevent these issues from happening again
* Participate in an on-call rotation and culture of continuous improvement through blameless postmortems
* Design and implement components of the platform to enable features that make the work of our customers possible, simpler and more efficient
* Build Kubernetes controllers to automate operations

What you'll need to succeed:

* Bachelor's degree in Computer Engineering, Computer Science, Electrical Engineering, Robotics or a related field and 4+ years of relevant experience (or Master's degree and 2+ years of relevant experience, or PhD)
* Fundamental understanding of Linux operating system internals, TCP/IP networking, and storage subsystems
* Hands-on development in Go or Python to create robust software that can run reliably in production
* Strong experience scaling and securing services in the cloud (AWS, GCP) or cloud native environments
* Experience using infrastructure-as-code principles to automate the creation of infrastructure resources (e.g. Terraform, CloudFormation)
* Experience authoring and maintaining Kubernetes Controllers in Go
* Experience running Kubernetes and related core components in a large-scale, production environment
* Experience with metrics (e.g. Prometheus), logging (e.g. Elasticsearch, Loki) and tracing (e.g. Jaeger, Tempo) systems
* Understanding of engineering design limitations and ability to provide guidance to teams to scale their services to achieve desired performance within budget
* A focus on increasing service reliability through defining and adhering to SLOs
* Strong communication skills and the ability to work effectively in a diverse and distributed team

What we offer you:

Competitive compensation packages
High-quality individual and family medical, dental, and vision insurance
Health savings account with available employer match
Employer-matched 401(k) retirement plan with immediate vesting
Employer-paid group term life insurance and the option to elect voluntary life insurance
Paid parental leave
Paid medical leave
Unlimited vacation
15 paid holidays
Daily lunches, snacks, and beverages available in all office locations
Pre-tax spending accounts for healthcare and dependent care expenses
Pre-tax commuter benefits
Monthly wellness stipend
Adoption/Surrogacy support program
Backup child and elder care program
Professional development reimbursement
Employee assistance program
Discounted programs that include legal services, identity theft protection, pet insurance, and more
Company and team bonding outlets: employee resource groups, quarterly team activity stipend, and wellness initiatives

Learn more about Latitude’s team, mission and career opportunities at lat.ai.

The expected base salary range for this full-time position in California is $179,200 - $268,800 USD. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level. Latitude employees are also eligible to participate in Latitude’s annual bonus programs, equity compensation, and generous Company benefits program, subject to eligibility requirements.

Candidates for positions with Latitude AI must be legally authorized to work in the United States on a permanent basis. Verification of employment eligibility will be required at the time of hire. Visa sponsorship is available for this position.

We are an Equal Opportunity Employer committed to a culturally diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, disability status or protected veteran status.


Vacancy posted 4 days ago

