SRE Lead: ML/HPC Infra & Automation

$162.6k - $302k

Dormont Manufacturing Co

Dormont Manufacturing Co is looking for a Principal Site Reliability Engineer in South San Francisco. This role focuses on designing scalable cloud-based solutions, emphasizing Infrastructure as Code (IaC) to manage ML and HPC workloads. Responsibilities include leading technical initiatives, automating infrastructure management, and ensuring operational integrity. The ideal candidate has 7+ years of experience and expertise in cloud environments such as AWS, Azure, or GCP. The expected salary range is $162,600 - $302,000, with additional bonus opportunities.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the SRE Lead: ML/HPC Infra & Automation vacancy in South San Francisco, CA
  • $127k - $249k

     ...experienced Senior or Staff Engineer for their SRE, InfraSec team based in San Francisco....  ...the security of cloud infrastructure, leading teams, and implementing security solutions...  ...or similar roles, with strong skills in automation and communication. A compensation range of... 
    Suggested
    Flexible hours

    Insider, Inc.

    San Francisco, CA
    16 hours ago
  •  ...tasks and help maintain performance across our systems, working collaboratively with a global team. Ideal candidates will have extensive SRE experience, proficiency in scripting, and a strong commitment to open-source principles. This role offers a chance to contribute to... 
    Suggested
    Remote job

    Wikimedia Foundation

    San Francisco, CA
    1 day ago
  • A data collaboration platform company in San Francisco seeks a Senior Site Reliability Engineer to manage global product deployments and enhance operational systems. You will support deployment, provide 24/7 engineering support, and optimize performance across highly scalable...
    Suggested

    LiveRamp

    San Francisco, CA
    3 days ago
  • $170k - $215k

     ...deployment confidence, and drive AWS infrastructure solutions. This role demands strong software development skills and experience in SRE or DevOps roles. The ideal candidate will enjoy high autonomy, welcome the challenge of a startup environment, and will need... 
    Suggested

    Bonfirevc

    San Francisco, CA
    16 hours ago
  • A technology firm is seeking a Lead Site Reliability Engineer to design and implement automated infrastructure and manage Kubernetes workloads. The role involves refining CI/CD pipelines and leading incident response efforts, requiring expertise in Terraform, Prometheus... 
    Suggested

    Axiom Pursuits

    San Francisco, CA
    2 days ago
  • E2B is a fast-growing Series A startup based in San Francisco, seeking an Infrastructure Engineer to manage Terraform and Kubernetes for AI agent sandboxes. Your role involves migrating to Kubernetes, building reusable components, and enhancing infrastructure observability...

    E2B

    San Francisco, CA
    16 hours ago
  • Lawrence Berkeley National Laboratory is hiring a Building Infrastructure Group Lead for NERSC. This role involves leading operations of a high-performance computing facility, ensuring reliability, safety, and efficient upgrades. Candidates should have a Bachelor's in... 
    Remote work

    Lawrence Berkeley National Laboratory

    San Francisco, CA
    4 days ago
  •  ...their demand generation efforts. You will lead and innovate in creating a scalable...  ...engine that combines marketing with AI-driven automation. The ideal candidate will have over 5 years...  ...products, particularly in the AI/ML sector. This role offers a competitive salary... 

    Jack & Jill/External ATS

    San Francisco, CA
    4 days ago
  • A leading global software company in San Francisco is looking for a Senior Site Reliability Engineer to ensure system reliability and performance. You will lead initiatives to implement scalable infrastructure and will collaborate closely with development teams. The ideal... 

    OutSystems Inc.

    San Francisco, CA
    2 days ago
  •  ...reliability, and a focus on incident prevention. The role involves significant ownership of AWS and Kubernetes infrastructure, coding, automation, and building CI/CD pipelines. Ideal candidates are proactive, possess strong AWS and Kubernetes knowledge, have a coding... 

    Dormont Manufacturing Co

    San Francisco, CA
    3 days ago
  • $216k - $324k

    A leading marketing automation platform is seeking a Senior Lead Software Engineer in San Francisco to focus on backend architecture and optimize development processes. You will define technical strategies, manage backend dependencies, and mentor engineering talent. Candidates... 

    Klaviyo

    San Francisco, CA
    16 hours ago
  • Zipline is seeking a Regional Construction Manager in South San Francisco, California. The role encompasses managing construction projects, securing entitlements, and fostering relationships with city officials to facilitate infrastructure development. Candidates should...

    Zipline

    South San Francisco, CA
    2 days ago
  • Zipline is hiring a Construction Innovation Manager based in South San Francisco. In this role, you will improve construction processes and methods across projects, focusing on cost, schedule, and scalability. Looking for candidates with 6-10+ years in construction management...

    Zipline

    South San Francisco, CA
    2 days ago
  • $140k - $185k

     ...UK, Canada, and Europe, partnering with leading health systems including the NHS, Beth Israel...  ...: What we’re looking for 3-6+ years in SRE, DevOps, Platform, or operations-heavy...  ...Datadog, Prometheus, etc). Scripting or automation experience (Python, Bash, or similar). The... 
    Work at office
    Worldwide

    Dormont Manufacturing Co

    San Francisco, CA
    16 hours ago
  •  ...focus on backend systems and database management. You will optimize data infrastructures, improve system performance, and build automation tooling to enhance reliability. Ideal candidates will have strong programming skills and a collaborative attitude, ready to work... 

    Unify

    San Francisco, CA
    16 hours ago
  • deCircle is seeking a Site Reliability Engineer based in San Francisco to ensure operational excellence for our GPU marketplace and AI infrastructure. The role involves defining service level objectives, managing capacity for a distributed system, and ensuring security ...

    deCircle

    San Francisco, CA
    4 days ago
  • Dormont Manufacturing Co is looking for a seasoned SRE Engineer Lead to enhance the reliability and resilience of our systems. You will lead the charge on infrastructure scalability, ensure fast services for millions, and manage incidents effectively. The ideal candidate... 

    Dormont Manufacturing Co

    San Francisco, CA
    3 days ago
  • $180k - $260k

     ...Taipei, and Ljubljana. About this role As an SRE Engineer, Lead at Speak , you’ll be the driving force...  ...engineers—think safer deploys and infrastructure automation Collaborate cross‑functionally with Product, Engineering, and ML teams to ensure reliability is baked into... 
    Temporary work
    Work experience placement
    Live in

    Dormont Manufacturing Company

    San Francisco, CA
    2 days ago
  • $175k - $225k

     ...Role We are looking for a builder to help lead our 'AI for Work' efforts. Together with...  ...tomorrow. This is not about bolting automation onto existing processes; it's about redesigning...  ...and prompt versioning. AI & ML Knowledge: You possess a solid understanding... 
    Hourly pay
    Work at office
    Immediate start
    Flexible hours
    Shift work

    Taskrabbit

    San Francisco, CA
    6 days ago
  •  ...skilled Computational Scientist for its Research Pathology Department in South San Francisco, CA. This role involves leading computer vision and AI/ML projects supporting spatial omics initiatives. The ideal candidate will hold a Ph.D in a related field with strong skills... 
    Relocation package

    Dormont Manufacturing Co

    South San Francisco, CA
    1 day ago
  •  ...Adversaries are now using LLMs to automate polymorphism, crafting high-...  ...Architect & Adversarial Lead You won't just be writing detections...  ...PCAPs, behavioral traces) and ML/AI systems. Execution-First:...  ...it consumes. As a Detection Infra lead, you’ll build the "Sensors... 
    Live in

    Cerebras

    San Francisco, CA
    4 days ago
  • Jobr.pro is looking for a strategic Staff/Senior Staff Site Reliability Engineer (SRE) to define the future of our cloud platform. This hybrid role requires office attendance at least twice a week in San Francisco to foster collaboration and connection. You will work to... 
    Work at office

    jobr.pro

    San Francisco, CA
    2 days ago
  • The Stellar Development Foundation (SDF) is seeking a Director of Site Reliability Engineering to lead a dynamic SRE team. This senior role involves shaping engineering culture while improving production services and core infrastructure. With a focus on operational excellence... 

    P2P

    San Francisco, CA
    2 days ago
  • $150.7k - $279.9k

    Genentech is seeking a Principal Scientist in South San Francisco to lead statistical genetics and AI/ML research. You will engage in end‑to‑end analyses across genetics and collaborate with teams to drive impactful results. Candidates should have a PhD, expertise in human... 
    Relocation package

    Dormont Manufacturing Co

    South San Francisco, CA
    2 days ago
  • $152k - $187k

     ...this hybrid role, you will report to a Director of Global Place Workplace & eMobility You will: Lead the smart buildings design and implementation of automated test scripts and standardized workflows for EV charging features, from initial development through... 
    Full time
    Remote work

    Waymo

    San Francisco, CA
    16 hours ago
  • $150k - $200k

     ...Strategic Partnerships Lead Tavrn is a pioneering legal AI company transforming the plaintiff law sector through automation of medical chronologies, demand letters, and medical record retrieval. We are dedicated to becoming the leading AI-driven solution provider for... 
    Remote work

    Tavrn

    San Francisco, CA
    1 day ago
  •  ...company founded in 2017 in Europe. We collect data where automation is not possible. We count features, take pictures, make videos,...  ...to accept new challenges. Role Overview: The Shift Lead is responsible for organizing shift schedules, ensuring shift adherence... 
    Flexible hours
    Shift work
    Weekend work
    Afternoon shift
    Early shift

    TSMG

    San Francisco, CA
    1 day ago
  • $132.5k - $338.3k

     ...We Are: Accenture is a leading global professional services company that helps the world's leading businesses, governments and other...  ...Are: The Red Hat Offering Lead - Private AI, Containers & Automation is accountable for shaping, defining, and scaling Accenture's... 
    Work experience placement
    Live in
    Work at office
    Local area

    Accenture

    San Francisco, CA
    2 days ago
  •  ...About the Role As the Lead Product Manager, Agentic AI on Hinge Health's Intelligent Care...  ...Hinge Health's vision towards automating the delivery of care, and work with a dedicated...  ...You'll partner deeply with Engineering, ML Scientists, Design, Data Science, Clinical... 
    Work at office
    Local area
    3 days per week

    Hinge Health

    San Francisco, CA
    2 days ago
  • $164.7k - $266k

     ...lifecycle management (CLM). What you'll do We are seeking a Lead AI Architect to turn enterprise data, metadata, relationships,...  ...intelligence, discovery, and semantic retrieval Partner with AI/ML, product, and engineering teams to operationalize AI copilots,... 
    Contract work
    Work at office
    Local area
    Remote work
    2 days per week

    DocuSign

    San Francisco, CA
    5 days ago
