Principal Site Reliability Engineer (Intelligent Automation)
$162.6k - $302k
Genentech
The Position

A healthier future. It’s what drives us to innovate. To continuously advance science and ensure everyone has access to the healthcare they need today and for generations to come. Creating a world where we all have more time with the people we love. That’s what makes us Roche.

Advances in AI, data and computational sciences are transforming drug discovery and development. Roche’s Research and Early Development organizations at Genentech (gRED) and Pharma (pRED) have demonstrated how these technologies accelerate R&D, leveraging data and novel computational models to drive impact. Seamless data sharing and access to models across gRED and pRED are essential to maximising these opportunities.

The Computational Sciences Center of Excellence (CS CoE) is a strategic, unified group whose goal is to harness the transformative power of data and Artificial Intelligence (AI) to assist our scientists in both pRED and gRED to deliver more innovative and life-changing medicines for patients worldwide.

Within the CS CoE organisation, the Data and Digital Catalyst (DDC) organization leads the modernization of our computational and data ecosystems by integrating digital technologies across Research and Early Development to empower stakeholders, advance data-driven science and accelerate decision-making. The Solutions team within the DDC organization develops modernized and interconnected computational and data ecosystems.

As a Site Reliability Engineer in the Solutions Engineering capability, you will work closely with our engineering colleagues to play a pivotal role in designing, implementing, and maintaining scalable, resilient, and supportable cloud-based platform solutions. The focus will be on enabling research applications, Machine Learning (ML) workloads, and HPC environments through automation, efficient resource management, and Infrastructure as Code (IaC) tooling.

As a member of the DDC team you will help mature the scalable platforms that help unlock the potential of our diverse scientific data, accelerating the discovery and development of life-changing treatments for patients.

The Opportunity

- Architect and implement IaC solutions using tools like Terraform, Spacelift, or CloudFormation to provision and manage cloud infrastructure for ML and HPC workloads.
- Automate the deployment of scalable ML pipelines, HPC clusters, and supporting services across global regions.
- Architect resilient and highly available solutions for ML and HPC workloads using cloud-native practices such as auto-scaling, load balancing, and failover mechanisms.
- Implement disaster recovery (DR) and business continuity plans for critical systems to ensure global operational integrity.
- Conduct chaos engineering experiments to validate system reliability and identify potential weaknesses.
- Develop automation scripts and workflows to streamline infrastructure management, deployment, and scaling for ML and HPC use cases.
- Implement robust monitoring, logging, and alerting frameworks using tools like Prometheus, Grafana, Datadog, or ELK Stack to provide deep insights into system health and performance.
- Knowledge of AIOps incident management, processes, and tooling.
- Provide technical leadership to a team of engineers, fostering a culture of collaboration, innovation, and continuous improvement.
- Partner with cross-functional teams to align infrastructure solutions with business objectives and ML/HPC workload requirements.
- Mentor and train junior engineers in IaC practices, ML, and HPC infrastructure design.
- Monitor and optimize cloud infrastructure usage and costs for ML and HPC workloads.
- Ensure compliance with organizational security, governance, and regulatory policies in all IaC and cloud implementations.

Who You Are

- Bachelor’s or Master’s degree in Computer Science or similar technical field, or equivalent experience, and 7+ years of experience in software engineering or Site Reliability Engineering (SRE).
- Proven expertise in supporting and deploying IaC solutions in cloud environments (AWS, Azure, or GCP) for ML and HPC workloads.
- Background in MLOps pipelines, including model versioning, CI/CD for ML, and feature store integration, as well as experience with managed ML services (e.g., AWS SageMaker, Google AI Platform, or Azure ML).
- Deep understanding of cloud-native architectures, including autoscaling, serverless, and multi-region deployments.

Technical Skills:
- Advanced proficiency with IaC tools: Terraform, Pulumi, or CloudFormation.
- Expert in scripting and automation: Python, Bash, or Go.
- Strong understanding of GPU-accelerated computing (e.g., NVIDIA CUDA, TensorFlow) and HPC workload scaling.
- Knowledge of distributed systems, storage solutions, and data pipelines.
- Familiar with monitoring and observability tools: Prometheus, Grafana, Datadog, or similar.

Soft Skills:
- Strong problem-solving skills, with a methodical approach to troubleshooting.
- Excellent communication, leadership, and mentoring abilities.
- Ability to work collaboratively across teams in a fast-paced, dynamic environment.

Preferred Qualifications

- Certifications in cloud platforms (e.g., AWS Certified Solutions Architect, GCP Professional Cloud Architect, or Azure Solutions Architect).
- Experience with distributed ML frameworks and data engineering pipelines (e.g., Horovod, TensorFlow Distributed, Apache Airflow, Apache Spark).
- Experience with compliance frameworks (e.g., GDPR, SOC 2, ISO 27001).

Onsite presence, on our South San Francisco campus, is expected for at least 3 days a week.

Relocation benefits are not available for this job posting.

The expected salary range for this position based on the primary location of California is $162,600 - $302,000.
Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. A discretionary annual bonus may be available based on individual and Company performance. This position also qualifies for the benefits detailed at the link provided below.

Benefits

Genentech is an equal opportunity employer. It is our policy and practice to employ, promote, and otherwise treat any and all employees and applicants on the basis of merit, qualifications, and competence. The company's policy prohibits unlawful discrimination, including but not limited to discrimination on the basis of protected veteran status or disability status, consistent with all federal, state, or local laws.

If you have a disability and need an accommodation in relation to the online application process, please contact us by completing the Accommodations for Applicants form.


