Principal III, SRE
Herbalife
Overview
THE ROLE
The SRE Principal Engineer III will work a hybrid schedule, with a requirement to be onsite at our Torrance, CA facility at least two days per week or more if needed, while also having the flexibility to work remotely. This role is responsible for leading, designing, and implementing robust Site Reliability Engineering (SRE) practices to ensure high availability, scalability, and resilience of critical business systems and applications. The SRE Principal Engineer III will focus on improving system reliability through automation, monitoring, and performance tuning, working closely with development and operations teams to champion a culture of continuous improvement and operational excellence.
The SRE team consists of:
• SRE Engineers
• Deployment Automation
• Incident Response and Postmortem Analysis
• Observability and Monitoring
This role will drive the adoption of best practices in multi-cloud and hybrid-cloud platforms, managing services from major cloud providers like Microsoft Azure, Amazon AWS, Oracle OCI, Google GCP, and Alibaba Cloud. The SRE Principal Engineer III will focus on automation, incident management, performance monitoring, and optimizing infrastructure to support scalable, reliable systems. The position will also be responsible for fostering collaboration between development, operations, and security teams to streamline system operations across the organization.
HOW YOU WOULD CONTRIBUTE:
• Lead the implementation and optimization of SRE practices, ensuring system reliability, performance, and scalability.
• Architect and maintain automation for infrastructure provisioning, deployment, and incident response.
• Establish and implement SLOs (Service Level Objectives) and SLIs (Service Level Indicators) for key services.
• Collaborate with development teams to design and deliver reliable software systems, ensuring that production environments are optimized for uptime and performance.
• Create and maintain monitoring, alerting, and observability solutions to provide real-time insights into system health and performance.
• Respond to production incidents, perform root cause analysis, and implement corrective measures to prevent recurrence.
• Continuously improve system performance, capacity planning, and reliability through infrastructure tuning and automation.
• Facilitate post-incident reviews, fostering a blameless culture that focuses on learning from incidents.
• Collaborate with security teams to ensure infrastructure meets compliance, security standards, and best practices.
• Champion a collaborative environment across development, operations, and security teams to enhance operational efficiency and knowledge sharing.
• Drive the adoption of automation tools and frameworks to minimize manual intervention and optimize systems.
Qualifications
Skills Required:
• Proven expertise in SRE practices, with a focus on automation, incident management, observability, and infrastructure scalability.
• Extensive knowledge of cloud platforms (Azure, AWS, GCP, Alibaba) and hybrid-cloud environments, with a focus on reliability and performance optimization.
• Experience with automation tools and scripting languages, such as Python, Go, Terraform, or Ansible, for leading infrastructure and incident response.
• Strong understanding of containerization (Docker, Kubernetes) and orchestration systems.
• Solid grasp of monitoring and observability tools (Prometheus, Grafana, Dynatrace, Splunk) to ensure real-time system health monitoring.
• Expertise in capacity planning, performance tuning, and failure management techniques.
• Strong background in incident management, root cause analysis, and postmortem processes to improve system resilience.
• Deep understanding of security and compliance requirements, and the ability to ensure production environments meet industry standards.
• Experience with Agile and DevOps methodologies to ensure fast, reliable delivery of services.
Experience Required:
• 10+ years of experience in IT, with a focus on SRE, DevOps, or infrastructure engineering roles.
• Extensive hands-on experience with cloud infrastructure management and automation tools such as Terraform, CloudFormation, or equivalent.
• Proficiency in scripting and automation languages like Python, Bash, Go, or Ruby for infrastructure automation.
• Proven experience in managing large-scale systems, ensuring reliability, high availability, and scalability.
• Expertise in container orchestration technologies, including Kubernetes, OpenShift, and Docker Swarm.
• Deep knowledge of monitoring and observability platforms (Prometheus, Grafana, ELK, Dynatrace), including experience building and maintaining alerting and dashboard systems.
• Strong understanding of version control systems and CI/CD practices to optimize code deployment as it relates to infrastructure.
• Demonstrated ability to optimize performance in multi-cloud and hybrid-cloud environments, ensuring uptime and performance at scale.
Education Required:
• Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.
Certificates / Training Preferred:
• Relevant cloud certifications such as AWS Certified Solutions Architect, Azure Solutions Architect Expert, or Google Cloud Professional Cloud Architect.
• SRE-related certifications like Certified Kubernetes Administrator (CKA) or Google Professional Cloud DevOps Engineer.
US Benefits Statement
Herbalife offers a variety of benefits to eligible employees in the U.S. (limited to the 50 States and the District of Columbia), which include Group Health Programs, other Voluntary Benefit Programs, and Paid Time Off. Group Health Programs include Medical, Dental, Vision, Health Savings Account (HSA), Flexible Spending Accounts (FSA), Basic Life/AD&D, Short-Term and Long-Term Disability, and an Employee Assistance Program (EAP). Other Voluntary Benefit Programs include a 401(k) plan, Wellness Incentive Program, Employee Stock Purchase Plan (ESPP), Supplemental Life/Critical Illness/Hospitalization/Accident Insurance, and Pet Insurance. Paid time off includes Company-observed U.S. Holidays, Floating Holidays, Vacation, Sick Time, a Volunteer Program, Paid Maternity and Paternity Leave, Bereavement Leave, Personal Leave, and time off for voting.
Vacancy posted 1 day ago


