Director, Cloud Ops/Site Reliability

Decision Engines, Inc.

We are looking for an experienced Cloud Ops leader who will be responsible for operating what will be the world’s largest enterprise-grade intelligent business process automation platform. We are pioneering The Autonomous Enterprise by automating the work of millions of knowledge workers through the deployment of AI-Bots that conduct straight-through business processing.

KEY RESPONSIBILITIES

Experience with DevOps and build/release pipelines
Experience with provisioning distributed applications and service lifecycle management
Hands-on with Ansible, Terraform, PowerShell/Bash/Python, Docker, and Kubernetes
Experience with InfoSec certifications and remediation, and patch distribution
Experience with 24/7 site monitoring and ownership of uptime and performance SLAs
Real-world experience with disaster recovery protocols and processes
Has built and managed geographically distributed teams to operate a large-scale SaaS platform
Set standards and provide requirements for engineering teams to deliver ops-ready software

REQUIRED SKILLS AND EXPERIENCE

Qualified candidates will combine an undergraduate degree, professional experience in directly relevant technologies, and a demonstrated appetite and aptitude for ongoing skills development. Minimum qualifications are:

Bachelor’s Degree in an Engineering Field
3+ years as a Site Reliability Engineer or DevOps Engineer
5+ years as a Director of Cloud Ops
Has been responsible for uptime, upgrades, reliability, and operations of a SaaS platform
Built cloud ops teams from the ground up


Vacancy posted 13 hours ago

