Lead Cloud Engineering and Production Operations Engineer

Qode

Job Description

About Incedo:

Incedo is a global AI and data transformation specialist empowering companies to realize sustainable business impact from their digital investments. As a long-term partner from strategy to execution, we operate at the intersection of business and technology. Our integrated services and platforms are built on the foundation of AI & Data, digital engineering, and operations transformation, bringing deep domain expertise and full-stack capabilities together. With over 4,000 people in the US, Canada, Latin America, and India and a large, diverse portfolio of Fortune 500 enterprises and fast-growing clients worldwide, we work across banking & payments, wealth management, telecom, hi-tech, and life sciences.

Please visit the link to learn more about Incedo:

Location: San Jose, CA

Title: Lead Cloud Engineering and Production Operations Engineer

Job Description:

This role acts as a hands-on technical lead, driving cloud engineering initiatives, automating infrastructure, and ensuring high-availability and performance across customer-facing systems. The Lead Engineer will collaborate with IT, DevOps, and Software Engineering teams to build secure, scalable environments that support continuous delivery and rapid innovation.

Reporting to the Associate Director of IT and Infrastructure, this position combines deep technical execution with mentoring responsibilities—balancing architectural vision with day-to-day operational excellence.

Key Responsibilities:

Cloud Infrastructure and Engineering

  • Design, deploy, and manage hybrid and cloud infrastructures (OCI, AWS, Azure, on-prem) to support production and enterprise systems
  • Implement infrastructure-as-code (IaC) using Terraform or CloudFormation to ensure repeatable, secure, and automated deployments
  • Develop and maintain CI/CD-ready environments that support rapid build, test, and release cycles for engineering teams
  • Partner with network and security teams to implement resilient, compliant architectures

Production Operations and Reliability

  • Serve as technical lead for production systems, ensuring stability, performance, and scalability
  • Establish monitoring, logging, and alerting frameworks to improve visibility and reduce mean time to detection (MTTD) and resolution (MTTR)
  • Participate in incident response, root cause analysis, and reliability improvement efforts
  • Collaborate with Engineering and SRE teams to define SLIs, SLOs, and performance metrics for critical services

Automation and CI/CD Enablement

  • Develop and enhance deployment pipelines (e.g., Jenkins, GitLab, ArgoCD) to automate software delivery and environment provisioning
  • Embed security, compliance, and testing gates into CI/CD workflows
  • Implement configuration management and orchestration tools such as Ansible, Chef, or Puppet to manage infrastructure at scale
  • Drive efficiency through self-healing systems, auto-scaling, and infrastructure automation

Operational Leadership and Collaboration

  • Lead day-to-day production operations activities, mentoring junior engineers on cloud and reliability best practices
  • Act as a technical bridge between Infrastructure, Security, and Application Engineering teams
  • Contribute to capacity planning, cost optimization, and production readiness reviews
  • Maintain documentation, runbooks, and standard operating procedures for production systems

Qualifications:

  • Bachelor’s degree in Computer Science, Information Systems, or equivalent experience
  • 7+ years of experience in cloud and infrastructure engineering, with at least 2–3 years in a lead or senior engineer capacity
  • Deep expertise in OCI (preferred), AWS, or Azure (networking, compute, storage, IAM, and monitoring)
  • Proven experience with production-scale operations and hybrid cloud deployments
  • Proficiency in:
    • Infrastructure-as-code (Terraform, CloudFormation)
    • CI/CD and DevOps pipelines (Jenkins, GitLab, ArgoCD)
    • Containers and orchestration (Kubernetes, Docker)
    • Observability tools (Datadog, Prometheus, Grafana, ELK)
    • Scripting languages (Python, Bash, PowerShell)
  • Strong troubleshooting skills and the ability to lead through high-impact incidents
  • Excellent communication and collaboration skills across cross-functional teams

Preferred Experience:

  • Experience supporting high-availability SaaS or production environments
  • Knowledge of FinOps, cloud governance, and cost optimization practices
  • Familiarity with DevSecOps principles, Zero Trust, and automated compliance frameworks
  • Exposure to AI/ML pipeline infrastructure or high-throughput data systems

AI Use Guidelines for Interviews: Our interviews are designed to reflect your own skills and thinking. The use of AI or recording tools during live interviews is not permitted unless explicitly invited by the interviewer or approved in advance as part of a reasonable accommodation. If these tools are used inappropriately or in a way that misrepresents your work, your application may not move forward in the process.

Hybrid

Targeted compensation guideline: Compensation will vary based on a number of factors, including market demand for specific skills, role type, job level, and individual qualifications. Final salary offers are determined by considerations including, but not limited to, subject matter expertise, demonstrated skill level, relevant experience, geographic location, education, certifications, and training.

Vacancy posted 4 days ago