Senior Kubernetes Platform Engineer - AI/ML Infrastructure

$137k - $200.5k

Cisco

The application window is expected to close on: 06/12/2026

The job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.

Senior Kubernetes Platform Engineer - AI/ML Infrastructure - hybrid (2013054)

*** Hybrid role: requires some on-site work at the Research Triangle Park, NC; Dallas, TX; or Allen, TX office.

Join our Platform Engineering team to design, build, and operate large-scale, on-prem Kubernetes infrastructure powering next-generation AI/ML platforms, including GPU-enabled environments for both traditional ML and state-of-the-art LLM workloads.

You will be pivotal in defining and evolving a highly scalable Kubernetes platform that serves as the foundation for AI/ML workloads. This role combines deep Kubernetes platform engineering with AI/ML infrastructure enablement, ensuring performance, reliability, and scalability across distributed systems.

You will lead technical direction across Kubernetes control plane operations, cluster lifecycle management, and platform extensibility, while working closely with data scientists, ML engineers, and infrastructure teams to support production AI workloads at scale.

This is a senior individual contributor role focused on platform ownership, engineering excellence, and driving reliability and automation across complex distributed environments.

Your Impact / Core Responsibilities

  • Architect, build, and operate large-scale on-prem Kubernetes platforms (OpenShift/Anthos), including control plane and etcd lifecycle management

  • Define and evolve scalable, multi-tenant platform architecture supporting AI/ML and GPU-based workloads

  • Enable and optimize ML workloads including training, inference, and LLM deployment pipelines on Kubernetes

  • Build platform extensions using Kubernetes controllers, operators, CRDs, and Golang-based services

  • Implement Infrastructure as Code and automation to improve scalability, consistency, and operational efficiency

  • Drive AIOps capabilities using telemetry, automation, anomaly detection, and self-healing systems for platform reliability

  • Improve observability (metrics, logs, traces) and optimize resource utilization, scheduling, and cluster performance

  • Partner with ML engineers and data scientists to operationalize ML workflows and ensure platform readiness for AI workloads

  • Participate in on-call rotations, owning incident response, reliability, and continuous operational improvement

  • Mentor engineers and contribute to defining platform engineering standards and best practices

Minimum Qualifications

  • 8+ years of software engineering experience

  • 4+ years of hands-on Kubernetes production experience with control plane ownership

  • Strong experience operating on-prem or self-managed Kubernetes environments

  • Deep expertise in etcd management (backup, restore, recovery, upgrades)

  • Strong proficiency in Go with experience building Kubernetes controllers, operators, CRDs, and webhooks

  • Deep understanding of Kubernetes internals (API server, scheduler, controller loops, reconciliation)

  • Experience supporting AI/ML or GPU-based workloads on Kubernetes platforms

  • Proven experience operating and debugging large-scale distributed systems

  • Experience participating in on-call rotations and production incident management

Preferred Qualifications

  • Experience with bare-metal or enterprise on-prem infrastructure at scale

  • Exposure to AI/ML platforms and tooling (e.g., Kubeflow, MLflow, distributed training systems)

  • Experience building internal developer platforms or platform-as-a-service (PaaS) systems

  • Familiarity with AIOps concepts such as automated remediation and predictive operations

  • Experience applying data-driven or ML-based techniques for system reliability or capacity planning

  • Contributions to Kubernetes, CNCF, or other open-source ecosystems

Why Cisco?

At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.

Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.

We are Cisco, and our power starts with you.

Message to applicants applying to work in the U.S. and/or Canada:

The starting salary range posted for this position is $137,000.00 to $200,500.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation*, equity, or benefits.

Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.

U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.

U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:

  • 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees

  • 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco

  • Non-exempt employees** receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees

  • Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)

  • 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next

  • Additional paid time away may be requested to deal with critical or emergency issues for family members

  • Optional 10 paid days per full calendar year to volunteer

For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.

Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan. For quota-based incentive pay, Cisco typically pays as follows:

  • 0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;

  • 1.5% of incentive target for each 1% of attainment between 50% and 75%;

  • 1% of incentive target for each 1% of attainment between 75% and 100%; and

  • Once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.
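To make the tiers above concrete, here is a minimal sketch of the arithmetic. It assumes the rates are marginal (each rate applies only to attainment within its own band), which is an interpretation, not something the posting states explicitly. The function name `quota_payout_pct` and the `above_quota_rate` parameter are hypothetical. The posting only says the rate above 100% is "at or above 1%", so the default of 1.0 is a lower bound, not a published figure.

```python
def quota_payout_pct(attainment_pct: float, above_quota_rate: float = 1.0) -> float:
    """Percent of incentive target earned for a given percent of quota attained.

    Assumption: tiers are marginal, so each rate applies only inside its band.
    """
    # (upper bound of band in attainment %, % of target earned per 1% attained)
    tiers = [(50.0, 0.75), (75.0, 1.5), (100.0, 1.0)]
    earned = 0.0
    lower = 0.0
    for upper, rate in tiers:
        if attainment_pct <= lower:
            break
        earned += (min(attainment_pct, upper) - lower) * rate
        lower = upper
    # Above quota the posting gives only a floor ("at or above 1%"), with no cap.
    if attainment_pct > 100.0:
        earned += (attainment_pct - 100.0) * above_quota_rate
    return earned


if __name__ == "__main__":
    for pct in (50, 75, 100, 120):
        print(pct, quota_payout_pct(pct))
```

Under this reading, 50% attainment earns 37.5% of target, 75% earns 75%, and 100% earns exactly 100%, which suggests the marginal interpretation is at least internally consistent with the listed rates.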

For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.

The applicable full salary ranges for this position, by specific state, are listed below:

New York City Metro Area:

$165,000.00 - $277,600.00

Non-Metro New York state & Washington state:

$146,700.00 - $247,000.00

* For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.

** Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.

Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis.

Cisco will consider for employment, on a case by case basis, qualified applicants with arrest and conviction records.

Vacancy posted 7 days ago