Kubernetes Platform Engineer - AI Infrastructure

Webex Events (formerly Socio)

Kubernetes Platform Engineer - AI Infrastructure - Hybrid

Join our Platform Engineering team to design, build, and operate large-scale, on-prem Kubernetes infrastructure powering next-generation AI/ML platforms, including GPU-enabled environments for traditional models and LLMs. You will lead the technical direction of scalable, reliable systems, managing the Kubernetes control plane and extending platform capabilities through custom controllers and operators. You'll architect ML platforms, implement Infrastructure as Code with Golang, and drive MLOps best practices. Partnering closely with data scientists and ML engineers, you'll enable high-performance AI workloads while leveraging AIOps for automation and reliability. This role requires strong hands-on on-prem Kubernetes experience and offers opportunities to mentor engineers and influence platform strategy in a hybrid environment.

Your Impact / Responsibilities
  • Design, build, and operate large-scale on-prem Kubernetes platforms (OpenShift/Anthos), with ownership of the control plane, etcd, and cluster lifecycle.
  • Architect scalable, multi-tenant platform infrastructure as the foundation for AI/ML and GenAI workloads.
  • Enable and optimize AI/ML workloads, including GPU-based environments for training, inference, and model deployment.
  • Partner with data scientists and ML engineers to onboard and scale ML pipelines and workflows.
  • Build platform capabilities using Kubernetes controllers, operators, CRDs, and Golang/Python services.
  • Implement Infrastructure as Code, automation, and AIOps-driven self-healing using platform telemetry and observability.
  • Ensure reliability through performance tuning (scheduling, resource utilization) and participate in on-call support and incident response.

Minimum Qualifications
  • 5+ years of software engineering experience, including supporting AI/ML or GPU-based workloads on Kubernetes platforms
  • 3+ years operating Kubernetes in production with control plane ownership, preferably in on-prem or self-managed environments
  • Strong experience with etcd management (backup, restore, recovery) and Kubernetes cluster upgrades
  • Proficiency in Go with experience building Kubernetes controllers/operators, CRDs, and webhooks
  • Deep understanding of Kubernetes internals (API server, scheduler, controller loops, reconciliation patterns)
  • Proven ability to debug and operate large-scale distributed systems in production environments, including participation in on-call rotations

Preferred Qualifications
  • Experience with bare-metal or on-prem infrastructure at scale
  • Experience enabling or supporting GPU-based workloads in Kubernetes environments
  • Familiarity with AI/ML platforms, pipelines, or tooling (e.g., model training, inference, or orchestration)
  • Experience building internal developer platforms or platform-as-a-service (PaaS) capabilities
  • Exposure to AIOps, including automation, anomaly detection, or self-healing systems
  • Experience applying statistical or ML techniques to operational data for reliability, performance, or capacity planning
Vacancy posted 3 days ago