Senior DevOps Engineer | AI/Kubernetes/Python
$134.8k - $229.2k
KLA
Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA focuses more than average on innovation and we invest 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world's leading technology providers to accelerate the delivery of tomorrow's electronic devices. Life here is exciting and our teams thrive on tackling really hard problems. There is never a dull moment with us.
Group/Division
The KLA Services team headquartered in Milpitas, CA is our service organization that consists of Service Sales and Marketing, Spares Supply Chain management, Field Operations, Engineering, Product Training, and Technical Support. The KLA Services organization partners with our field teams and customers in all business sectors to maintain the high performance and productivity of our products through a flexible portfolio of services. Our comprehensive services include: proactive management of tools to identify and improve performance; expertise in optics, image processing and motion control with worldwide service engineers, 24/7 technical support teams and knowledge management systems; and an extensive parts network to ensure worldwide availability of parts.
Job Description/Preferred Qualifications
We seek a highly skilled and passionate Senior AI Ops Engineer to join our team. This role will be pivotal in architecting and delivering the automation layer that enables fast, reproducible, and scalable model development, spanning end-to-end experiment management, model fine-tuning pipelines, and Reinforcement Learning with Human Feedback (RLHF). We encourage you to apply if you're a systems-minded engineer who loves turning research workflows into reliable production-grade pipelines, setting standards, and mentoring others to raise the bar across the organization.
Key Responsibilities:
- Implement and operate experiment tracking, lineage, and reproducibility standards (datasets, code, configs, artifacts, metrics) using MLflow/W&B or equivalents.
- Build CI/CD for ML: tests (unit/integration), packaging, reproducibility checks, policy gates, automated deployment and rollback strategies.
- Design workflow orchestration for large-scale ML jobs (scheduled runs, triggered retrains, parameter sweeps, gated releases) using tools such as Airflow/Kubeflow/Argo or equivalents.
- Architect, build, and own automated pipelines for model training, fine-tuning (e.g., PEFT/LoRA), evaluation, and promotion across environments (dev → staging → production).
- Establish standardized training "recipes" (configs, templates, golden paths) to reduce time-to-first-experiment and improve consistency across teams.
- Enable and optimize distributed GPU training (throughput, reliability, and cost), including checkpointing, mixed precision, fault tolerance, and spot/preemptible handling where applicable.
- Develop evaluation harnesses and automated benchmark suites (quality, safety, latency, and cost) with clear, repeatable reporting to compare runs and releases.
Qualifications:
- Strong proficiency in Python and experience building robust automation frameworks and production-grade services for ML workloads.
- Hands-on experience with experiment tracking and model lifecycle tooling (e.g., MLflow, Weights & Biases) and reproducible ML workflows.
- Practical experience fine-tuning modern deep learning models (e.g., Transformers) and familiarity with parameter-efficient approaches (LoRA/PEFT).
- Working knowledge of RLHF concepts and pipelines (preference data, reward models, policy optimization) and how to operationalize human-in-the-loop workflows.
- Experience with containerization (Docker), orchestration (Kubernetes), and operating GPU workloads reliably at scale.
- Experience with CI/CD, version control (Git), and Infrastructure-as-Code (Terraform/Bicep or equivalent).
- Excellent problem-solving skills across distributed systems (training jobs, pipelines, compute infrastructure) and strong communication to partner with research and engineering teams.
- Prior experience in a similar industry and/or operating ML platforms with stringent IP/security requirements is a plus.
- Bachelor's degree in Computer Science, Software Engineering, or related field.
- 5+ years of experience in MLOps/Platform Engineering/DevOps/ML Engineering (or demonstrated equivalent impact), including owning production systems and leading cross-team initiatives.
Minimum Qualifications
Master's Level Degree and related work experience of 6 years; OR Bachelor's Level Degree and related work experience of 8 years; OR equivalent work experience
Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level, and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum wage rates, location, job-related skills, experience, and relevant education level or training. We are committed to complying with all applicable federal and state minimum wage requirements. If applicable, your recruiter can share more about the specific pay range for your preferred location during the hiring process.
KLA is proud to be an Equal Opportunity Employer. We will ensure that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
Be aware of potentially fraudulent job postings or suspicious recruiting activity by persons who are posing as KLA employees. KLA never asks for any financial compensation to be considered for an interview, to become an employee, or for equipment. Further, KLA does not work with any recruiters or third parties who charge such fees either directly or on behalf of KLA. Please ensure that you have searched KLA's Careers website for legitimate job postings. KLA follows a recruiting process that involves multiple interviews, in person or by video conference, with our hiring managers. If you are concerned that a communication, an interview, an offer of employment, or an employee is not legitimate, please send us an email to confirm that the person you are communicating with is an employee. We take your privacy very seriously and handle your information confidentially.

