Forward Deployed Engineer, AI Inference (vLLM and Kubernetes)

$184.94k - $342.49k

Red Hat

Overview

The vLLM and LLM-D Engineering team at Red Hat is looking for a customer-obsessed developer to join our team as a Forward Deployed Engineer. In this role, you will not just build software; you will be the bridge between our cutting-edge inference platform (LLM-D and vLLM) and our customers' most critical production environments. You will interface directly with our customers' engineering teams to deploy, optimize, and scale distributed Large Language Model (LLM) inference systems. You will solve "last mile" infrastructure challenges that defy off-the-shelf solutions, ensuring that massive models run with low latency and high throughput on complex Kubernetes clusters. This is not a sales engineering role; you will be part of the core vLLM and LLM-D engineering team.

What You Will Do

Orchestrate Distributed Inference: Deploy and configure LLM-D and vLLM on Kubernetes clusters. Set up and configure advanced deployment patterns such as disaggregated serving, KV-cache-aware routing, and KV-cache offloading to maximize hardware utilization.

Optimize for Production: Go beyond standard deployments by running performance benchmarks, tuning vLLM parameters, and configuring intelligent inference routing policies to meet SLOs for latency and throughput. Focus on Time Per Output Token (TPOT), GPU utilization, GPU networking optimizations, and Kubernetes scheduler efficiency.

Code Side-by-Side: Work directly with customer engineers to write production-quality code (Python/Go/YAML) that integrates our inference engine into their existing Kubernetes ecosystem.

Solve the "Unsolvable": Debug complex interaction effects between specific model architectures (e.g., MoE, large context windows), hardware accelerators (NVIDIA GPUs, AMD GPUs, TPUs), and Kubernetes networking (Envoy/Istio).

Feedback Loop: Act as "Customer Zero" for our core engineering teams. Channel field learnings back to product development, influencing the roadmap for LLM-D and vLLM features.
Travel: Travel only as needed to customer sites to present, demo, or help execute proofs of concept.

What You Will Bring

8+ Years of Engineering Experience: You have a track record of 8+ years in Backend Systems, SRE, or Infrastructure Engineering.

Customer Fluency: You speak both "Systems Engineering" and "Business Value".

Bias for Action: You prefer rapid prototyping and iteration over theoretical perfection, and you are comfortable operating in ambiguity and taking ownership of the outcome.

Deep Kubernetes Expertise: Fluent in K8s primitives, from defining custom resources (CRDs, Operators, Controllers) to configuring modern ingress via the Gateway API. Experience with stateful workloads and high-performance networking, including tuning scheduler logic (affinity/tolerations) for GPU workloads and troubleshooting complex CNI failures.

AI Inference Proficiency: Understand how an LLM forward pass works. Know KV caching, prefill/decode disaggregation, context length impacts, and how continuous batching works in vLLM.

Systems Programming: Proficiency in Python (model interfaces) and Go (Kubernetes controllers/scheduler logic).

Infrastructure as Code: Experience with Helm, Terraform, or similar tools for reproducible deployments.

Cloud & GPU Hardware Fluency: Comfortable spinning up clusters and deploying LLMs on bare-metal and hyperscaler Kubernetes clusters.

The following are considered a plus:

Experience contributing to open-source AI infrastructure projects (e.g., KServe, vLLM, Kubernetes)
Knowledge of Envoy Proxy or Inference Gateway (IGW)
Familiarity with model optimization techniques like quantization (AWQ, GPTQ) and speculative decoding

Compensation and Benefits

The salary range for this position is $184,940.00 - $342,490.00. The actual offer will be based on your qualifications.
Pay Transparency: Red Hat determines compensation based on several factors including but not limited to job location, experience, applicable skills and training, external market value, and internal pay equity. Annual salary is one component of Red Hat's compensation package. This position may also be eligible for bonus, commission, and/or equity. For positions with Remote-US locations, the actual salary range for the position may differ based on location but will be commensurate with job duties and relevant work experience.

About Red Hat

Red Hat is the world's leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.

Benefits

Comprehensive medical, dental, and vision coverage
Flexible Spending Account - healthcare and dependent care
Health Savings Account - high deductible medical plan
Retirement 401(k) with employer match
Paid time off and holidays
Paid parental leave plans for all new parents
Leave benefits including disability, paid family medical leave, and paid military leave
Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!

Note: These benefits are only applicable to full-time, permanent associates at Red Hat located in the United States.
Inclusion and Equal Opportunity

Equal Opportunity Policy (EEO): Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.

Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email View email address on click.appcast.io. General inquiries regarding the status of a job application will not receive a reply.

Vacancy posted 1 day ago

