
Forward Deployed Engineer, AI Inference (vLLM and Kubernetes)

$184.94k - $342.49k

Red Hat

Overview

The vLLM and LLM-D Engineering team at Red Hat is looking for a customer-obsessed developer to join our team as a Forward Deployed Engineer. In this role, you will not just build software; you will be the bridge between our cutting-edge inference platform (LLM-D and vLLM) and our customers' most critical production environments. You will interface directly with our customers' engineering teams to deploy, optimize, and scale distributed Large Language Model (LLM) inference systems. You will solve "last mile" infrastructure challenges that defy off-the-shelf solutions, ensuring that massive models run with low latency and high throughput on complex Kubernetes clusters. This is not a sales engineering role; you will be part of the core vLLM and LLM-D engineering team.

What You Will Do

  • Orchestrate Distributed Inference: Deploy and configure LLM-D and vLLM on Kubernetes clusters. Set up and configure advanced deployment patterns such as disaggregated serving, KV-cache-aware routing, and KV-cache offloading to maximize hardware utilization.
  • Optimize for Production: Go beyond standard deployments by running performance benchmarks, tuning vLLM parameters, and configuring intelligent inference routing policies to meet SLOs for latency and throughput. Focus on Time Per Output Token (TPOT), GPU utilization, GPU networking optimizations, and Kubernetes scheduler efficiency.
  • Code Side-by-Side: Work directly with customer engineers to write production-quality code (Python/Go/YAML) that integrates our inference engine into their existing Kubernetes ecosystem.
  • Solve the "Unsolvable": Debug complex interaction effects between specific model architectures (e.g., MoE, large context windows), hardware accelerators (NVIDIA GPUs, AMD GPUs, TPUs), and Kubernetes networking (Envoy/Istio).
  • Feedback Loop: Act as the "Customer Zero" for our core engineering teams. Channel field learnings back to product development, influencing the roadmap for LLM-D and vLLM features.
  • Travel: Travel only as needed to customers to present, demo, or help execute proof-of-concepts.

What You Will Bring

  • 8+ Years of Engineering Experience: You have a track record of 8+ years in Backend Systems, SRE, or Infrastructure Engineering.
  • Customer Fluency: You speak both "Systems Engineering" and "Business Value".
  • Bias for Action: You prefer rapid prototyping and iteration over theoretical perfection, and you are comfortable operating in ambiguity and taking ownership of the outcome.
  • Deep Kubernetes Expertise: Fluent in K8s primitives, from defining custom resources (CRDs, Operators, Controllers) to configuring modern ingress via the Gateway API. Experience with stateful workloads and high-performance networking, including tuning scheduler logic (affinity/tolerations) for GPU workloads and troubleshooting complex CNI failures.
  • AI Inference Proficiency: Understand how an LLM forward pass works. Know KV caching, prefill/decode disaggregation, context length impacts, and how continuous batching works in vLLM.
  • Systems Programming: Proficiency in Python (model interfaces) and Go (Kubernetes controllers/scheduler logic).
  • Infrastructure as Code: Experience with Helm, Terraform, or similar tools for reproducible deployments.
  • Cloud & GPU Hardware Fluency: Comfortable spinning up clusters and deploying LLMs on bare-metal and hyperscaler Kubernetes clusters.

The following are considered a plus:

  • Experience contributing to open-source AI infrastructure projects (e.g., KServe, vLLM, Kubernetes)
  • Knowledge of Envoy Proxy or Inference Gateway (IGW)
  • Familiarity with model optimization techniques like Quantization (AWQ, GPTQ) and Speculative Decoding

Compensation and Benefits

The salary range for this position is $184,940.00 - $342,490.00. Actual offer will be based on your qualifications.
Pay Transparency: Red Hat determines compensation based on several factors including but not limited to job location, experience, applicable skills and training, external market value, and internal pay equity. Annual salary is one component of Red Hat’s compensation package. This position may also be eligible for bonus, commission, and/or equity. For positions with Remote-US locations, the actual salary range for the position may differ based on location but will be commensurate with job duties and relevant work experience.

About Red Hat

Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!

Note: These benefits are only applicable to full time, permanent associates at Red Hat located in the United States.
Inclusion and Equal Opportunity

Equal Opportunity Policy (EEO): Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.

Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email View email address on click.appcast.io. General inquiries regarding the status of a job application will not receive a reply.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Forward Deployed Engineer, AI Inference (vLLM and Kubernetes) in Seattle, WA vacancy
  •  ...Red Hat, LLC is seeking a Forward Deployed Engineer to enhance their LLM-D and vLLM platforms. You will be responsible for deploying and optimizing distributed inference systems on Kubernetes, working closely with customer teams. The ideal candidate has extensive experience... 
    Suggested

    Red Hat

    Seattle, WA
    13 hours ago
  • $135k - $200k

     ...missing children, and more. The Role We are seeking a Forward Deployed Software Engineer to join a newly-formed team focused on developing advanced...  ...Autonomy C2 solutions into Palantir platform and with AI and autonomy software solutions such sensor and data fusion... 
    Suggested
    Work experience placement
    Work at office
    Remote work
    Work from home
    Relocation package

    Palantir Technologies

    Seattle, WA
    2 days ago
  •  ...on a mission to reinvent AI inference infrastructure from the...  ...Infrastructure Software Engineer to own and evolve the cloud and Kubernetes backbone behind our Token...  ...scale predictably, and deploy seamlessly across managed...  ...runtimes (e.g., Triton, vLLM, TGI) and model-serving... 
    Suggested
    Work at office
    Flexible hours
    3 days per week

    ElastixAI INC.

    Seattle, WA
    13 hours ago
  • Spice.ai is seeking a Forward Deployed Engineer to embed with customers in Seattle, WA, and optimize Spice.ai deployments. The role requires expertise in deploying data and AI systems, with 5+ years of engineering experience preferred. Excellent communication and SQL skills... 
    Suggested

    spice.ai

    Seattle, WA
    13 hours ago
  • $164k - $313.3k

     ...Systems & Efficiency Engineer to join our R&D...  ...in inference performance, latency...  ...Intelligence (AI), ML systems, and...  ...aware ML systems deployed in production....  ...(e.g., Triton, vLLM, SGLang, xDiT,...  ...workflows (Docker, Kubernetes) and job...  ...application may not move forward in the process.... 
    Suggested
    Temporary work
    Local area
    Worldwide

    Adobe

    Seattle, WA
    2 days ago
  • $171.6k - $258.1k

     ...States Software and Services AI systems are only as...  ...foundational. Join Apple Services Engineering to build the next generation...  ...interaction. We are looking for a Lead Forward Deployed Engineer (FDE) to lead the...  ...(datasets, training vs. inference, evaluation metrics) and can... 
    Relocation

    Apple Inc.

    Seattle, WA
    1 day ago
  •  ...About the team OpenAI’s Forward Deployed Engineering team partners with customers to turn research breakthroughs into production systems. We operate...  ...that solves their problem. About OpenAI OpenAI is an AI research and deployment company dedicated to ensuring that... 
    Internship
    Work at office
    Relocation package

    Centaur Labs

    Seattle, WA
    14 hours ago
  •  ...logistics and operational decision-making through AI-enabled predictive software. Built by a team combining elite software engineering talent with deep operational domain...  .... The Opportunity We are seeking Forward Deployed Software Engineers in Seattle, New York City... 

    HRB

    Seattle, WA
    14 hours ago
  • $140k - $165k

     ...Powered by RapidSOS HARMONY, the industry’s first purpose‑built AI for public safety, RapidSOS empowers first responders with...  ...lives. Learn more at What This Role Is About The Senior Forward Deployed Engineer (FDE) is a senior, client‑facing technical expert who serves... 
    Flexible hours

    RapidSOS

    Seattle, WA
    1 day ago
  •  ...Building data-driven AI applications and agents...  ...piece together query engines, search systems, caches...  ...and fully managed cloud deployments. For a deeper dive...  ...This role requires a Forward Deployed Engineer to embed...  ..., search, and AI inference. Identify expansion... 
    Work at office
    Remote work

    Spice AI

    Bellevue, WA
    20 days ago
  •  ...Job Description Aircall is a unicorn, AI-powered customer communications platform used by 22,000+ companies...  ...’ll feel at home here. About the Team Aircall’s Forward Deployed Engineering team connects our innovative AI Agents technology to real-... 
    Full time
    Worldwide

    Aircall

    Seattle, WA
    15 days ago
  • HRB is looking for Forward Deployed Software Engineers to join their mission-driven teams in New York City. This role offers significant responsibility...  ...Computer Science. Join a team committed to operational excellence and cutting-edge AI solutions.

    HRB

    Seattle, WA
    4 days ago
  • $152k - $240k

    Software Engineer, Forward Deployed Agent Builder. Seattle, Washington, United States. Engineering...  ...builders become leaders. What you’ll do: We're building AI agents to automate and augment internal functions at Brex,... 
    Work at office
    Remote work
    Work from home

    Brex

    Seattle, WA
    13 hours ago
  • $242k - $290k

     ...Model Optimization & Deployment Engineer The Perception team is pioneering the development of...  ...CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic...  ...and maximize memory bandwidth on AI accelerators. Write production-level... 
    Temporary work
    Relocation package

    Zoox

    Seattle, WA
    3 days ago
  • Accenture is seeking a Forward Deployed Scientific Engineer in Kirkland, Washington. This role involves leveraging AI tools and scientific informatics expertise to develop solutions for laboratory scientists. Candidates must have at least 3 years of experience and a Bachelor... 

    Accenture

    Kirkland, WA
    2 days ago
  • $85.88k - $193.43k

    KPMG Careers is seeking a Technical Developer, Forward Deployed Engineer in Seattle to collaborate with elite AI-native full-stack engineers. The role requires at least 7 years of experience in secure software production, with a focus on data and AI applications. You'll... 

    KPMG Careers

    Seattle, WA
    4 days ago
  • $150k - $190k

     ...service workflows. Join our Agent Deployment team to play a critical role in...  ...customer interactions every day. As a Forward Deployed AI Engineer, you'll directly influence PolyAI's...  ...microservice deployments using Docker and Kubernetes. ~ Native-level proficiency in... 
    Work from home
    Flexible hours

    PolyAI

    Seattle, WA
    18 days ago
  •  ...ElastixAI INC. in Seattle seeks an Inference Infrastructure Software Engineer to manage the cloud and Kubernetes backbone behind their Token-as-a-Service platform. The ideal...  ...the opportunity to work at the forefront of AI technology in a collaborative environment. #J-1... 

    ElastixAI INC.

    Seattle, WA
    14 hours ago
  •  ...caliber Site Reliability Engineer (SRE) to join our Forward Engineering team....  ..., data-driven AI platforms remain resilient...  ...strategies for Kubernetes (GKE) to handle fluctuating...  ...and high-volume inference. 2. MLOps & AI Infrastructure...  ...and optimize robust deployment pipelines for both... 
    Local area

    Tiger Analytics

    Seattle, WA
    14 hours ago
  • UiPath is looking for a Director of Forward Deployed Engineering in Bellevue, Washington. This role leads a senior team responsible for defining strategic customer engagements and scaling AI and automation practices. Candidates should have 12+ years of software engineering... 

    UiPath

    Bellevue, WA
    1 day ago
  •  ...warehouses, vehicles, and field deployments. When robots fail, behave...  ...unexpectedly, or need to be improved, engineers rely on data to understand...  ...and highly adaptable Forward‑Deployed Engineer to join our...  ...generation of robotics and embodied AI. Team: Work with world‑class... 
    Remote work

    Foxglove

    Seattle, WA
    13 hours ago
  • $171k - $311k

     ...as passionate about your future as we are, join our team. KPMG is currently seeking a Technical Lead Manager, Forward Deployed Engineering to join our AI & Data Labs practice. Responsibilities: Lead a pod of elite, AI-native full-stack engineers with a bias to... 
    H1b
    Local area

    KPMG

    Seattle, WA
    1 day ago
  • Forward Deployed Engineer About the Role Ravenna is looking for a Forward Deployed Engineer to work directly with customers and ensure they are successful using our platform. This role sits at the intersection of engineering, product, and customer success. You will partner... 
    Immediate start
    Flexible hours

    RAVENNA

    Seattle, WA
    2 days ago
  • RAVENNA is looking for a Forward Deployed Engineer in Seattle to drive customer success using their platform. This role involves leading customer deployments, building integrations, and troubleshooting system issues. Candidates should have at least three years of software... 
    Flexible hours

    RAVENNA

    Seattle, WA
    2 days ago
  • $126k - $220.5k

     ...safety. Please be aware that all official communication will only be sent from @ Rippling.com addresses. About the Role Forward Deployed Engineers (FDEs) at Rippling are customer‑facing software engineers who bridge the gap between the complex business problems of our... 
    Work at office
    3 days per week

    Rippling

    Seattle, WA
    4 days ago
  • About the team OpenAI’s Forward Deployed Engineering team partners with customers to turn research breakthroughs into production systems. We operate...  ...when the stakes are high About OpenAI OpenAI is an AI research and deployment company dedicated to ensuring that... 
    Work at office
    Relocation package

    Slope

    Seattle, WA
    4 days ago
  • This is Glover's first Forward Deployed Engineering hire. Palantir pioneered the FDE model by embedding engineers directly into the world's hardest...  ...Deep sense of ownership. You don't wait for instructions. AI native. LLMs and agents are core tools in your engineering... 

    Glover Labs

    Seattle, WA
    1 day ago
  • $320k

     ...About The Role Our mandate is to make inference deployment boring and unattended. Anthropic serves...  ...and unattended. As a Software Engineer on the Launch Engineering team, you will...  ...velocity and reliability Proficiency with Kubernetes‑based deployments, rolling update mechanics... 
    Visa sponsorship
    Shift work

    Menlo Ventures

    Seattle, WA
    14 hours ago
  • $148.2k - $300.96k

     ...About the Team The Inference Infrastructure team is...  ...maintainer of AIBrix, a Kubernetes-native control plane for...  ...external developers to bring AI workloads from research...  ..., and are looking for engineers passionate about cloud-...  ...solutions using vLLM, SGLang, TensorRT-LLM,... 
    Temporary work
    Local area

    ByteDance

    Seattle, WA
    2 days ago
  •  ...Cloud, is a leader in AI cloud infrastructure serving...  ...We are seeking a Staff Engineer to help our development of our Managed Kubernetes platform. Think GKE,...  ...of AI training and inference at scale. As a Staff Engineer...  ...load, and multi-model deployment patterns Design self-... 
    Work at office
    Local area
    Immediate start
    Work from home
    Flexible hours

    Lambda Corporation

    Bellevue, WA
    4 days ago
