LLM Inference Deployment Engineer
$180k - $240k
EnCharge AI
U.S.-Remote, Canada
EnCharge AI is a leader in advanced AI hardware and software systems for edge-to-cloud computing. EnCharge's robust and scalable next-generation in-memory computing technology provides orders-of-magnitude higher compute efficiency and density than today's best-in-class solutions. The high-performance architecture is coupled with seamless software integration and will make the immense potential of AI accessible in power-, energy-, and space-constrained applications. EnCharge AI launched in 2022 and is led by veteran technologists with backgrounds in semiconductor design and AI systems.
About the Role
EnCharge AI is seeking an LLM Inference Deployment Engineer to optimize, deploy, and scale large language models (LLMs) for high-performance inference on its energy-efficient AI accelerators. You will work at the intersection of AI frameworks, model optimization, and runtime execution to ensure efficient, low-latency AI inference.
Responsibilities
- Deploy and optimize LLMs (GPT, LLaMA, Mistral, Falcon, etc.) after training, sourcing models from libraries such as Hugging Face.
- Use inference runtimes such as ONNX Runtime and vLLM for efficient execution.
- Optimize batching, caching, and tensor parallelism to improve LLM scalability in real-time applications.
- Develop and maintain high-performance inference pipelines using Docker, Kubernetes, and inference servers.
Qualifications
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field.
- Experience in LLM inference deployment, model optimization, and runtime engineering.
- Strong expertise in LLM inference frameworks (PyTorch, ONNX Runtime, vLLM, TensorRT-LLM, DeepSpeed).
- In-depth knowledge of the Python programming language for model integration and performance tuning.
- Strong understanding of high-level model representations and experience implementing framework-level optimizations for generative AI use cases.
- Experience with containerized AI deployments (Docker, Kubernetes, Triton Inference Server, TensorFlow Serving, TorchServe).
- Strong knowledge of LLM memory optimization strategies for long-context applications.
- Experience with real-time LLM applications (chatbots, code generation, retrieval-augmented generation).
EnCharge AI is an equal employment opportunity employer in the United States.
The salary range for this position is $180,000 to $240,000 USD ($175,000 to $245,000 CAD) per year. Actual compensation offered will be determined based on job-related knowledge, skills, and experience.

