Remote AI Performance Engineer: LLM Throughput & Latency
$100k - $150k
Bright Vision Technologies
- Remote job
Bright Vision Technologies is seeking an AI Performance Optimization Engineer to enhance performance across training and inference workloads for large neural network systems. This role involves optimizing various AI processes and requires strong skills in Python and C++, along with a deep understanding of ML systems. This full-time remote position offers a competitive salary of $100K - $150K annually based on experience. The ideal candidate will have extensive experience in performance engineering and a solid grasp of GPU architecture.
- ...Senior AI / LLM Backend Engineer (Python) Publicis Sapient is a digital transformation partner helping... ...pipelines, testing, monitoring, and performance optimization. Collaborate with... ...AI systems at scale, including cost, latency, and reliability optimization. Familiarity...
Remote work, Performance
$180k - $250k
...Senior AI Engineer (LLMs / Agents) Boston, Massachusetts (Hybrid / Remote) $180,000 - $250,000 + Equity + Healthcare... ...production-grade LLM systems looking to... ...track model and agent performance. Optimize inference for latency and throughput using vLLM or TensorRT...
Remote work, Performance
$120k - $170k
...Description Generative AI Engineer Location: Remote (U.S.) Salary Range:... ...Design and build LLM-powered applications, including... ..., and optimize for latency, cost, and performance Continuously... ...reliability, scalability, throughput, and cost efficiency...
Remote work, Performance
- ...LLM/Prompt-Context Engineer – Fullstack Python (AI Agents, LangGraph, Context Engineering) Location – 1st Atlanta... ...nd Dallas, 3rd Seattle (Onsite no remote). Onsite interview required We... ...personalization, to enhance agent performance. LLM Integration: Integrate,...
Remote work, Performance
- ...Senior AI Engineer Location: Camden, NJ (Remote) Duration: Long Term Mandatory skill:... ...Kernel, or LlamaIndex. LLM Integration &... ...Model Serving; optimize latency, throughput, and cost. RAG & Knowledge... ...incident response. Performance & Cost Optimization: Optimize...
Remote work, Performance
$197.3k - $225.1k
...Lead AI Engineer (FM Hosting, LLM Inference) Overview At Capital One, we are creating responsible... ...product experiences and scalable, high‑performance AI infrastructure. At Capital One,... ...performance – scalability, cost, latency, throughput – of large‑scale production AI...
Performance, Local area
$229.9k - $262.4k
...Senior Lead AI Engineer (LLM Gateway, FM Hosting) Overview: At Capital One, we are... ...experiences and scalable, high-performance AI infrastructure. At Capital One, you... ...the performance - scalability, cost, latency, throughput - of large scale production AI systems...
Performance, Full time, Part time, Local area
$197.3k - $225.1k
...responsible and reliable AI systems, changing... ...applied science and engineering teams to deliver our... ...experiences and scalable, high‑performance AI infrastructure. At... ...state‑of‑the‑art LLM optimization... ...- scalability, cost, latency, throughput - of large scale production...
Performance, Full time, Part time, Local area
$197.3k - $225.1k
...Lead AI Engineer (AI Foundations, LLM Customization and Finetuning) Overview At Capital One, we are creating... ...experiences and scalable, high-performance AI infrastructure. At Capital One,... ...performance - scalability, cost, latency, throughput - of large scale production AI...
Performance, Full time, Part time, Local area
- ...Generative AI Engineer (LLM Expert) BigRio is a Boston-based, remote-first technology consulting firm specializing in advanced data and software solutions.... ..., and optimize AI-powered applications with high-performance standards and robust infrastructure integration....
Remote work, Performance
- ...Seeking a hands-on AI Native Software Engineer to design, build, and deploy production... ...and deploying AI/LLM-based systems in production... ...system-level trade-offs (performance, cost, latency, reliability)... ...system performance (latency, throughput, accuracy, cost) Debug...
Remote work, Performance
$100k - $150k
...AI Performance Optimization Engineer Bright Vision Technologies is a forward-thinking... .... Location: 100% Remote (Continental United... ...on extracting maximum throughput, minimizing latency, and reducing cost across... ...speculative decoding for LLM serving. Drive...
Remote work, Performance, Full time, H1b, Visa sponsorship, Work visa
$151.8k
...will join a dynamic AI Infrastructure... ...on enabling high-performance AI across Zoom's... ...the boundary on latency, throughput, and cost. Responsibilities... ...(vLLM, TensorRT-LLM, SGLang, or... ..., Electrical Engineering, or a related technical... ...our offices and remote work environments...
Remote work, Performance, Work at office
$250k - $350k
...Seattle is looking for a Member of Technical Staff to optimize real-time AI model inference. The ideal candidate will have deep expertise in LLM inference optimization and will work on improving performance across their model stack. The compensation includes a base salary...
Performance, Full time
$99k - $225k
...Job Number: R0239413 AI Engineer The Opportunity:... ...deploying, and maintaining LLM-powered systems and complex... ...ownership of system performance, continuously optimizing for latency, throughput, and high reliability... ...during meetings. Remote: If this position is...
Remote work, Performance, Full time, Contract work, Part time, Work at office, Local area
- ...Job Title: Generative AI Engineer (LLM Expert) Location: Remote Employment Type: Part Time / Contract About BigRio BigRio is a Boston-... ...build, and optimize AI-powered applications with high-performance standards and robust infrastructure integration. Key...
Remote work, Performance, Contract work, Part time
$209k
...Learning Platform Engineer Immigration... ...platform performance, scalability, and... ...utilization, and throughput. • Develop dashboards... ...such as latency, accuracy, and... ...high-performance LLM training GPU... ...resource-efficient AI workloads... ...our offices and remote work environments...
Remote work, Performance, Work at office, 1 day per week
$197.3k - $225.1k
...Lead AI Engineer (AI Foundations, LLM Customization and Finetuning) Overview At Capital One,... ...product experiences and scalable, high-performance AI infrastructure. At Capital One,... ...performance - scalability, cost, latency, throughput - of large scale production AI...
Performance, Full time, Part time, Local area
- Metaschool is seeking an AI Engineer to develop and implement LLM-powered applications using LangChain. The role... ...AI solutions and optimizing performance for cost-efficiency. Ideal candidates... ...equity, health insurance, and more in a remote environment...
Remote job, Performance
- We are looking for a versatile and experienced AI / LLM Data Engineer to join our team and help shape the future of how Stylitics leverages... ...design Help us build systems to easily monitor and test LLM performance Contribute to production code (Clojure + Python) Analyze...
Remote job, Performance, Work experience placement
$215k - $230k
...trajectory. The AI Engineering Team is chartered... ...pipelines, high-performance infrastructure, and... ...millisecond-level latency, and provide the... ...edge tools in the LLM and agent space -... ...team - onsite and remote - has full visibility... ..., optimizes team throughput with appropriate...
Remote work, Performance, Local area
$272k - $425.5k
Principal Software Engineer – Large-Scale LLM Memory and Storage... ...Santa Clara: US, WA, Remote: US, MA, Remote time... ...Dynamo is a high-throughput, low-latency inference framework... ...serving generative AI and reasoning models... ...Built in Rust for performance and Python for extensibility...
Remote work, Performance, Local area
- Agentic AI Engineer Location: Remote / Alexandria, VA Clearance: Eligibility to... ...the capabilities and performance of agentic systems within... ...engineering and fine tuning LLM models. Ability to design... ...models at scale in high‑throughput, low‑latency environments. Benefits...
Remote work, Performance, Work at office
$93.68k - $128.81k
...Siemens Healthineers company, is hiring for an AI LLM Engineer. The role is designed to be onsite in Atlanta, Georgia with some remote/hybrid flexibility. Responsibilities AI /... ...Develop evaluation frameworks for prompt performance, answer quality, hallucination mitigation,...
Remote work, Performance, Temporary work, Local area
- ...Overview We are seeking an experienced AI‑focused Software Engineer to design, build, and scale... ...development Hands‑on experience with LLM frameworks and orchestration tools (LangChain... ...Experience optimizing AI systems for performance and scalability Exposure to MLOps...
Remote job, Performance
- ...Hybrid Department AI ABOUT FATHOM We... ...re hiring a Model Performance Engineer to own the speed,... ...GPU family's tail latency explodes at high concurrency... ...to think about throughput curves. Our team... ...experience with LLM serving frameworks... ...being fully remote. We schedule meetings...
Remote work, Performance, Full time
$93.68k - $128.81k
AI LLM Engineer remote type: Hybrid locations: ATL NP time type... ...is hiring for an AI LLM Engineer. This role is ideal for experienced... ...Atlanta, Georgia with some remote/hybrid flexibility What... ...evaluation frameworks for prompt performance, answer quality,...
Remote work, Performance, Temporary work, Work at office, Local area
- ...Description Job Description Role: Senior AI Engineer - Agentic Systems and LLM Client Location: Mason, OH 100% Remote Job Description: We are... ...evaluation, reliability, and performance strategies (accuracy, latency, cost) Job Requirements - ~2/3 years...
Remote work, Performance
- ...forefront of Generative AI innovation,... ...AI Product Engineer – Agentic Platforms... ...GenAI frameworks and LLM platforms –... ...for repeatability, performance tuning, and regression... .... Monitor cost, latency, throughput, and behavioral... ...work arrangements (remote and/or office-...
Remote work, Performance, Work at office, Flexible hours
$168.4k - $220k
Principal Software Engineer - IE06GE The Hartford’s Applied AI COE Team is seeking a Principal... ...AI systems meet performance, latency, throughput, resiliency, recovery,... ...technologies. Experience with LLM orchestration... ...Arrangement Hybrid or remote work arrangement. Candidates...
Remote work, Performance, Temporary work, Work at office, 3 days per week


