Remote AI Performance Engineer: LLM Throughput & Latency

$100k - $150k

Bright Vision Technologies

Remote job

Bright Vision Technologies is seeking an AI Performance Optimization Engineer to enhance performance across training and inference workloads for large neural network systems. This role involves optimizing various AI processes and requires strong skills in Python and C++, along with a deep understanding of ML systems. This full-time remote position offers a competitive salary of $100K - $150K annually based on experience. The ideal candidate will have extensive experience in performance engineering and a solid grasp of GPU architecture.

Vacancy posted 4 days ago

Similar jobs that could be interesting for you
Based on the Remote AI Performance Engineer: LLM Throughput & Latency in East Brunswick, NJ vacancy

Senior AI / LLM Backend Engineer (Python)
...Senior AI / LLM Backend Engineer (Python) Publicis Sapient is a digital transformation partner helping... ...pipelines, testing, monitoring, and performance optimization. Collaborate with... ...AI systems at scale, including cost, latency, and reliability optimization. Familiarity...
Remote work
Performance
Prodigious Worldwide
United States
1 day ago
Senior AI Engineer
$180k - $250k
...Senior AI Engineer (LLMs / Agents) Boston, Massachusetts (Hybrid / Remote) $180,000 - $250,000 + Equity + Healthcare... ...production-grade LLM systems looking to... ...track model and agent performance *Optimize inference for latency and throughput using vLLM or TensorRT...
Remote work
Performance
Client Services
Boston, MA
20 hours ago
Generative AI Engineer
$120k - $170k
...Description Generative AI Engineer Location: Remote (U.S.) Salary Range:... ...Design and build LLM-powered applications, including... ..., and optimize for latency, cost, and performance Continuously... ...reliability, scalability, throughput, and cost efficiency...
Remote work
Performance
Prosum
Dallas, TX
20 hours ago
LLM/Prompt-Context Engineer - Fullstack Python (AI Agents, LangGraph, Context Engineering)
...LLM/Prompt-Context Engineer – Fullstack Python (AI Agents, LangGraph, Context Engineering) Location – 1st Atlanta... ...nd Dallas, 3rd Seattle (Onsite no remote). Onsite interview required We... ...personalization, to enhance agent performance. LLM Integration: Integrate,...
Remote work
Performance
Diversity Nexus
Dallas, TX
1 day ago
Senior AI Engineer
...Senior AI Engineer Location: Camden, NJ (Remote) Duration: Long Term Mandatory skill:... ...Kernel, or LlamaIndex. LLM Integration &... ...Model Serving; optimize latency, throughput, and cost. RAG & Knowledge... ...incident response. Performance & Cost Optimization: Optimize...
Remote work
Performance
Diverse Lynx
United States
3 days ago
Lead AI Engineer (FM Hosting, LLM Inference)
$197.3k - $225.1k
...Lead AI Engineer (FM Hosting, LLM Inference) Overview At Capital One, we are creating responsible... ...product experiences and scalable, high‑performance AI infrastructure. At Capital One,... ...performance – scalability, cost, latency, throughput – of large‑scale production AI...
Performance
Local area
Capital One National Association
New York, NY
20 hours ago
Senior Lead AI Engineer (LLM Gateway, FM Hosting)
$229.9k - $262.4k
...Senior Lead AI Engineer (LLM Gateway, FM Hosting) Overview: At Capital One, we are... ...experiences and scalable, high-performance AI infrastructure. At Capital One, you... ...the performance — scalability, cost, latency, throughput — of large scale production AI systems...
Performance
Full time
Part time
Local area
Capital One
McLean, VA
22 hours ago
Lead AI Engineer (FM Hosting, LLM Inference)
$197.3k - $225.1k
...responsible and reliable AI systems, changing... ...applied science and engineering teams to deliver our... ...experiences and scalable, high‑performance AI infrastructure. At... ...state‑of‑the‑art LLM optimization... ...— scalability, cost, latency, throughput — of large scale production...
Performance
Full time
Part time
Local area
Capital One National Association
McLean, VA
1 day ago
Lead AI Engineer (AI Foundations, LLM Customization and Finetuning)
$197.3k - $225.1k
...Lead AI Engineer (AI Foundations, LLM Customization and Finetuning) Overview At Capital One, we are creating... ...experiences and scalable, high-performance AI infrastructure. At Capital One,... ...performance — scalability, cost, latency, throughput — of large scale production AI...
Performance
Full time
Part time
Local area
Capital One
McLean, VA
1 day ago
Generative AI Engineer (LLM Expert)
...Generative AI Engineer (LLM Expert) BigRio is a Boston-based, remote-first technology consulting firm specializing in advanced data and software solutions.... ..., and optimize AI-powered applications with high-performance standards and robust infrastructure integration....
Remote work
Performance
Saviance
United States
3 days ago
AI Native Software Engineer
...Seeking a hands-on AI Native Software Engineer to design, build, and deploy production... ...and deploying AI/LLM-based systems in production... ...system-level trade-offs (performance, cost, latency, reliability)... ...system performance (latency, throughput, accuracy, cost) Debug...
Remote work
Performance
Rearc
United States
1 day ago
AI Performance Optimization Engineer
$100k - $150k
...AI Performance Optimization Engineer Bright Vision Technologies is a forward-thinking... .... Location: 100% Remote (Continental United... ...on extracting maximum throughput, minimizing latency, and reducing cost across... ...speculative decoding for LLM serving. Drive...
Remote work
Performance
Full time
H1b
Visa sponsorship
Work visa
Bright Vision Technologies
United States
1 day ago
AI Software Engineer
$151.8k
...will join a dynamic AI Infrastructure... ...on enabling high-performance AI across Zoom's... ...the boundary on latency, throughput, and cost. Responsibilities... ...(vLLM, TensorRT-LLM, SGLang, or... ..., Electrical Engineering, or a related technical... ...our offices and remote work environments...
Remote work
Performance
Work at office
Zoom Video Communications
Seattle, WA
3 days ago
Real-Time AI Inference Engineer Low-Latency LLM
$250k - $350k
...Seattle is looking for a Member of Technical Staff to optimize real-time AI model inference. The ideal candidate will have deep expertise in LLM inference optimization and will work on improving performance across their model stack. The compensation includes a base salary...
Performance
Full time
Nuance Labs, Inc.
Seattle, WA
4 days ago
AI Engineer
$99k - $225k
...Job Number: R0239413 AI Engineer The Opportunity:... ...deploying, and maintaining LLM-powered systems and complex... ...ownership of system performance, continuously optimizing for latency, throughput, and high reliability... ...during meetings. Remote : If this position is...
Remote work
Performance
Full time
Contract work
Part time
Work at office
Local area
Booz Allen Hamilton
Fort Belvoir, VA
3 days ago
Generative AI Engineer (LLM Expert)
...Job Title: Generative AI Engineer (LLM Expert) Location: Remote Employment Type: Part Time / Contract About BigRio BigRio is a Boston-... ...build, and optimize AI-powered applications with high-performance standards and robust infrastructure integration. Key...
Remote work
Performance
Contract work
Part time
Saviance
Boston, MA
20 hours ago
Senior AI Engineer
$209k
...Learning Platform Engineer Immigration... ...platform performance, scalability, and... ...utilization, and throughput. • Develop dashboards... ...such as latency, accuracy, and... ...high-performance LLM training GPU... ...resource-efficient AI workloads... ...our offices and remote work environments...
Remote work
Performance
Work at office
1 day per week
Zoom Video Communications
San Jose, CA
3 days ago
Lead AI Engineer (AI Foundations, LLM Customization and Finetuning)
$197.3k - $225.1k
...Lead AI Engineer (AI Foundations, LLM Customization and Finetuning) Overview At Capital One,... ...product experiences and scalable, high-performance AI infrastructure. At Capital One,... ...performance - scalability, cost, latency, throughput - of large scale production AI...
Performance
Full time
Part time
Local area
Capital One Financial Corp
Cambridge, MA
20 hours ago
Remote AI Engineer - LLM & LangChain Architect
Metaschool is seeking an AI Engineer to develop and implement LLM-powered applications using LangChain. The role... ...AI solutions and optimizing performance for cost-efficiency. Ideal candidates... ...equity, health insurance, and more in a remote environment.
Remote job
Performance
Democrance
New York, NY
4 days ago
Remote Clojure AI / LLM Engineer - Stylitics
We are looking for a versatile and experienced AI / LLM Data Engineer to join our team and help shape the future of how Stylitics leverages... ...design Help us build systems to easily monitor and test LLM performance Contribute to production code (Clojure + Python) Analyze...
Remote job
Performance
Work experience placement
WorksHub
New York, NY
4 days ago
AI Agent Engineer
$215k - $230k
...trajectory. The AI Engineering Team is chartered... ...pipelines, high-performance infrastructure, and... ...millisecond-level latency, and provide the... ...edge tools in the LLM and agent space —... ...team—onsite and remote—has full visibility... ..., optimizes team throughput with appropriate...
Remote work
Performance
Local area
Crypto Pro Network
San Francisco, CA
1 day ago
Principal Software Engineer - Large-Scale LLM Memory and Storage Systems
$272k - $425.5k
Principal Software Engineer – Large-Scale LLM Memory and Storage... ...Santa Clara: US, WA, Remote: US, MA, Remote time... ...Dynamo is a high-throughput, low-latency inference framework... ...serving generative AI and reasoning models... ...Built in Rust for performance and Python for extensibility...
Remote work
Performance
Local area
NVIDIA Corporation
Santa Clara, CA
2 days ago
Agentic AI Engineer
Agentic AI Engineer Location: Remote / Alexandria, VA Clearance: Eligibility to... ...the capabilities and performance of agentic systems within... ...engineering and fine tuning LLM models. Ability to design... ...models at scale in high‑throughput, low‑latency environments. Benefits...
Remote work
Performance
Work at office
Whitespace, Ltd.
Alexandria, VA
4 days ago
AI LLM Engineer
$93.68k - $128.81k
...Siemens Healthineers company, is hiring for an AI LLM Engineer. The role is designed to be onsite in Atlanta, Georgia with some remote/hybrid flexibility. Responsibilities AI /... ...Develop evaluation frameworks for prompt performance, answer quality, hallucination mitigation,...
Remote work
Performance
Temporary work
Local area
591x Varian Medical Systems, Inc.
Newport, RI
2 days ago
Senior AI Software Engineer (Agentic Systems & LLM Applications) | W2 Only | REMOTE |
...Overview We are seeking an experienced AI‑focused Software Engineer to design, build, and scale... ...development Hands‑on experience with LLM frameworks and orchestration tools (LangChain... ...Experience optimizing AI systems for performance and scalability Exposure to MLOps...
Remote job
Performance
Xlysi LLC.
Chicago, IL
3 days ago
AI Engineer - Model Performance
...Hybrid Department AI ABOUT FATHOM We... ...re hiring a Model Performance Engineer to own the speed,... ...GPU family's tail latency explodes at high concurrency... ...to think about throughput curves. Our team... ...experience with LLM serving frameworks... ...being fully remote. We schedule meetings...
Remote work
Performance
Full time
Pantera Capital
San Francisco, CA
11 hours ago
AI LLM Engineer
$93.68k - $128.81k
AI LLM Engineer Remote type: Hybrid Locations: ATL NP Time type... ...is hiring for an AI LLM Engineer. This role is ideal for experienced... ...Atlanta, Georgia with some remote/hybrid flexibility. What... ...evaluation frameworks for prompt performance, answer quality,...
Remote work
Performance
Temporary work
Work at office
Local area
Siemens Healthineers AG
Atlanta, GA
3 days ago
Senior AI Engineer - Agentic Systems and LLM
...Description Job Description Role: Senior AI Engineer - Agentic Systems and LLM Client Location: Mason, OH 100% Remote Job Description: We are... ...evaluation, reliability, and performance strategies (accuracy, latency, cost) Job Requirements - ~2/3 years...
Remote work
Performance
Vytwo
Prosper, TX
3 days ago
AI Product Engineer - Agentic AI Platforms (Financial Services)
...forefront of Generative AI innovation,... ...AI Product Engineer – Agentic Platforms... ...GenAI frameworks and LLM platforms –... ...for repeatability, performance tuning, and regression... .... Monitor cost, latency, throughput, and behavioral... ...work arrangements (remote and/or office-...
Remote work
Performance
Work at office
Flexible hours
Capgemini
Dallas, TX
1 day ago
Principal AI Engineer - Agent Ops / SRE
$168.4k - $220k
Principal Software Engineer - IE06GE The Hartford’s Applied AI COE Team is seeking a Principal... ...AI systems meet performance, latency, throughput, resiliency, recovery,... ...technologies. Experience with LLM orchestration... ...Arrangement Hybrid or remote work arrangement. Candidates...
Remote work
Performance
Temporary work
Work at office
3 days per week
The Hartford
Hartford, CT
20 hours ago
