Remote AI Performance Engineer: LLM Throughput & Latency

$100k - $150k

Bright Vision Technologies

Bright Vision Technologies is seeking an AI Performance Optimization Engineer to enhance performance across training and inference workloads for large neural network systems. This role involves optimizing various AI processes and requires strong skills in Python and C++, along with a deep understanding of ML systems. This full-time remote position offers a competitive salary of $100K - $150K annually based on experience. The ideal candidate will have extensive experience in performance engineering and a solid grasp of GPU architecture.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Remote AI Performance Engineer: LLM Throughput & Latency in East Brunswick, NJ vacancy
  •  ...Senior AI / LLM Backend Engineer (Python) Publicis Sapient is a digital transformation partner helping...  ...pipelines, testing, monitoring, and performance optimization. Collaborate with...  ...AI systems at scale, including cost, latency, and reliability optimization. Familiarity... 
    Remote work
    Performance

    Prodigious Worldwide

    United States
    1 day ago
  • $180k - $250k

     ...Senior AI Engineer (LLMs / Agents) Boston, Massachusetts (Hybrid / Remote) $180,000 - $250,000 + Equity + Healthcare...  ...production-grade LLM systems looking to...  ...track model and agent performance *Optimize inference for latency and throughput using vLLM or TensorRT... 
    Remote work
    Performance

    Client Services

    Boston, MA
    20 hours ago
  • $120k - $170k

     ...Description Generative AI Engineer Location: Remote (U.S.) Salary Range:...  ...Design and build LLM-powered applications, including...  ..., and optimize for latency, cost, and performance Continuously...  ...reliability, scalability, throughput, and cost efficiency... 
    Remote work
    Performance

    Prosum

    Dallas, TX
    20 hours ago
  •  ...LLM/Prompt-Context Engineer – Fullstack Python (AI Agents, LangGraph, Context Engineering) Location – 1st Atlanta...  ...nd Dallas, 3rd Seattle (Onsite no remote). Onsite interview required We...  ...personalization, to enhance agent performance. LLM Integration: Integrate,... 
    Remote work
    Performance

    Diversity Nexus

    Dallas, TX
    1 day ago
  •  ...Senior AI Engineer Location: Camden, NJ (Remote) Duration: Long Term Mandatory skill:...  ...Kernel, or LlamaIndex. LLM Integration &...  ...Model Serving; optimize latency, throughput, and cost. RAG & Knowledge...  ...incident response. Performance & Cost Optimization: Optimize... 
    Remote work
    Performance

    Diverse Lynx

    United States
    3 days ago
  • $197.3k - $225.1k

     ...Lead AI Engineer (FM Hosting, LLM Inference) Overview At Capital One, we are creating responsible...  ...product experiences and scalable, high‑performance AI infrastructure. At Capital One,...  ...performance – scalability, cost, latency, throughput – of large‑scale production AI... 
    Performance
    Local area

    Capital One National Association

    New York, NY
    20 hours ago
  • $229.9k - $262.4k

     ...Senior Lead AI Engineer (LLM Gateway, FM Hosting) Overview: At Capital One, we are...  ...experiences and scalable, high-performance AI infrastructure. At Capital One, you...  ...the performance — scalability, cost, latency, throughput — of large scale production AI systems... 
    Performance
    Full time
    Part time
    Local area

    Capital One

    McLean, VA
    22 hours ago
  • $197.3k - $225.1k

     ...responsible and reliable AI systems, changing...  ...applied science and engineering teams to deliver our...  ...experiences and scalable, high‑performance AI infrastructure. At...  ...state‑of‑the‑art LLM optimization...  ...— scalability, cost, latency, throughput — of large scale production... 
    Performance
    Full time
    Part time
    Local area

    Capital One National Association

    McLean, VA
    1 day ago
  • $197.3k - $225.1k

     ...Lead AI Engineer (AI Foundations, LLM Customization and Finetuning) Overview At Capital One, we are creating...  ...experiences and scalable, high-performance AI infrastructure. At Capital One,...  ...performance — scalability, cost, latency, throughput — of large scale production AI... 
    Performance
    Full time
    Part time
    Local area

    Capital One

    McLean, VA
    1 day ago
  •  ...Generative AI Engineer (LLM Expert) BigRio is a Boston-based, remote-first technology consulting firm specializing in advanced data and software solutions....  ..., and optimize AI-powered applications with high-performance standards and robust infrastructure integration.... 
    Remote work
    Performance

    Saviance

    United States
    3 days ago
  •  ...Seeking a hands-on AI Native Software Engineer to design, build, and deploy production...  ...and deploying AI/LLM-based systems in production...  ...system-level trade-offs (performance, cost, latency, reliability)...  ...system performance (latency, throughput, accuracy, cost) Debug... 
    Remote work
    Performance

    Rearc

    United States
    1 day ago
  • $100k - $150k

     ...AI Performance Optimization Engineer Bright Vision Technologies is a forward-thinking...  .... Location: 100% Remote (Continental United...  ...on extracting maximum throughput, minimizing latency, and reducing cost across...  ...speculative decoding for LLM serving. Drive... 
    Remote work
    Performance
    Full time
    H1b
    Visa sponsorship
    Work visa

    Bright Vision Technologies

    United States
    1 day ago
  • $151.8k

     ...will join a dynamic AI Infrastructure...  ...on enabling high-performance AI across Zoom's...  ...the boundary on latency, throughput, and cost. Responsibilities...  ...(vLLM, TensorRT-LLM, SGLang, or...  ..., Electrical Engineering, or a related technical...  ...our offices and remote work environments... 
    Remote work
    Performance
    Work at office

    Zoom Video Communications

    Seattle, WA
    3 days ago
  • $250k - $350k

     ...Seattle is looking for a Member of Technical Staff to optimize real-time AI model inference. The ideal candidate will have deep expertise in LLM inference optimization and will work on improving performance across their model stack. The compensation includes a base salary... 
    Performance
    Full time

    Nuance Labs, Inc.

    Seattle, WA
    4 days ago
  • $99k - $225k

     ...Job Number: R0239413 AI Engineer The Opportunity:...  ...deploying, and maintaining LLM-powered systems and complex...  ...ownership of system performance, continuously optimizing for latency, throughput, and high reliability...  ...during meetings. Remote : If this position is... 
    Remote work
    Performance
    Full time
    Contract work
    Part time
    Work at office
    Local area

    Booz Allen Hamilton

    Fort Belvoir, VA
    3 days ago
  •  ...Job Title: Generative AI Engineer (LLM Expert) Location: Remote Employment Type: Part Time / Contract About BigRio BigRio is a Boston-...  ...build, and optimize AI-powered applications with high-performance standards and robust infrastructure integration. Key... 
    Remote work
    Performance
    Contract work
    Part time

    Saviance

    Boston, MA
    20 hours ago
  • $209k

     ...Learning Platform Engineer Immigration...  ...platform performance, scalability, and...  ...utilization, and throughput. • Develop dashboards...  ...such as latency, accuracy, and...  ...high-performance LLM training GPU...  ...resource-efficient AI workloads...  ...our offices and remote work environments... 
    Remote work
    Performance
    Work at office
    1 day per week

    Zoom Video Communications

    San Jose, CA
    3 days ago
  • $197.3k - $225.1k

     ...Lead AI Engineer (AI Foundations, LLM Customization and Finetuning) Overview At Capital One,...  ...product experiences and scalable, high-performance AI infrastructure. At Capital One,...  ...performance - scalability, cost, latency, throughput - of large scale production AI... 
    Performance
    Full time
    Part time
    Local area

    Capital One Financial Corp

    Cambridge, MA
    20 hours ago
  • Metaschool is seeking an AI Engineer to develop and implement LLM-powered applications using LangChain. The role...  ...AI solutions and optimizing performance for cost-efficiency. Ideal candidates...  ...equity, health insurance, and more in a remote environment... 
    Remote job
    Performance

    Democrance

    New York, NY
    4 days ago
  • We are looking for a versatile and experienced AI / LLM Data Engineer to join our team and help shape the future of how Stylitics leverages...  ...design Help us build systems to easily monitor and test LLM performance Contribute to production code (Clojure + Python) Analyze... 
    Remote job
    Performance
    Work experience placement

    WorksHub

    New York, NY
    4 days ago
  • $215k - $230k

     ...trajectory. The AI Engineering Team is chartered...  ...pipelines, high-performance infrastructure, and...  ...millisecond-level latency, and provide the...  ...edge tools in the LLM and agent space —...  ...team—onsite and remote—has full visibility...  ..., optimizes team throughput with appropriate... 
    Remote work
    Performance
    Local area

    Crypto Pro Network

    San Francisco, CA
    1 day ago
  • $272k - $425.5k

    Principal Software Engineer – Large-Scale LLM Memory and Storage...  ...Santa Clara: US, WA, Remote: US, MA, Remote...  ...Dynamo is a high-throughput, low-latency inference framework...  ...serving generative AI and reasoning models...  ...Built in Rust for performance and Python for extensibility... 
    Remote work
    Performance
    Local area

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • Agentic AI Engineer Location: Remote / Alexandria, VA Clearance: Eligibility to...  ...the capabilities and performance of agentic systems within...  ...engineering and fine tuning LLM models. Ability to design...  ...models at scale in high‑throughput, low‑latency environments. Benefits... 
    Remote work
    Performance
    Work at office

    Whitespace, Ltd.

    Alexandria, VA
    4 days ago
  • $93.68k - $128.81k

     ...Siemens Healthineers company, is hiring for an AI LLM Engineer. The role is designed to be onsite in Atlanta, Georgia with some remote/hybrid flexibility. Responsibilities AI /...  ...Develop evaluation frameworks for prompt performance, answer quality, hallucination mitigation,... 
    Remote work
    Performance
    Temporary work
    Local area

    591x Varian Medical Systems, Inc.

    Newport, RI
    2 days ago
  •  ...Overview We are seeking an experienced AI‑focused Software Engineer to design, build, and scale...  ...development Hands‑on experience with LLM frameworks and orchestration tools (LangChain...  ...Experience optimizing AI systems for performance and scalability Exposure to MLOps... 
    Remote job
    Performance

    Xlysi LLC.

    Chicago, IL
    3 days ago
  •  ...Hybrid Department AI ABOUT FATHOM We...  ...re hiring a Model Performance Engineer to own the speed,...  ...GPU family's tail latency explodes at high concurrency...  ...to think about throughput curves. Our team...  ...experience with LLM serving frameworks...  ...being fully remote. We schedule meetings... 
    Remote work
    Performance
    Full time

    Pantera Capital

    San Francisco, CA
    11 hours ago
  • $93.68k - $128.81k

    AI LLM Engineer. Remote type: Hybrid. Locations: ATL NP...  ...is hiring for an AI LLM Engineer. This role is ideal for experienced...  ...Atlanta, Georgia with some remote/hybrid flexibility. What...  ...evaluation frameworks for prompt performance, answer quality,... 
    Remote work
    Performance
    Temporary work
    Work at office
    Local area

    Siemens Healthineers AG

    Atlanta, GA
    3 days ago
  •  ...Description Job Description Role: Senior AI Engineer - Agentic Systems and LLM Client Location: Mason, OH 100% Remote Job Description: We are...  ...evaluation, reliability, and performance strategies (accuracy, latency, cost) Job Requirements - ~2/3 years... 
    Remote work
    Performance

    Vytwo

    Prosper, TX
    3 days ago
  •  ...forefront of Generative AI innovation,...  ...AI Product Engineer – Agentic Platforms...  ...GenAI frameworks and LLM platforms –...  ...for repeatability, performance tuning, and regression...  .... Monitor cost, latency, throughput, and behavioral...  ...work arrangements (remote and/or office-... 
    Remote work
    Performance
    Work at office
    Flexible hours

    Capgemini

    Dallas, TX
    1 day ago
  • $168.4k - $220k

    Principal Software Engineer - IE06GE The Hartford’s Applied AI COE Team is seeking a Principal...  ...AI systems meet performance, latency, throughput, resiliency, recovery,...  ...technologies. Experience with LLM orchestration...  ...Arrangement Hybrid or remote work arrangement. Candidates... 
    Remote work
    Performance
    Temporary work
    Work at office
    3 days per week

    The Hartford

    Hartford, CT
    20 hours ago
