Real-Time AI Inference Engineer — Low-Latency LLM

$250k - $350k

Nuance Labs

Nuance Labs in Seattle is looking for a Member of Technical Staff to optimize real-time AI model inference. The ideal candidate will have deep expertise in LLM inference optimization and will work on improving performance across their model stack. The compensation includes a base salary ranging from $250,000 to $350,000 plus equity. The role is in-person full-time, and the company offers a competitive health plan, generous PTO, and daily meals.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Real-Time AI Inference Engineer — Low-Latency LLM in Seattle, WA vacancy
  • A technology company in Seattle is looking for an engineer to build and own a server-side real-time engine for an AI avatar system. This role requires strong Python skills, experience with real-time streaming systems, and the ability to design architecture and ship features... 
    Suggested

    Nuance Labs

    Seattle, WA
    2 days ago
  • $242k - $290k

     ...Optimization & Deployment Engineer The Perception...  ...concurrent inference code to ensure real-time, deterministic...  ...accuracy recovery, and latency benchmarking...  ...bandwidth on AI accelerators....  ...production-level, low latency, and memory...  ...(e.g., TensorRT-LLM). $242,000 -... 
    Suggested
    Temporary work
    Relocation package

    Zoox

    Seattle, WA
    3 days ago
  • $150k - $160k

    AI Engineer - Responsible AI...  ...Remote Work (USA), time type: Full time...  ...strategies for LLM jailbreaks (prompt...  ...of requests, with real-time monitoring, alerting...  ...mechanisms as low-latency services handling...  ...• MLOps: Triton Inference Server, Weights &... 
    Suggested
    Remote work

    Centific Global Solutions, Inc.

    Seattle, WA
    3 days ago
  • $160k - $180k

    Read AI, Inc. is seeking a skilled iOS/Android Software Engineer to lead the development of mobile applications that capture audio data and provide AI-generated...  ...development for iOS and Android, with a strong emphasis on real-time audio processing capabilities. The position offers... 
    Suggested

    Read AI, Inc.

    Seattle, WA
    1 day ago
  • $150.3k - $270.5k

     ...Architect and deliver AI systems end‑to‑end....  ...specifications for low‑latency APIs/services that...  .... Driving engineering excellence by participating...  ...applying them in real solutions. Experience...  ...advanced LLM solution patterns (...  ...programs; generous paid time off; tuition assistance... 
    Suggested
    Flexible hours

    Premera

    Mountlake Terrace, WA
    4 days ago
  • $151.8k

     ...will join a dynamic AI Infrastructure...  ..., deployment, and inference at scale, driving...  ...in areas such as real-time communication, computer...  ...the boundary on latency, throughput, and cost...  ...(vLLM, TensorRT-LLM, SGLang, or equivalent...  ..., Electrical Engineering, or a related technical... 
    Work at office
    Remote work

    Zoom Corporation

    Seattle, WA
    3 days ago
  •  ...Staff+ Software Engineer, Inference Runtime Remote-Friendly...  ..., and steerable AI systems. We want...  ..., and who gets real satisfaction from...  ...'s expansion cost low by ensuring new models...  ...profiling, latency and throughput optimization...  ...rates, release times, latency, or... 
    Work at office
    Remote work
    Visa sponsorship
    Flexible hours

    Anthropic

    Seattle, WA
    1 day ago
  • $172.5k - $260.1k

     ...Category Software Engineering Job Details...  ...is the #1 AI CRM, where humans...  ...performance over time. This is an...  ...to improve from real-world outcomes...  ...agents that combine LLM reasoning, tool...  ..., and inference Transform raw...  ...revenue impact, latency, etc.) Use production... 

    Salesforce.Com Inc

    Seattle, WA
    4 days ago
  • $177.1k

     ...and product-driven AI Agent Engineer to design, build,...  ...Models (LLMs) and real-world business systems...  ...capability, latency, and operational cost...  ...for low latency, high throughput...  ...Have expertise in LLM and agent mechanisms...  ...application - take your time to ensure it's a... 
    Work at office
    Remote work

    Zoom Video Communications

    Seattle, WA
    13 hours ago
  •  ...Tech Lead, Data & Inference Engineer Seattle, Washington, United...  ...vertical in Applied AI, Machine Learning, and...  ...Work type: Full Time Compensation: above...  ...interfaces into trusted and low latency systems. Take full...  ...and support for both real time and batch oriented... 
    Full time

    Catalyst Labs, LLC

    Seattle, WA
    2 days ago
  •  ...challenges in modern AI workflows:...  ...teams to spend more time steering AI than actually...  ...systems that power real user experiences across...  ...ideas into working LLM systems and...  ...delivers precise, low-latency context to user workflows...  ...For Strong Python engineering fundamentals —... 

    Symmetry AI

    Seattle, WA
    1 day ago
  • $171.6k - $302.2k

     ...and shape how AI fundamentally transforms...  ..., all the time, through...  ...accelerate our Data Engineering and Data...  ...performance and inference-quality efficiency...  ...decisions, optimizing latency, throughput and...  ...+ years taking LLM or agentic...  ...Kafka Streams) for real‑time data and... 
    Worldwide
    Relocation

    Apple Inc.

    Seattle, WA
    1 day ago
  • Lead AI Engineer - Salesforce Lead AI Engineer at...  ...systems to improve from real‑world outcomes...  ...agents that combine LLM reasoning, tool usage...  ...and near real‑time) for training, evaluation, and inference Transform raw interaction...  ..., revenue impact, latency, etc.) Use... 

    salesforce.com, inc.

    Seattle, WA
    1 day ago
  • $168k - $252k

    Senior Software Engineer - Real Time Systems Develop real-time software systems for autonomous vehicles...  ...of systems is powered by Lattice OS, an AI-powered operating system that turns...  ...engineering school, etc. If you've succeeded in a low structure, high autonomy environment you... 
    Full time
    Work experience placement
    Local area
    Relocation package

    jobs.frontdoordefense.com - Jobboard

    Seattle, WA
    4 days ago
  • $140k - $150k

     ...Centific is a frontier AI data foundry...  ...and engineers. We harness the...  ...) Type: Full‐time. Build the Future...  ...used to transform real-world spaces into...  ...on‐device or edge inference. Robust multi‐modal...  ...optical flow. VLM / LLM: Vision...  ...a product KPI (latency, accuracy, robustness... 
    Full time
    Remote work

    Centific Global Solutions, Inc.

    Seattle, WA
    4 days ago
  • $160k - $230k

     ...Senior Software Engineer — LLM Post-Training Platform...  ...this new era, we seek AI-native thinkers...  ...impact. We look for low-ego individuals who...  ...-node training and inference, fault tolerance, and...  ...infra skills with real post-training...  ...paid holidays; paid time off; parental leave... 
    Flexible hours

    Streamlit

    Bellevue, WA
    1 day ago
  • $99.6k - $234.6k

     ...The Principal AI Agent / ML Software Engineer is a Senior Staff-level...  ...workflows, scalable inference infrastructure, and...  ...services optimized for low latency, high throughput, GPU...  ...Deep understanding of LLM application patterns...  ...match 8. Paid time off: Flexible Vacation... 
    Temporary work
    Flexible hours

    Oracle

    Seattle, WA
    3 days ago
  •  ...startup on a mission to reinvent AI inference infrastructure from the...  ...Inference Infrastructure Software Engineer to own and evolve the cloud...  ..., with strong SLAs around latency, throughput, and availability...  ...paid by employer). Flexible Time Off (FTO). Paid parental leave... 
    Work at office
    Flexible hours
    3 days per week

    ElastixAI Inc.

    Seattle, WA
    26 days ago
  • $236k - $339.25k

     ...this new era, we seek AI-native thinkers across...  ...your impact. We look for low-ego individuals who thrive...  ...machine learning and LLM workloads. Join us to...  ...in serving LLMs using inference engines like vLLM, TensorRT-LLM...  ...in building batch and real-time ML serving systems preferred... 
    Flexible hours

    Streamlit

    Bellevue, WA
    3 days ago
  • $131.9k - $237.4k

     ...have the opportunity to drive real change by transforming...  ...Healthsource blog: . As an AI Engineer IV , you'll contribute leadership...  ...Develop specifications for low latency APIs and services necessary to...  ...name a few. Generous paid time off to reenergize. Looking... 
    Remote work
    Work from home

    Premera Blue Cross

    Mountlake Terrace, WA
    3 days ago
  • $135.6k - $230.5k

     ...have the opportunity to drive real change by transforming...  ...our Healthsource blog: . AI Engineer III As an AI Engineer III,...  ...Develop specifications for low latency APIs and services necessary to...  ...name a few. Generous paid time off to reenergize. Looking... 

    Premera Blue Cross

    Mountlake Terrace, WA
    2 days ago
  • $231k

     ...Expedia Group is using AI to re‑invent how we do engineering to deliver hyper‑...  ...travel experiences in real‑time. The Customer Data...  ...including detailed low‑level designs, API contracts...  ...throughput, low latency services. Proven...  ...stores, and online inference systems. Expertise... 
    Immediate start

    Expedia, Inc.

    Seattle, WA
    3 days ago
  • $148.5k - $313.7k

     ...Category Software Engineering Job Details...  ...is the #1 AI CRM, where humans...  ...training, deployment, inference, and monitoring...  ...strategies, real time feature serving...  ...for the low level systems that...  ...platform supports LLM efficiency and...  ...throughput, latency sensitive workloads... 
    Temporary work

    Salesforce

    Seattle, WA
    6 days ago
  • $185k - $385k

     ...important to us than unfettered growth. The Engineering team manages a massive fleet of GPUs...  ...development and has experience with Real-Time Communication (RTC). Candidates should have...  .... About OpenAI OpenAI is an AI research and deployment company dedicated... 

    OpenAI

    Seattle, WA
    4 days ago
  • $118.7k - $232.7k

     ...Intelligence builds data and AI-powered products...  ...design and develop LLM-powered...  ...tool calling, prompt engineering, structured data grounding...  ...systems that solve real customer problems,...  ...the accuracy, latency, cost, and trustworthiness...  ...groups Paid time off for individual... 
    Full time
    Immediate start
    Flexible hours

    Ford Motor Company

    Bellevue, WA
    2 days ago
  • Rivet Industries, Inc. is seeking an early/mid-career XR Software Engineer to work on real-time 3D applications for embedded Linux and Android devices. In this role, you'll collaborate with senior engineers on C# and Unity projects, focusing on developing and shipping scoped... 

    Rivet Industries, Inc.

    Bellevue, WA
    13 hours ago
  • A tech company specializing in AR and VR is seeking a Senior Augmented and Virtual Reality Software Engineer to create real-time 3D and XR applications. The ideal candidate will have extensive experience in building interactive applications using C# and C++, and will work... 

    Rivet Industries, Inc.

    Bellevue, WA
    2 days ago
  • $217k - $307k

    Zoox is seeking a Performance Software Engineer in Seattle, WA to analyze and optimize performance-critical algorithms for their advanced...  ...demands strong knowledge of C++ and proficiency in debugging real-time systems. Compensation ranges from $217,000 to $307,000... 

    Zoox

    Seattle, WA
    1 day ago
  • A technology company in the United States is seeking an Augmented and Virtual Reality Software Engineer to build interfaces for real-time 3D and XR applications. This role involves creating frontend systems using C#, C++, and frameworks like Unity or StereoKit, while leveraging... 

    Rivet Industries, Inc.

    Bellevue, WA
    2 days ago
  • Description About Slack AI Slack AI's...  ..., deployment, inference, and monitoring...  ...strategies, real time feature serving...  ...responsible for the low level systems...  ...supports LLM efficiency and...  ...infrastructure and product engineering teams. About...  ...throughput, latency sensitive... 
    Temporary work

    B Capital

    Seattle, WA
    13 hours ago
