Real-Time AI Inference Engineer Low-Latency LLM

$250k - $350k

Nuance Labs, Inc.

Nuance Labs in Seattle is looking for a Member of Technical Staff to optimize real-time AI model inference. The ideal candidate will have deep expertise in LLM inference optimization and will work on improving performance across the company's model stack. Compensation includes a base salary of $250,000 to $350,000 plus equity. The role is full-time and in-person, and the company offers a competitive health plan, generous PTO, and daily meals.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Real-Time AI Inference Engineer Low-Latency LLM vacancy in Seattle, WA
  • SpaceX is looking for a Software Engineer specializing in Low Latency Computing for their Starlink program, based in Seattle. You will develop reliable, real-time software for a global satellite-based network to enhance connectivity for underserved communities. The ideal... 
    Suggested

    SupportFinity™

    Seattle, WA
    4 days ago
  •  ...Optimization & Deployment Engineer, you will focus...  ...concurrent inference code to ensure real-time, deterministic...  ...accuracy recovery, and latency benchmarking...  ...memory bandwidth on AI accelerators....  ...production-level, low latency, and memory...  ...(e.g., TensorRT-LLM). Base Salary... 
    Suggested
    Temporary work
    Relocation package

    Zoox

    Seattle, WA
    15 days ago
  • $125k - $145k

    Software Engineer, Low Latency Computing (Starlink) SpaceX | Posted Mar 14 | Full-time | Seattle SpaceX was founded under the belief that a future where humanity is out...  ...the feedback loop between software design and real‑world performance. In this role, your software... 
    Suggested
    Permanent employment
    Full time
    Temporary work
    Internship
    Work at office
    Worldwide
    Monday to Friday
    Weekend work

    SupportFinity™

    Seattle, WA
    5 days ago
  • $150.3k - $270.5k

     ...Architect and deliver AI systems end‑to‑end....  ...specifications for low‑latency APIs/services that...  .... Driving engineering excellence by participating...  ...applying them in real solutions. Experience...  ...advanced LLM solution patterns (...  ...programs; generous paid time off; tuition assistance... 
    Suggested
    Flexible hours

    Premera

    Mountlake Terrace, WA
    4 days ago
  •  ...About Job Role: AI Engineer - Responsible AI...  ...Remote Type: Full-time Build the...  ...frontiers of AI safety, LLM jailbreak...  ...of requests, with real-time monitoring, alerting...  ...mechanisms as low-latency services handling...  ...• MLOps: Triton Inference Server, Weights &... 
    Suggested
    Full time
    Remote work

    International Recruiting LLC

    Seattle, WA
    more than 2 months ago
  • $151.8k

     ...will join a dynamic AI Infrastructure...  ..., deployment, and inference at scale, driving...  ...in areas such as real-time communication, computer...  ...the boundary on latency, throughput, and cost...  ...(vLLM, TensorRT-LLM, SGLang, or equivalent...  ..., Electrical Engineering, or a related technical... 
    Work at office
    Remote work

    Zoom Video Communications

    Seattle, WA
    3 days ago
  •  ...challenges in modern AI workflows:...  ...teams to spend more time steering AI than actually...  ...systems that power real user experiences across...  ...ideas into working LLM systems and...  ...delivers precise, low-latency context to user workflows...  ...For Strong Python engineering fundamentals —skilled... 

    Symmetry AI

    Seattle, WA
    1 day ago
  •  ...Lead AI Engineer – Salesforce Lead AI Engineer at...  ...systems to improve from real‑world outcomes...  ...agents that combine LLM reasoning, tool usage...  ...and near real‑time) for training, evaluation, and inference Transform raw interaction...  ..., revenue impact, latency, etc.) Use... 

    Salesforce.Com Inc

    Seattle, WA
    5 days ago
  • $172.5k - $260.1k

     ...Category Software Engineering Job Details...  ...is the #1 AI CRM, where humans...  ...performance over time. This is an...  ...to improve from real-world outcomes...  ...agents that combine LLM reasoning, tool...  ..., and inference Transform raw...  ...revenue impact, latency, etc.) Use production... 

    Salesforce.Com Inc

    Seattle, WA
    4 days ago
  • $103.2k - $203.4k

     ...forward! Build AI that matters ....  ...measure success with latency, reliability,...  .../policy design, LLM evaluation, and...  ...; target low hallucination, tight...  .... Operate in real world constraints...  ...services or on prem inference stacks...  ...; mentorship of engineers. Clear communication... 
    Live in
    Work at office
    Local area

    Accenture

    Seattle, WA
    2 days ago
  • $171.6k - $302.2k

     ...and shape how AI fundamentally transforms...  ..., all the time, through...  ...accelerate our Data Engineering and Data...  ...performance and inference-quality efficiency...  ...decisions, optimizing latency, throughput and...  ...+ years taking LLM or agentic...  ...Kafka Streams) for real‑time data and... 
    Worldwide
    Relocation

    Apple

    Seattle, WA
    1 day ago
  • $168k - $252k

    Senior Software Engineer - Real Time Systems Develop real-time software systems for autonomous vehicles...  ...of systems is powered by Lattice OS, an AI-powered operating system that turns...  ...engineering school, etc. If you've succeeded in a low structure, high autonomy environment you... 
    Full time
    Work experience placement
    Local area
    Relocation package

    jobs.frontdoordefense.com - Jobboard

    Seattle, WA
    4 days ago
  • $96.8k - $306.4k

     ...The Senior Principal AI Agent / ML Software Engineer is a Senior Staff-...  ...workflows, scalable inference infrastructure, and...  ...services optimized for low latency, high throughput,...  ...Deep understanding of LLM application patterns...  ...company match 8. Paid time off: Flexible... 
    Temporary work
    Flexible hours

    Oracle

    Seattle, WA
    3 days ago
  •  ...startup on a mission to reinvent AI inference infrastructure from the...  ...Inference Infrastructure Software Engineer to own and evolve the cloud...  ..., with strong SLAs around latency, throughput, and availability...  ...paid by employer) Flexible Time Off (FTO) Paid parental leave... 
    Work at office
    Flexible hours
    3 days per week

    ElastixAI INC.

    Seattle, WA
    23 hours ago
  • $160k - $200k

     ...Employment Type Full time Location Type Hybrid Department Engineering Compensation $160K -...  ...with a passion for AI and a strong track record...  ...build, and operate LLM powered systems that...  ...integrated into real product workflows. You...  ..., context limits, latency, and cost when designing... 
    Full time
    Remote work
    Flexible hours

    Madrona Venture Labs

    Seattle, WA
    2 days ago
  • $184.5k

     ...travel perks, generous time-off, parental leave, a...  ...Senior Machine Learning Engineer to join our high-performing...  ...large-scale batch and real-time ML systems that...  ...and validation, scalable inference, monitoring, drift...  ...scaling production ML and AI systems, including LLMs... 
    Local area
    Flexible hours

    Expedia Group

    Seattle, WA
    2 days ago
  • $231k

     ...perks, generous time-off, parental...  ...world class engineering and machine...  ...signals into real time decisions...  ...friction low. What makes this...  ...of risk with AI on a global scale...  ...for low-latency risk decisioning...  ..., online inference, monitoring/retraining...  ...LLM/agentic techniques... 
    Local area
    Flexible hours

    PowerToFly

    Seattle, WA
    5 days ago
  • $119.85k - $162.15k

    The Boeing Company is looking for a Software Engineer - Data Acquisition .NET, C++ to join its team in Seattle, Washington. This role...  ...Engineering or a related field, along with experience in C/C++, real-time systems, and data acquisition. Competitive pay and benefits... 

    The Boeing Company

    Seattle, WA
    2 days ago
  • A tech company specializing in AR and VR is seeking a Senior Augmented and Virtual Reality Software Engineer to create real-time 3D and XR applications. The ideal candidate will have extensive experience in building interactive applications using C# and C++, and will work... 

    Rivet Industries, Inc.

    Bellevue, WA
    2 days ago
  •  ...Salesforce is the #1 AI CRM, where...  ..., deployment, inference, and monitoring...  ...deployment strategies, real time feature serving...  ...for the low level systems...  ...platform supports LLM efficiency and...  ...and product engineering teams.## About...  ...high throughput, latency sensitive workloads... 
    Temporary work

    Salesforce

    Seattle, WA
    2 days ago
  • A technology company in the United States is seeking an Augmented and Virtual Reality Software Engineer to build interfaces for real-time 3D and XR applications. This role involves creating frontend systems using C#, C++, and frameworks like Unity or StereoKit, while leveraging... 

    Rivet Industries, Inc.

    Bellevue, WA
    2 days ago
  • $255k - $405k

     ...important to us than unfettered growth. The Engineering team manages a massive fleet of GPUs...  ...stack development and has experience with Real-Time Communication (RTC). Candidates should...  ...not required. About OpenAI OpenAI is an AI research and deployment company... 

    Slope

    Seattle, WA
    9 hours ago
  • Description About Slack AI Slack AI's...  ..., deployment, inference, and monitoring...  ...strategies, real time feature serving...  ...responsible for the low level systems...  ...supports LLM efficiency and...  ...infrastructure and product engineering teams. About...  ...throughput, latency sensitive... 
    Temporary work

    B Capital

    Seattle, WA
    9 hours ago
  • Rivet Industries, Inc. is seeking an early/mid-career XR Software Engineer to work on real-time 3D applications for embedded Linux and Android devices. In this role, you'll collaborate with senior engineers on C# and Unity projects, focusing on developing and shipping scoped... 

    Rivet Industries, Inc.

    Bellevue, WA
    9 hours ago
  •  ...What you'll do: Lead AI Engineer in the Platforms and...  ...evaluating production-grade LLM systems , including...  ..., and scalable inference pipelines. Design...  ...inference workflows for latency , GPU utilization ,...  ...being, financial future, time away, and professional... 
    Work at office
    Worldwide

    ZS

    Bellevue, WA
    2 days ago
  •  ...in Seattle is looking for a Senior Mobile Engineer to take charge of mobile app development....  ...development, particularly in building real-time audio applications with strong communication...  ..., and various benefits, contributing to impactful AI technologies.... 

    Read AI, Inc.

    Seattle, WA
    1 day ago
  • $172.5k - $313.7k

     ...Category Software Engineering Job Details...  ...Salesforce is the #1 AI CRM, where humans with...  ...scaffolding that wraps LLM calls, manages tool...  ...be tested against real traces before going...  ...rates, latency, and cost Build...  ...quality metrics over time; own the signal that... 

    Salesforce.Com Inc

    Bellevue, WA
    5 days ago
  • $190k - $255k

     ...hiring a Senior Software Engineer - ML to join our Applied AI team at Supio. You'll...  ...strong applied ML/LLM experience but not...  ...agentic systems for real-world use cases. Own...  ...to measure accuracy, latency, and reliability....  ...reducing processing time. Compensation... 
    Remote work
    Flexible hours

    Supio

    Seattle, WA
    1 day ago
  • A cloud technology company is looking for a Senior Engineer 2 to enhance their AI Inference Optimization team. In this role, you will drive architectural decisions that improve throughput and reduce latency in large models. Candidates should have over 5 years of experience... 
    Remote job

    DigitalOcean

    Seattle, WA
    4 days ago
  • SupportFinity™ is looking for iOS Engineers to develop and maintain a large-scale infrastructure system that supports a three-sided marketplace...  ...new features, improving existing code, and addressing real-time data challenges. Candidates should have experience in Swift... 

    SupportFinity™

    Seattle, WA
    6 days ago
