Member of Technical Staff — Model Optimization and Inference (New Grad)

$200k - $300k

Nuance Labs

About Nuance Labs

Nuance Labs is building photorealistic, real-time AI avatars with emotional intelligence: a full-duplex audiovisual system that can listen, speak, react, interrupt, and respond like a real person.

About The Role

We can train a great model, but the next problem is making it fast enough to actually use in a real-time conversation. A model that responds in 3 seconds is a demo; a model that responds in under 500 ms is a product.

We’re looking for someone who’s excited about taking trained models and squeezing every last millisecond out of them. You understand (or want to deeply understand) the full stack from model weights to serving infrastructure: quantization, KV cache optimization, kernel-level acceleration, and batching strategies. You’ve worked with vLLM, SGLang, or similar frameworks (through coursework, research, internships, or open-source) and have opinions about where they fall short.

This posting is aimed at early-career engineers finishing or recently finished with a BS, MS, or PhD. We don’t require a PhD; we care about systems intuition, engineering chops, and the appetite to go deep.

What You’ll Do

  • Contribute to end-to-end inference optimization across our model stack: LLMs, audio models, and diffusion-based components
  • Implement and tune KV cache strategies for long-context conversations, including eviction policies, compression, and memory-efficient attention
  • Work with inference serving frameworks (vLLM, SGLang, TensorRT-LLM, etc.) and extend them for our specific workloads
  • Profile and benchmark end-to-end latency and throughput; identify and systematically eliminate bottlenecks
  • Build internal tooling that makes optimization work faster and more rigorous: profiling viewers, end-to-end inference test harnesses, and other infrastructure that helps the team move quickly
  • Accelerate diffusion model inference: consistency models, step distillation, caching strategies, and custom kernel optimizations
  • Apply quantization techniques (INT8, INT4, GPTQ, AWQ, and beyond) to reduce memory footprint and increase throughput without meaningfully degrading quality
  • Work closely with research and infrastructure to ensure new models ship with optimized serving from day one

What We’re Looking For

  • BS, MS, or PhD in CS, ML, or a related field, completed or in the final stretch
  • Strong fundamentals in LLM inference or ML systems (KV caching, memory layout, attention kernels, batching, or serving), picked up through coursework, research, internships, or open-source. You don’t need to have shipped at production scale yet; you do need to learn fast and go deep.
  • Exposure to inference serving frameworks (vLLM, SGLang, TensorRT-LLM, or similar), even at a research or hobby level
  • Strong Python and PyTorch skills; familiarity with CUDA or Triton is a significant plus
  • A systematic approach to profiling and optimization: you measure first, then optimize
  • Curiosity about diffusion inference, speculative decoding, quantization, or other inference-time acceleration techniques

Bonus Points

  • Internship or research experience with LLM inference, ML systems, or model serving
  • Contributions to open-source inference frameworks (vLLM, SGLang, TensorRT-LLM, etc.)
  • CUDA / Triton kernel work, even at a research or hobby scale
  • Publications or research projects in MLSys, model compression, or inference optimization
  • Familiarity with multimodal or streaming inference architectures
  • Experience with hard latency SLAs in any real-time system

Compensation

$200,000 – $300,000 base salary, plus meaningful equity. We think long-term ownership matters and structure equity accordingly.

Logistics

  • Location: In-person in Seattle, five days a week. We believe in the compounding value of working shoulder-to-shoulder.
  • Visa sponsorship: We sponsor visas (O-1, H-1B, green card) from day one.
  • AI-native tooling: Do your best work with the best tools, including unlimited tokens.

Benefits

  • Health: HSA plan with ~$2,000 in annual company contributions, roughly 2× what most big tech companies put in.
  • Time off: 15 days of PTO plus public holidays, and we close the office for a full week at year-end.
  • Food: Lunch, drinks, and snacks on us every workday, the small thing that quietly makes the day better.
  • Commuter benefits: We help cover the cost of getting to the office.
  • 401(k): In the works.

Nuance Labs is an equal opportunity employer. We believe diverse teams build better AI.

Vacancy posted 3 days ago
