Member of Technical Staff — Model Optimization and Inference (New Grad)

$200k - $300k

Nuance Labs

About Nuance Labs

Nuance Labs is building photorealistic, real-time AI avatars with emotional intelligence: a full-duplex audiovisual system that can listen, speak, react, interrupt, and respond like a real person.

About The Role

We can train a great model, but the next problem is making it fast enough to actually use in a real-time conversation. A model that responds in 3 seconds is a demo; a model that responds in under 500 ms is a product.

We’re looking for someone who’s excited about taking trained models and squeezing every last millisecond out of them. You understand, or want to deeply understand, the full stack from model weights to serving infrastructure: quantization, KV cache optimization, kernel-level acceleration, and batching strategies. You’ve worked with vLLM, SGLang, or similar frameworks (through coursework, research, internships, or open-source) and have opinions about where they fall short.

This posting is aimed at early-career engineers finishing or recently finished with a BS, MS, or PhD. We don’t require a PhD; we care about systems intuition, engineering chops, and the appetite to go deep.

What You’ll Do

- Contribute to end-to-end inference optimization across our model stack: LLMs, audio models, and diffusion-based components
- Implement and tune KV cache strategies for long-context conversations, including eviction policies, compression, and memory-efficient attention
- Work with inference serving frameworks (vLLM, SGLang, TensorRT-LLM, etc.) and extend them for our specific workloads
- Profile and benchmark end-to-end latency and throughput; identify and systematically eliminate bottlenecks
- Build internal tooling that makes optimization work faster and more rigorous: profiling viewers, end-to-end inference test harnesses, and other infrastructure that helps the team move quickly
- Accelerate diffusion model inference: consistency models, step distillation, caching strategies, and custom kernel optimizations
- Apply quantization techniques (INT8, INT4, GPTQ, AWQ, and beyond) to reduce memory footprint and increase throughput without meaningfully degrading quality
- Work closely with research and infrastructure to ensure new models ship with optimized serving from day one

What We’re Looking For

- BS, MS, or PhD in CS, ML, or a related field, completed or in the final stretch
- Strong fundamentals in LLM inference or ML systems (KV caching, memory layout, attention kernels, batching, or serving), picked up through coursework, research, internships, or open-source. You don’t need to have shipped at production scale yet; you do need to learn fast and go deep.
- Exposure to inference serving frameworks (vLLM, SGLang, TensorRT-LLM, or similar), even at a research or hobby level
- Strong Python and PyTorch skills; familiarity with CUDA or Triton is a significant plus
- A systematic approach to profiling and optimization: you measure first, then optimize
- Curiosity about diffusion inference, speculative decoding, quantization, or other inference-time acceleration techniques

Bonus Points

- Internship or research experience with LLM inference, ML systems, or model serving
- Contributions to open-source inference frameworks (vLLM, SGLang, TensorRT-LLM, etc.)
- CUDA / Triton kernel work, even at a research or hobby scale
- Publications or research projects in MLSys, model compression, or inference optimization
- Familiarity with multimodal or streaming inference architectures
- Experience with hard latency SLAs in any real-time system

Compensation

$200,000 – $300,000 base salary, plus meaningful equity. We think long-term ownership matters and structure equity accordingly.

Logistics

- Location: In-person in Seattle, five days a week. We believe in the compounding value of working shoulder-to-shoulder.
- Visa sponsorship: We sponsor visas (O-1, H-1B, green card) from day one.
- AI-native tooling: Do your best work with the best tools, including unlimited tokens.

Benefits

- Health: HSA plan with ~$2,000 in annual company contributions, roughly 2× what most big tech companies put in.
- Time off: 15 days of PTO plus public holidays, and we close the office for a full week at year-end.
- Food: Lunch, drinks, and snacks on us every workday, the small thing that quietly makes the day better.
- Commuter benefits: We help cover the cost of getting to the office.
- 401(k): In the works.

Nuance Labs is an equal opportunity employer. We believe diverse teams build better AI.

Vacancy posted 3 days ago

