Member of Technical Staff - Edge Inference Engineer

Liquid AI

Overview

About Liquid AI
Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity
Our Edge Inference team compiles Liquid Foundation Models into optimized machine code that runs on resource-constrained devices: phones, laptops, Raspberry Pis, and watches. We are core contributors to llama.cpp and build the infrastructure that makes efficient on-device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures and hardware constraints. This is high-ownership work where your code ships to production and directly impacts model performance on real devices. While San Francisco and Boston are preferred, we are open to other locations.

What We're Looking For
We need someone who:
Works autonomously: Given a target device and performance goal, you figure out how to get there without hand-holding. You diagnose bottlenecks, prototype solutions, and iterate until you hit the target.
Thinks at the hardware level: You understand cache hierarchies, memory access patterns, and instruction-level optimization. You can reason about why code is slow before reaching for a profiler.
Bridges ML and systems: You understand how neural networks work mathematically (matrix operations, attention mechanisms, quantization effects) and can translate that understanding into optimized implementations.
Ships production code: Our work goes upstream to open-source projects and deploys to customer devices. You write code that others can maintain and extend.

The Work
Implement and optimize inference kernels for CPU, NPU, and GPU architectures across diverse edge hardware
Develop quantization strategies (INT4, INT8, FP8) that maximize compression while preserving model quality under strict memory budgets
Contribute to llama.cpp and other open-source inference frameworks, including new model architectures (audio, vision)
Profile and optimize end-to-end inference pipelines to achieve sub-100ms time-to-first-token on target devices
Collaborate with ML researchers to understand model architectures and identify optimization opportunities specific to Liquid Foundation Models

Must-have
5+ years of experience in systems programming with strong C++ proficiency

Desired Experience
Embedded software engineering experience or work on resource-constrained systems
Understanding of ML fundamentals at the linear algebra level (how matrix operations, attention, and quantization work)
Experience with hardware architecture concepts: cache hierarchies, memory bandwidth, SIMD/vectorization

Nice-to-have
Contributions to llama.cpp, ExecuTorch, or similar inference frameworks
Experience with Rust for systems programming
Background in custom accelerator development (TPU, NPU) or work at companies like SambaNova, Cerebras, Groq, or Google/Amazon accelerator teams
Quantitative degree (mathematics, physics, or similar) combined with engineering experience

What Success Looks Like (Year One)
Ship optimizations that achieve measurable latency or memory improvements on at least one target edge device class
Successfully upstream at least one significant contribution to llama.cpp (new architecture support, kernel optimization, or quantization improvement)
Own a major workstream end-to-end, such as new model architecture support, quantization pipeline for a device constraint, or target platform enablement

What We Offer
Rare technical challenges: Work on novel model architectures that require custom optimization strategies. Your code ships to production and runs on real devices.
Compensation: Competitive base salary with equity in a unicorn-stage company
Health: We pay 100% of medical, dental, and vision premiums for employees and dependents
Financial: 401(k) matching up to 4% of base pay
Time Off: Unlimited PTO plus company-wide Refill Days throughout the year

Vacancy posted 4 days ago

