Member of Technical Staff - Edge Inference Engineer
Liquid AI
Overview

About Liquid AI

Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity

Our Edge Inference team compiles Liquid Foundation Models into optimized machine code that runs on resource-constrained devices: phones, laptops, Raspberry Pis, and watches. We are core contributors to llama.cpp and build the infrastructure that makes efficient on-device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures and hardware constraints. This is high-ownership work where your code ships to production and directly impacts model performance on real devices. While San Francisco and Boston are preferred, we are open to other locations.

What We're Looking For

We need someone who:

- Works autonomously: Given a target device and performance goal, you figure out how to get there without hand-holding. You diagnose bottlenecks, prototype solutions, and iterate until you hit the target.
- Thinks at the hardware level: You understand cache hierarchies, memory access patterns, and instruction-level optimization. You can reason about why code is slow before reaching for a profiler.
- Bridges ML and systems: You understand how neural networks work mathematically (matrix operations, attention mechanisms, quantization effects) and can translate that understanding into optimized implementations.
- Ships production code: Our work goes upstream to open-source projects and deploys to customer devices. You write code that others can maintain and extend.

The Work

- Implement and optimize inference kernels for CPU, NPU, and GPU architectures across diverse edge hardware
- Develop quantization strategies (INT4, INT8, FP8) that maximize compression while preserving model quality under strict memory budgets
- Contribute to llama.cpp and other open-source inference frameworks, including new model architectures (audio, vision)
- Profile and optimize end-to-end inference pipelines to achieve sub-100ms time-to-first-token on target devices
- Collaborate with ML researchers to understand model architectures and identify optimization opportunities specific to Liquid Foundation Models

Must-have

- 5+ years of experience in systems programming with strong C++ proficiency

Desired Experience

- Embedded software engineering experience or work on resource-constrained systems
- Understanding of ML fundamentals at the linear algebra level (how matrix operations, attention, and quantization work)
- Experience with hardware architecture concepts: cache hierarchies, memory bandwidth, SIMD/vectorization

Nice-to-have

- Contributions to llama.cpp, ExecuTorch, or similar inference frameworks
- Experience with Rust for systems programming
- Background in custom accelerator development (TPU, NPU) or work at companies like SambaNova, Cerebras, Groq, or Google/Amazon accelerator teams
- Quantitative degree (mathematics, physics, or similar) combined with engineering experience

What Success Looks Like (Year One)

- Ship optimizations that achieve measurable latency or memory improvements on at least one target edge device class
- Successfully upstream at least one significant contribution to llama.cpp (new architecture support, kernel optimization, or quantization improvement)
- Own a major workstream end-to-end, such as new model architecture support, quantization pipeline for a device constraint, or target platform enablement

What We Offer

- Rare technical challenges: Work on novel model architectures that require custom optimization strategies. Your code ships to production and runs on real devices.
- Compensation: Competitive base salary with equity in a unicorn-stage company
- Health: We pay 100% of medical, dental, and vision premiums for employees and dependents
- Financial: 401(k) matching up to 4% of base pay
- Time Off: Unlimited PTO plus company-wide Refill Days throughout the year

