Member of Technical Staff - Edge Inference Engineer

Liquid AI

Overview

About Liquid AI

Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity

Our Edge Inference team compiles Liquid Foundation Models into optimized machine code that runs on resource-constrained devices: phones, laptops, Raspberry Pis, and watches. We are core contributors to llama.cpp and build the infrastructure that makes efficient on-device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures and hardware constraints. This is high-ownership work where your code ships to production and directly impacts model performance on real devices. While San Francisco and Boston are preferred, we are open to other locations.

What We're Looking For

We need someone who:
  • Works autonomously: Given a target device and performance goal, you figure out how to get there without hand-holding. You diagnose bottlenecks, prototype solutions, and iterate until you hit the target.
  • Thinks at the hardware level: You understand cache hierarchies, memory access patterns, and instruction-level optimization. You can reason about why code is slow before reaching for a profiler.
  • Bridges ML and systems: You understand how neural networks work mathematically (matrix operations, attention mechanisms, quantization effects) and can translate that understanding into optimized implementations.
  • Ships production code: Our work goes upstream to open-source projects and deploys to customer devices. You write code that others can maintain and extend.

The Work
  • Implement and optimize inference kernels for CPU, NPU, and GPU architectures across diverse edge hardware
  • Develop quantization strategies (INT4, INT8, FP8) that maximize compression while preserving model quality under strict memory budgets
  • Contribute to llama.cpp and other open-source inference frameworks, including new model architectures (audio, vision)
  • Profile and optimize end-to-end inference pipelines to achieve sub-100ms time-to-first-token on target devices
  • Collaborate with ML researchers to understand model architectures and identify optimization opportunities specific to Liquid Foundation Models

Must-have
  • 5+ years of experience in systems programming with strong C++ proficiency

Desired Experience
  • Embedded software engineering experience or work on resource-constrained systems
  • Understanding of ML fundamentals at the linear algebra level (how matrix operations, attention, and quantization work)
  • Experience with hardware architecture concepts: cache hierarchies, memory bandwidth, SIMD/vectorization

Nice-to-have
  • Contributions to llama.cpp, ExecuTorch, or similar inference frameworks
  • Experience with Rust for systems programming
  • Background in custom accelerator development (TPU, NPU) or work at companies like SambaNova, Cerebras, Groq, or Google/Amazon accelerator teams
  • Quantitative degree (mathematics, physics, or similar) combined with engineering experience

What Success Looks Like (Year One)
  • Ship optimizations that achieve measurable latency or memory improvements on at least one target edge device class
  • Successfully upstream at least one significant contribution to llama.cpp (new architecture support, kernel optimization, or quantization improvement)
  • Own a major workstream end-to-end, such as new model architecture support, quantization pipeline for a device constraint, or target platform enablement

What We Offer
  • Rare technical challenges: Work on novel model architectures that require custom optimization strategies. Your code ships to production and runs on real devices.
  • Compensation: Competitive base salary with equity in a unicorn-stage company
  • Health: We pay 100% of medical, dental, and vision premiums for employees and dependents
  • Financial: 401(k) matching up to 4% of base pay
  • Time Off: Unlimited PTO plus company-wide Refill Days throughout the year

Vacancy posted 3 days ago
