AI Inference Engineer

$110k - $270k

Full-time

quadric, Inc

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Role

The AI Inference Engineer at Quadric is the key bridge between the world of AI/LLM models and Quadric's unique platforms. The engineer will (1) port AI models to the Quadric platform; (2) optimize model deployment for efficient inference; and (3) profile and benchmark model performance. This senior technical role demands deep knowledge of AI model algorithms, system architecture, and AI toolchains and frameworks.

This California Bay Area based role follows a hybrid schedule, with at least two in-office days per week at our Burlingame office, the ability to commute regularly, and occasional additional onsite days as needed based on team and business priorities. The team and company also gather periodically for onsite meetings and offsite events, which are valued opportunities to connect, collaborate, and align.

Responsibilities

Quantize, prune, and convert models for deployment
Port models to the Quadric platform using the Quadric toolchain
Optimize inference deployment for latency and speed
Benchmark and profile model performance and accuracy
Collaborate across related areas of the AI inference stack to support team and business priorities
Develop tools to scale and speed up deployment
Make improvements to the SDK and runtime
Provide technical support and documentation to customers and the developer community

Requirements

Bachelor’s or Master’s degree in Computer Science and/or Electrical Engineering

5+ years of experience with AI/LLM model inference and deployment frameworks and tools
Experience with model quantization (PTQ, QAT) and related tools
Experience with model accuracy metrics
Experience with model inference performance profiling
Experience with at least one of the following frameworks: ONNX Runtime, PyTorch, vLLM, Hugging Face Transformers, Neural Compressor, llama.cpp
Proficiency in C/C++ and Python
Strong problem-solving, debugging, and communication skills

Benefits

At Quadric, we value Integrity, Humility, and Happiness. What we expect from one another is simple and clear: Initiative, Collaboration, and Completion. We are a collaborative team focused on building something extraordinary in the edge computing space.

Competitive salary and meaningful equity
Medical, dental, and vision plans starting on day one
401(k) retirement plan
Flexible paid time off (unlimited, non-accrual) to support work-life balance
When working in-office, enjoy company-provided lunches and a stocked kitchen
Convenient office location within walking distance of the Caltrain station
Support for commuting, including monthly parking or Caltrain passes
Downtown Burlingame office location, close to shops, cafes, and local amenities
A politics-free, highly collaborative environment where talented people can do their best work and make an immediate impact
The opportunity to build long-term career relationships in a company that values strong personal connections alongside professional excellence

The base salary range for this position is $110,000 to $270,000. This range reflects the full span of levels and geographies at which Quadric hires for this role. The actual base salary offered will depend on a number of factors, including the specific level of the role, years and depth of relevant experience, technical skills and competencies, the criticality of the role to the business, internal equity, and work location. In addition to base salary, this role is eligible for equity and a discretionary annual performance bonus as applicable to the role and level.

Quadric also offers the generous benefits package outlined above and other programs designed to support your health and wellbeing.

Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Quadric aims to empower developers in every industry with superpowers to create tomorrow’s technology, today. The company was co-founded by technologists from MIT and Carnegie Mellon, who were previously the technical co-founders of the Bitcoin computing company 21.

Quadric is proud to be an equal opportunity employer. We are committed to creating an inclusive environment where people from all backgrounds can do their best work. We consider all qualified applicants without regard to race, color, religion, sex, gender identity or expression, sexual orientation, national origin, age, disability, veteran status, or any other protected characteristic under applicable law.

If this role resonates with you, we encourage you to apply even if your experience does not perfectly match every qualification. We value potential, curiosity, and a willingness to learn just as much as direct experience. Skills and growth come in many forms, and we would love to hear your story.

By submitting an application, you acknowledge that Quadric will collect and process your personal information as part of the hiring process. Please review our Privacy Policy to understand how we handle your data.

Vacancy posted more than 2 months ago
