AI Inference Engineer Intern - Model Pruning
$45 - $60 per hour
Quadric, Inc
Quadric has created an innovative general purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today, which can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.
Note: Our preference is for this internship to be based out of our Burlingame, California office. Candidates should be based in the Bay Area or able to relocate for the internship period and available to work on site.
Responsibilities
- Model pruning: Prune the model to speed up inference with re-training to maintain accuracy.
Requirements
- MS student in CS or a related field.
- Proficiency in Python.
- Experience with model pruning and training in PyTorch.
- Experience with quantization and vision model accuracy metrics.
Benefits
At Quadric, we value Integrity, Humility, and Happiness. What we expect from one another is simple and clear: Initiative, Collaboration, and Completion. We are a collaborative team focused on building something extraordinary in the edge computing space.
The hourly rate for this temporary internship position is $45.00/hour to $60.00/hour. The actual rate offered will depend on a number of factors, including the specific level of the role, years and depth of relevant experience and education, technical skills and competencies, and work location.
Quadric interns receive hands-on experience working alongside industry experts in AI and semiconductor technology, with access to mentorship and meaningful project ownership from day one.
Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Quadric aims to empower developers in every industry with superpowers to create tomorrow’s technology, today. The company was co-founded by technologists from MIT and Carnegie Mellon, who were previously the technical co-founders of the Bitcoin computing company 21.
Quadric is proud to be an equal opportunity employer. We are committed to creating an inclusive environment where people from all backgrounds can do their best work. We consider all qualified applicants without regard to race, color, religion, sex, gender identity or expression, sexual orientation, national origin, age, disability, veteran status, or any other protected characteristic under applicable law.
If this role resonates with you, we encourage you to apply even if your experience does not perfectly match every qualification. We value potential, curiosity, and a willingness to learn just as much as direct experience. Skills and growth come in many forms, and we would love to hear your story.
By submitting an application, you acknowledge that Quadric will collect and process your personal information as part of the hiring process. Please review our Privacy Policy to understand how we handle your data.


