AI Model Optimization Architect

$158.4k - $237.6k

Qualcomm

Company: Qualcomm Technologies, Inc.
Job Area: Engineering Group, Engineering Group > Machine Learning Engineering

General Summary

Qualcomm is leveraging its strengths in compute, connectivity, and AI acceleration to play a central role in the evolution of Cloud AI. The Qualcomm Cloud AI team develops hardware and software platforms enabling efficient inference of large-scale foundation models. We are seeking a Staff Engineer – AI Model Optimization Architect to lead end-to-end model transformation and optimization for LLMs, VLMs, diffusion, and multimodal models on Qualcomm inference accelerators. This role works closely with compiler, performance, and accuracy teams to translate models into accelerator-efficient execution while balancing throughput, latency, memory, and quality. The scope spans Day 0 enablement through production deployment, with a strong emphasis on scaling optimizations to future architectures.

Key Responsibilities

Architect and deliver model optimization strategies that transform PyTorch models for efficient inference on Qualcomm accelerators.
Drive graph capture and deployment using PyTorch, ONNX, and torch.compile, including model rewrites and graph-level transformations.
Design and implement fusion kernels using DSL-based approaches (e.g., Triton), enabling fused operations and performance-critical algorithmic rewrites.
Partner deeply with compiler, performance, and accuracy teams to co-design lowering strategies, kernel fusion, layout decisions, and runtime integration.
Profile and optimize LLM/VLM/diffusion inference for throughput and latency across batch sizes, sequence lengths, and serving modes.
Own transformer-specific optimizations including KV cache management, decoding behavior, and long-context performance.
Enable and optimize continuous batching (dynamic/iteration-level scheduling), understanding its impact on memory, scheduling, and tail latency.
Architect and scale distributed inference strategies (e.g., sharding and parallelism) across multi-core and multi-device systems.
Establish reusable approaches to scale model optimizations to new hardware architectures, creating robust patterns and tooling.
Debug complex performance or stability issues to root cause and drive production-ready solutions.

Required Qualifications

Expert-level proficiency in PyTorch and inference-focused model optimization; strong Python engineering skills.
Hands-on experience with torch.compile / TorchDynamo or related graph capture and compilation workflows.
Deep understanding of transformer architectures, attention mechanisms, MoEs, and performance trade-offs.
Practical experience with KV cache behavior, serving-time optimizations, and memory/performance trade-offs.
Strong foundation in computer architecture, ML accelerators, and distributed systems.
Proven ability to lead cross-functional technical efforts and influence design decisions.
MS in Computer Science, Machine Learning, Computer Engineering, or Electrical Engineering, or equivalent experience.

Preferred / Bonus Qualifications

Experience developing fusion kernels using Triton or similar DSLs, and collaborating with ML compiler teams.
Familiarity with LLM serving stacks and continuous batching systems.
Background in numerical methods, performance/accuracy trade-off analysis, or evaluation frameworks.
PhD in a relevant field.

Minimum Qualifications

Bachelor's degree in Computer Science, Engineering, Information Systems, or related field and 4+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
Master's degree in Computer Science, Engineering, Information Systems, or related field and 3+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
PhD in Computer Science, Engineering, Information Systems, or related field and 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.

EEO Employer

Qualcomm is an equal opportunity employer. If you are an individual with a disability and need an accommodation during the application/hiring process, rest assured that Qualcomm is committed to providing an accessible process. You may contact Qualcomm by e-mail or by calling its toll-free number. Upon request, Qualcomm will provide reasonable accommodations to support individuals with disabilities to be able to participate in the hiring process. Qualcomm is also committed to making our workplace accessible for individuals with disabilities. (Keep in mind that this email address is used to provide reasonable accommodations for individuals with disabilities. We will not respond here to requests for updates on applications or resume inquiries.)

Pay Range And Other Compensation & Benefits

$158,400.00 - $237,600.00

The above pay scale reflects the broad, minimum to maximum, pay scale for this job code for the location for which it has been posted. Even more importantly, please note that salary is only one component of total compensation at Qualcomm. We also offer a competitive annual discretionary bonus program and opportunity for annual RSU grants (employees on sales-incentive plans are not eligible for our annual bonus). In addition, our highly competitive benefits package is designed to support your success at work, at home, and at play. Your recruiter will be happy to discuss all that Qualcomm has to offer, and you can review more details about our US benefits at this link. If you would like more information about this role, please contact Qualcomm Careers.

Vacancy posted 4 days ago

Similar jobs that could be interesting for you
Based on the AI Model Optimization Architect in San Diego, CA vacancy

Staff AI Model Optimization Architect for Transformers
A leading technology company in San Diego is looking for a Staff Engineer - AI Model Optimization Architect to lead model transformation for large-scale AI models. The role involves optimizing inference on Qualcomm's accelerators and requires expert knowledge in PyTorch...
Suggested
Qualcomm
San Diego, CA
1 day ago
Staff Machine Learning Engineer - Model Optimization & Quantization
$158.4k - $237.6k
...Summary: About the Role Join the Qualcomm AI Hub team and help developers integrate machine learning... ...In this role you will develop tools to help developers optimize and deploy machine learning models on edge and mobile hardware. AIMET is Qualcomm's open-...
Suggested
Work experience placement
Immediate start
Work from home
Qualcomm
San Diego, CA
2 days ago
Senior Staff ML Engineer - Edge AI & Model Optimization
$178.4k - $267.6k
...Engineer in San Diego, California, to work with cutting-edge AI technologies and frameworks. The ideal candidate will have... ...with generative AI workflows. Responsibilities include architecting model optimization techniques and collaborating with various teams. The role...
Suggested
Stryker Corporation
San Diego, CA
4 days ago
Senior AI Inference Engineer - Model Optimization & Deployment
...a multi-modality foundation model to drive the next generation... ...intelligence. As a Model Optimization & Deployment Engineer, you will... ...-tuning (LoRA, QLoRA). Architect and implement model conversion... ...bandwidth and minimize latency on AI accelerators. Write...
Suggested
Temporary work
Relocation package
Zoox
San Diego, CA
3 days ago
Robotics AI Model Optimization Engineer (On-Device)
Qualcomm is looking for a Physical AI Model Optimization Engineer in San Diego to optimize AI models for robotic applications. You will work with advanced models, applying Qualcomm’s toolchains to ensure optimal deployment on Snapdragon chipsets. This role demands a strong...
Suggested
Nutanix
San Diego, CA
4 days ago
AI Accuracy Architect
$158.4k - $237.6k
...leadership in compute, connectivity, and AI acceleration to play a central role... ...inference of large scale foundation models. We are seeking a Staff Engineer – AI Accuracy Architect to lead accuracy centric architecture and optimization for LLMs, VLMs, and emerging...
Work experience placement
Work from home
Qualcomm
San Diego, CA
6 days ago
Senior AI Performance Architect
$126.7k - $217.9k
...AI Accelerator Architecture Engineer Today, more intelligence... ...the largest of today's models. The AI Architecture team... ...algorithm development, kernel optimization, down to hardware accelerator... ...accelerator and GPU architectures Architect enhancements required for...
Work experience placement
Flexible hours
Night shift
Qualcomm
San Diego, CA
3 days ago
IT Architect - SAP Finance
$152.6k - $228.8k
...with Finance leadership to identify opportunities for process optimization and leverage technology to achieve business goals.**Requirements... ...Illumina prohibits the use of generative artificial intelligence (AI) in the application and interview process. If you require...
For contractors
Local area
Illumina
San Diego, CA
1 day ago
GPU Performance Architect - Mobile GPUs & RTL Optimization
$124k - $208.4k
A global technology leader seeks a Senior Engineer, GPU Architect to optimize and analyze mobile GPU performance. Within a creative and supportive environment, you'll leverage over 5 years of experience in GPU architecture, performance analysis, and programming with C/C++...
Samsung Semiconductor
San Diego, CA
2 days ago
ServiceNow - ServiceNow AI Architect Senior Manager- Tech Consulting - Open Location
$171.6k - $392.1k
...working world. ServiceNow – ServiceNow AI Architect Senior Manager In the digital... ...Experience in waterfall and agile delivery models – including supporting management... ...models Proficiency in developing and optimizing data pipelines for AI-driven solutions...
Summer holiday
Worldwide
Flexible hours
EY
San Diego, CA
2 days ago
AI Accuracy Architect for LLM/VLM Inference
$158.4k - $237.6k
A leading technology company in San Diego is seeking a Staff Engineer to focus on AI accuracy architecture for large language models. This senior technical role will involve optimizing model quality, performance, and power across Qualcomm platforms. The successful candidate...
Qualcomm
San Diego, CA
11 hours ago
Senior AI Platform Architect - Windows & Snapdragon
$200.8k - $301.2k
A leading technology firm in San Diego seeks a Senior Windows Platform Architect to enhance AI systems on the Snapdragon platform. You will drive performance and power optimizations, ensuring competitive developer experiences. The ideal candidate has extensive AI architecture...
Qualcomm
San Diego, CA
1 day ago
Senior Satellite Resource Optimization Architect
$139.5k - $258.1k
Apple Inc. in San Diego seeks a Senior Engineer for its Satellite Communications Group. This role focuses on optimizing satellite resource usage and operates in a dynamic environment, handling design and implementation challenges. The successful candidate will have deep...
Apple Inc.
San Diego, CA
1 day ago
AI Marketing Architect, Performance & Creative
Intuit Inc. is looking for an AI Marketing Manager to join its Marketing Futures Team. This role will own the strategy and execution... ...marketing campaigns, aiming to enhance creative production and optimize performance metrics. The ideal candidate will have 4-6 years of...
Intuit Inc.
San Diego, CA
4 days ago
AI Transformation Architect for Internal Systems (Remote)
A leading IT solutions provider in California is seeking an experienced AI Transformation Consultant to optimize internal systems through AI. The consultant will assess workflows, identify AI opportunities, and implement scalable automation solutions. Candidates should...
Remote job
GigaKOM
San Diego, CA
2 hours ago
Platform Architect/AWS solution Architect
...Job Title: Platform Architect/AWS solution Architect Location: Onsite... ...experience with AWS AI/ML services (SageMaker, Bedrock... ...Databricks AI capabilities (Model Serving, Feature Store, MLflow... ..., re-architecting). • Cost Optimization: Ability to design cost-effective...
Shift work
Jobs via Dice
San Diego, CA
2 days ago
ServiceNow - ServiceNow AI Architect Senior Manager- Tech Consulting - Open Location
$171.6k - $392.1k
...working world. ServiceNow - ServiceNow AI Architect Senior Manager In the digital economy,... ...in waterfall and agile delivery models - including supporting management activities... ...models Proficiency in developing and optimizing data pipelines for AI‑driven solutions...
Summer holiday
Worldwide
Flexible hours
Ernst & Young Oman
San Diego, CA
11 hours ago
AI Model Systems Software Engineer
$122.8k - $184.2k
...Summary: We are looking for an AI Performance System Software... ...evolve the benchmarking and optimization of reference AI networks that... ...leads, software, and hardware architects. Ideal candidate has knowledge... ...methods. Knowledge of state of the art in AI models for one or more of the...
Work from home
Nutanix
San Diego, CA
11 hours ago
Senior Agentic AI Architect/Engineer
$146.88k - $220.32k
...Job Description As a Senior Agentic AI Architect/Engineer, you will be a technical leader... .... You have experience leading projects, optimizing ML systems for performance, and building... ...Performance Optimization: Increase model serving layers and high-throughput data...
Work experience placement
Remote work
Visa sponsorship
Work visa
Flexible hours
Rockwell Automation
San Diego, CA
19 hours ago
Blue Team AI Benchmark Architect
...benchmarks for defense. The ideal candidate will have hands-on experience in blue-team activities including threat hunting and incident response, along with strong scripting and cloud skills. Join us to help fill a critical gap in AI security evaluations...
Mercor Inc
El Cajon, CA
2 days ago
Principal AI Architect
...approach. Teradata delivers real business value with AI. What You'll Do As the Principal AI Architect for Teradata AI Studio, you will define the... ...integrates with Teradata Vantage's query engine, model registry, feature store, and agent harness. You will...
Flexible hours
Teradata
San Diego, CA
4 days ago
Principal SOC Architect, AI Accelerators (Multiple Locations)
$172.8k - $304.9k
...Group, Engineering Group ASICS Engineering General Summary: Qualcomm is looking for an experienced SoC architect to work on the next generation AI products in the datacenter. We are looking for a data center engineer whose expertise spans security, RAS and/...
Work experience placement
Work from home
Qualcomm
San Diego, CA
3 days ago
Agentic AI Architect — Healthcare RCM Platform
$185k - $215k
...healthcare technology company in San Diego is seeking an experienced AI Architect to design and implement AI platforms that streamline complex... .... This role demands significant expertise in large language models and distributed AI systems. Ideal candidates will possess...
XiFin, Inc.
San Diego, CA
2 days ago
AI-Driven Vocational Training Architect
Mercor is seeking experienced vocational and workforce-training professionals to help develop digital workspaces for AI evaluation as part of Project Atlas. Ideal candidates should have at least 3 years of experience in relevant fields such as curriculum design and apprenticeship...
Hourly pay
Apprenticeship
Mercor
San Diego, CA
1 day ago
Campaign Architect for Adobe Journey Optimizer Campaigns
A global professional services company in San Diego is seeking a Marketing Services Campaign Manager to oversee campaign execution. The candidate will develop strategies within marketing automation platforms and liaise with clients to enhance marketing effectiveness. Ideally...
Accenture
San Diego, CA
3 days ago
ServiceNow- ServiceNow AI Architect Manager - Tech Consulting - Open Location
$142.6k - $261.5k
...help to build a better working world. ServiceNow– ServiceNow AI Architect Manager In the digital economy, it takes more than good... ..., including crafting effective prompts for large language models (LLMs) and optimising AI responses for business context. Demonstrated...
Summer holiday
Worldwide
Flexible hours
EY
San Diego, CA
2 days ago
AI Transformation Architect
A global technology leader is seeking a Cloud Architect in San Diego, California. This role focuses on solving complex enterprise AI transformation challenges, designing multi-agent systems, and leveraging cutting-edge AI technologies. Candidates should have a minimum...
Accenture
San Diego, CA
1 day ago
AI Workflow Architect - Onsite, 4‑Month Project
AEMI Holdings, LLC in San Diego, CA, is seeking an experienced AI Workflow Specialist for a part-time temporary role lasting 4 months. The specialist will analyze current processes and implement AI-enabled workflows to improve efficiency across QA and business functions...
Temporary work
Part time
AEMI Holdings, LLC
San Diego, CA
4 days ago
AI-GPU Architect: Next-Gen ML/LLM Hardware
$161.8k - $242.6k
A leading technology company in California seeks innovative GPU architects to design next-generation GPU architectures. The role involves collaboration with various teams to advance Artificial Intelligence and Machine Learning capabilities across platforms. Candidates...
Qualcomm
San Diego, CA
3 days ago
AI Architect
$185k - $215k
...interested in harnessing technology and AI to transform healthcare? At XiFin, we believe... ...real difference. About The Role The AI Architect will lead the design and implementation... ...that combine large language models (LLMs), knowledge graphs, retrieval systems...
Flexible hours
XiFin, Inc.
San Diego, CA
2 days ago
