Software Development Manager, LLM Inference Model Enablement, Neuron SDK

$212.7k - $287.7k

Amazon Locker

Description

AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon Elastic Compute Cloud (EC2), to new product innovations that continue to set AWS's services and features apart in the industry.

We develop AWS Neuron, the complete software stack for Trainium, Amazon's custom cloud-scale machine learning accelerators. Come optimize LLMs such as Llama and GPT-OSS to run really fast on Trainium.

As the SDM for the LLM Inference Model Enablement team, you will lead a team of expert AI/ML engineers to onboard and optimize state-of-the-art open-source and customer LLMs, both dense and MoE, for inference with Neuron on Trainium and Inferentia accelerators. You will also drive improvements in model enablement speed and experience, while advancing inference usability and quality through inference features, infrastructure optimization, tools, and automation.

The ideal candidate will have a strong background in LLM model architectures, model performance optimizations, and inference techniques, such as delivering high-performance models using distributed inference libraries. You should be capable of managing demanding, fast-changing priorities. You should have a strong technical ability to understand and deliver as part of a vertically integrated system stack consisting of the PyTorch inference library, Neuron compiler, runtime, and collectives.

A day in the life

You will work with your senior management and technical leaders to define the model enablement and performance optimization work for the latest SOTA LLMs, and to build and deliver those models to customers.

Meanwhile, you will lead the team in continuing to improve the model onboarding experience and in enhancing inference usability and quality for Neuron-supported models.

You will manage changing priorities as new models and new technologies emerge, and you will adapt your team's work accordingly. You will dive deep to help your team solve technical challenges.

About the team

About AWS

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating - that's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Diverse Experiences

AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying.

Work/Life Balance

We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there's nothing we can't achieve in the cloud.

Mentorship & Career Growth

We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

Basic Qualifications

3+ years of engineering team management experience
7+ years of experience working directly within engineering teams
3+ years of experience designing or architecting (design patterns, reliability, and scaling) new and existing systems
Experience partnering with product or program management teams

Preferred Qualifications

Experience in communicating with users, other technical teams, and senior leadership to collect requirements, describe software product features, technical designs, and product strategy
Experience in recruiting, hiring, mentoring/coaching, and managing teams of Software Engineers to improve their skills and make them more effective product software engineers

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company's reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at .

USA, CA, Cupertino - 212,700.00 - 287,700.00 USD annually

Vacancy posted 3 days ago
