Software Development Manager, LLM Inference Model Enablement, Neuron SDK

$212.7k - $287.7k

Amazon Locker

Description

AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon Elastic Compute Cloud (EC2), to new product innovations that continue to set AWS's services and features apart in the industry.

We develop AWS Neuron, the complete software stack for Trainium, Amazon's custom cloud-scale machine learning accelerators. Come optimize LLMs such as Llama and GPT-OSS to run really fast on Trainium.

As the SDM for the LLM Inference Model Enablement team, you will lead a team of expert AI/ML engineers to onboard and optimize state-of-the-art open-source and customer LLMs, both dense and MoE, for inference with Neuron on Trainium and Inferentia accelerators. You will also drive improvements in model enablement speed and experience, while advancing inference usability and quality through inference features, infrastructure optimization, tools, and automation.

The ideal candidate will have a strong background in LLM architectures, model performance optimization, and inference techniques, such as delivering high-performance models using distributed inference libraries. You should be capable of managing demanding, fast-changing priorities. You should have the technical ability to understand and deliver as part of a vertically integrated system stack consisting of the PyTorch inference library, Neuron compiler, runtime, and collectives.

A day in the life

You will work with your senior management and technical leaders to define the model enablement and performance optimization work for the latest SOTA LLMs, and to build and deliver these models to customers.

You will also lead the team in continuing to improve the model onboarding experience and in enhancing inference usability and quality for Neuron-supported models.

You will manage changing priorities as new models and new technologies emerge, and adapt your team's work accordingly. You will dive deep to help your team solve technical challenges.

About the team

About AWS

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating - that's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Diverse Experiences

AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying.

Work/Life Balance

We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there's nothing we can't achieve in the cloud.

Mentorship & Career Growth

We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

Basic Qualifications

  • 3+ years of engineering team management experience

  • 7+ years of experience working directly within engineering teams

  • 3+ years of experience designing or architecting (design patterns, reliability, and scaling) new and existing systems

  • Experience partnering with product or program management teams

Preferred Qualifications

  • Experience in communicating with users, other technical teams, and senior leadership to collect requirements, describe software product features, technical designs, and product strategy

  • Experience in recruiting, hiring, mentoring/coaching, and managing teams of Software Engineers to improve their skills and make them more effective product software engineers

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company's reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at .

USA, CA, Cupertino - 212,700.00 - 287,700.00 USD annually

Vacancy posted 3 days ago