Lead Engineer, ML Network Stack - Annapurna Labs
$193.3k - $261.5k
Amazon Locker
Description
We are seeking an experienced engineer and technical leader to join our team that owns the network stack for EC2 distributed AI/ML systems. The team develops support for a variety of frameworks and communication libraries including NCCL, NVSHMEM, NIXL, NCCL GIN, and Perplexity kernels. Solid knowledge of Linux, networking, and performant coding is important. Experience with embedded systems is valued, and experience with high-speed networking or HPC/RDMA interconnects is highly valued.
If you like solving hard problems, want to work with HPC and ML customers, and want to iterate fast and deliver meaningful solutions at scale, then come join us! This truly is a role at the forefront of AI/ML: you'll be working on features for the largest clusters, with the largest customers, for the largest AI models. This is a technical lead role with the expectation of growing into a technical manager role. We are specifically seeking candidates who want to develop their careers as technical managers.
The organization you would be joining is Annapurna Labs, an integral part of AWS that develops hardware and software components that are critical building blocks for EC2 infrastructure. Every instance in EC2 is running some type of hardware designed by Annapurna Labs. We specialize in designing software, systems, and chips that optimize the AWS customer experience.
Key job responsibilities
Be the lead engineer on a team that builds and maintains the infrastructure that monitors and reports on the functionality and performance of massive testing workloads run at scale. Use internal Amazon CI/CD tools, Linux, and public AWS products to automate the delivery of our software to customers, saving developer time. Write Python code that effortlessly spins up large clusters and runs benchmarks and applications for ML and HPC workloads. Use AWS Managed Grafana and Athena to digest the massive amount of performance data generated by these workloads and create dashboards for developers and stakeholders. Invent automatic mechanisms to alert developers to functional and performance regressions so they never reach customers. Manage the complexity of infrastructure that covers many instance types, software stacks, Linux operating systems, and cutting-edge releases, and make it easy to evolve.
About the team
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there's nothing we can't achieve in the cloud.
Mentorship & Career Growth
We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Basic Qualifications
5+ years of non-internship professional software development experience
5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
3+ years as a mentor, tech lead or leading engineering teams
3+ years of experience in SW/HW co-design
Preferred Qualifications
Bachelor's degree in computer science or equivalent
Experience creating automated dashboards and visualization (such as Grafana)
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company's reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at .
USA, CA, Cupertino - 193,300.00 - 261,500.00 USD annually
USA, WA, Seattle - 168,100.00 - 227,400.00 USD annually

