Senior SDE: Low-Latency ML/HPC Networking Architect

Itlearn360

Senior Software Development Engineer, Annapurna Labs, Elastic Collectives job at Annapurna Labs (U.S.) Inc., Cupertino, CA.

DESCRIPTION

Annapurna Labs, a crucial part of AWS, is responsible for developing hardware and software components for EC2 infrastructure. Our team focuses on building networking solutions for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS. We are seeking an experienced engineer with low-level latency networking or interconnect expertise to optimize customer experience by designing systems that enable scaling network-intensive workloads over thousands of CPUs, GPUs, and TPUs. This role is at the forefront of AI/ML: we spend a good deal of the day optimizing the networking for the latest AI workloads, such as LLMs.

Our ideal candidate will have extensive experience in low-latency networking and collective operations, such as HPC network fabric or machine learning accelerator cluster systems. Also applicable is experience in high-frequency trading networking, high-speed wireless networking, or low-latency interconnects such as PCIe or CXL. Proficiency in C/C++ and a deep understanding of Linux and kernel-level programming are essential. Strong problem-solving skills and the ability to troubleshoot complex networking issues are required, along with excellent communication skills to work effectively in a collaborative team environment.

A day in the life

Working at Annapurna Labs means engaging with a diverse and inclusive team culture that embraces differences and fosters a sense of belonging. You will participate in innovative learning experiences and benefit offerings, such as the CORE and AmazeCon conferences. Your day will involve designing and optimizing networking solutions, collaborating with cross-functional teams, and engaging with customers to gather feedback and continuously improve our offerings.

About the team

Work/Life Balance: Our team places a high value on work-life balance, believing in establishing a flow that energizes both personal and professional life. We offer flexible working hours and encourage you to find a balance that suits you, ensuring long-term happiness and fulfillment. It’s not about the number of hours spent at work or home but about creating a harmonious balance that enhances both aspects of your life.

Mentorship & Career Growth: We are dedicated to supporting new team members with a mix of experience levels and tenures, fostering an environment of knowledge sharing and mentorship. Our commitment to your career growth includes assigning projects that help you develop into a well-rounded professional capable of taking on more complex tasks in the future.

Join us at Annapurna Labs and be part of a team that is shaping the future of networking solutions for ML and HPC workloads on AWS!

BASIC QUALIFICATIONS

5+ years of non-internship professional software development experience
5+ years of programming experience with at least one software programming language
5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
Experience as a mentor, tech lead or leading an engineering team

Vacancy posted 4 days ago

Similar jobs that could be interesting for you
Based on the Senior SDE: Low-Latency ML/HPC Networking Architect in Cupertino, CA vacancy

Senior Network Architect
$208k - $333.5k
...world! The NVIDIA Enterprise Network Architecture team is seeking... ...based compute, storage, and GPU/HPC clusters. Design high‑... ...compute workloads and GPU‑dense AI/ML training and inference... ...dark‑fiber systems to provide low‑latency, loss‑minimal connectivity between...
Senior
NVIDIA
Santa Clara, CA
2 days ago
Network Architect
...effortlessly run large-scale ML applications, without the hassle... ...About The Role As a Network Architect on the Cluster Architecture Team... ...fabrics for AI/ML and HPC systems. Identify and resolve... ...ensuring high resource utilization, low latency, and high-throughput...
Suggested
CEREBRAS SYSTEMS INC.
Sunnyvale, CA
1 day ago
Senior Software Engineer - HPC
$152k - $241.5k
...We are looking for a Senior Software Engineer to join... ...continue improving our HPC infrastructure. Our team... ...push the limits of scale, latency, and reliability. Continuously... ..., resilient, and low‑latency services.... ...clusters, large-scale AI/ML platforms, or systems managed...
Senior
NVIDIA
Santa Clara, CA
17 hours ago
Senior AI Networking Solutions Architect - Hyperscale Infra
NVIDIA Gruppe in Santa Clara is looking for an experienced Networking Solutions Architect to support advanced computing networking solutions for AI/ML and HPC. You will work with leading tech companies to develop solutions that leverage NVIDIA’s cutting-edge technologies...
Senior
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Networking Solutions Architect for AI & Hyperscale HPC
$152k - $241.5k
NVIDIA Gruppe is seeking an experienced Solutions Architect in Santa Clara to support accelerated computing networking solutions for AI/ML and HPC. You will develop and demonstrate solutions with major tech companies while addressing customer needs and performance issues...
Suggested
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior Specialist Field Engineer - HPC/AI/ML
$165k - $220k
...About the role: As a Senior Specialist Field... ...offerings, focusing on AI/ML workloads within high-performance compute (HPC) environments Collaborate... ...experience as a Solutions Architect, Field Engineer,... ...facing publications/talks on latency, optimization, or advanced...
Senior
Permanent employment
Temporary work
Casual work
Work at office
Flexible hours
CoreWeave
Sunnyvale, CA
13 days ago
Network Cluster Architect - Data center Infrastructure
...Network Cluster Architect - Data Center Infrastructure Work Locations (2) Submit... ...Apple's services and AI/ML workloads. We are seeking experienced... ...factors like bandwidth, latency, scalability, and cost.... ...strategic recommendations to senior leadership. Develop functional...
Apple
Cupertino, CA
2 days ago
Senior Solutions Architect, Spectrum-X Low Level
$184k - $287.5k
...NVIDIA networking designs and manufactures high-performance... ...) we make powerful ML/AI platforms possible.... ...networking Sr. Solutions Architect at NVIDIA you will have... ...platforms to run AI and HPC workloads Address sophisticated... ...languages (from low-level C programming...
Senior
Remote work
NVIDIA
Santa Clara, CA
17 hours ago
Senior AI & HPC Networking Solutions Architect
$184k - $356.5k
NVIDIA Gruppe, based in Santa Clara, is seeking a highly experienced Sr. Solutions Architect specializing in embedded software engineering to support innovative networking technologies. This role offers significant agency and the opportunity to work closely with both customers...
Senior
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior Networking Solutions Architect - AI/HPC (Equity)
$224k - $431.25k
NVIDIA Gruppe in Santa Clara is seeking experienced networking engineers to join the Solutions Architecture team. This role focuses on integrating cutting-edge NVIDIA networking products and requires both technical proficiency and strong customer-facing skills. The ideal...
Senior
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Sr. Staff Software Engineer - HPC Network Engineering
$181k - $297k
...largest professional network, built to create... ...We are seeking an HPC Network Engineer... ...high-performance, low-latency Ethernet fabrics for... ...supporting AI/ML training, inference... ...traffic. As a Senior Staff Software Engineer... ...(tech lead, architect, principal/IC leadership...
Senior
For contractors
Work at office
Flexible hours
LinkedIn
Mountain View, CA
1 day ago
Sr. Software Engineer, Low Latency Computing (Starlink)
$160k - $225k
...life on Mars. SR. SOFTWARE ENGINEER, LOW LATENCY COMPUTING (STARLINK) At SpaceX we’re leveraging... ..., high-bandwidth satellite-based global network. Participate in and lead architecture... ...Pay range: Software Engineer /Senior: $160,000.00 - $225,000.00/per year...
Senior
Permanent employment
Temporary work
Worldwide
Weekend work
SpaceX
Sunnyvale, CA
2 days ago
Senior Mixed-Signal IC Designer — Low-Power HPC
A leading technology firm in San Jose is seeking a Senior Mixed-Signal Analog IC Design Engineer to develop cutting-edge low-power solutions for high-performance applications. The ideal candidate will possess a deep understanding of mixed-signal design, a Ph.D. or Master...
Senior
Bitdeer Technologies Group
San Jose, CA
2 days ago
Senior Staff Software Engineer - WiFi
$118.2k - $185k
...Principal Engineer- WiFi RUCKUS Networks specializes in delivering high-performance networking... ...Intelligence (AI) and Machine Learning (ML) to enhance network performance and... ...standards, focusing on high performance, low latency, and guaranteed service delivery. Collaborate...
Senior
Local area
Vistance Networks
Sunnyvale, CA
1 day ago
Senior Wireless Network Architect (SD-WAN & Security)
TechDigital Group is hiring for a role focused on overseeing network configuration and maintenance, ensuring connectivity across the organization. The position requires managing firewalls, leading architecture design initiatives, and troubleshooting network issues to maintain...
Senior
TechDigital Group
Sunnyvale, CA
4 days ago
Senior Networking Software Architect
$184k - $287.5k
...looking for a brilliant Software & Systems Architect to join the NIC Software/Firmware... ...diverse use cases. Conduct research in network protocols, new network technologies, networking... ...and training), storage, cyber security, HPC, and emulation offloads. What We Need...
Senior
NVIDIA
Santa Clara, CA
4 days ago
Network Architect
...# Network Architect Location - Santa Clara. This is a hands-on architecture position... ...scale DCs inter-connects and fabric for HPC, AI, and GPU computing clusters.... ..., and dark fiber deployments to ensure low latency and high reliability. Partner with...
Tranzeal
Santa Clara, CA
4 days ago
Infrastructure Architect - AI & Data Center
...Title: Infrastructure Architect (AI & Data Center) Location: San... ...GPUs, Virtual Machines, and networking. Develop a comprehensive... ...objectives while maintaining low-latency performance. 3. Special Projects... ...infrastructure. ~ AI/ML Ops: Proven expertise in Kubernetes...
Trilyon, Inc.
San Jose, CA
1 day ago
Senior WAN Network Engineer
...learning users to effortlessly run large-scale ML applications, without the hassle of... ...We are seeking a highly skilled WAN Network Engineer to design, implement, manage, and... ...network across leased lines and dark fiber for low latency and 99.999% availability....
Senior
CEREBRAS SYSTEMS INC.
Sunnyvale, CA
2 days ago
Senior AI and ML HPC Cluster Engineer
As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design... ...we encounter including: compute, networking, and storage design for large scale, high-... ...automation solutions. Build and maintain AI and ML heterogeneous clusters on-premises and in...
Senior
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior Math Libraries Engineer - AI and HPC
$184k - $287.5k
...Libraries team is looking for a senior engineer to join our... ...kernel generation for AI and HPC, specifically targeting matrix... ...maintenance overhead through re-architecting. To be successful in your responsibilities... .... Experience with low level programming using assembly...
Senior
Remote work
NVIDIA
Santa Clara, CA
17 hours ago
Senior AI and HPC Observability Engineer
$152k - $241.5k
...enable researchers and engineers to develop the next generation of AI/ML systems. By joining us, you’ll help design solutions that power... ...heart of this transformation. We are looking for a strong AI & HPC Observability Engineer to build and scale next-generation...
Senior
NVIDIA
Santa Clara, CA
2 days ago
Senior Staff Software Engineer - High Performance GPU Inference Systems
$248.71k - $292.6k
...Engineering : Design and implement scalable, low-latency runtime systems that coordinate thousands... ...: Work closely with teams across ML compilers, orchestration, cloud infrastructure... ...services). Deploying and optimizing ML/HPC workloads on GPU clusters (Kubernetes, Slurm...
Senior
I did my part and supported the Regular Toilet
Palo Alto, CA
4 days ago
Senior Software Engineer
...infrastructure foundation for our advanced AI/ML research and product development. You'll architect, build, and run our platforms... ...tuning. Design and build low latency, scalable, and reliable... ...Experience with high‑performance compute (HPC) schedulers, capacity planning,...
Senior
Tensec
Palo Alto, CA
4 days ago
Senior Platform Telemetry Engineer
$152k - $241.5k
...required for strong scaling for HPC and generative AI workload.... ...product management and other architects to narrow down on requirements... ...architecture, optimize firmware for low latency APIs. Strong knowledge of... ...Confidential Compute. Experience with ML and multi-variable...
Senior
Remote work
NVIDIA
Santa Clara, CA
2 days ago
HPC Network Architect - GPU RDMA / InfiniBand
A leading social media platform in San Jose is seeking a HPC Network Engineer. You will design and operate high-performance computing networks and collaborate with cross-functional teams to innovate network solutions. The ideal candidate has a Bachelor's degree in Computer...
TikTok
San Jose, CA
2 days ago
Senior Software Engineer - Apple Ads
$181.1k - $318.4k
...Senior Software Engineer - Apple Ads At Apple, we focus deeply on our customers' experience... ...owns a collection of highly available, low latency services central to serving high quality... ...: Work closely with product managers, architects, and other engineers to deliver high-...
Senior
Relocation
Apple
Cupertino, CA
3 days ago
Senior ML Accelerator Engineer - GPU
$128.7k - $261.3k
...libraries that sit at the heart of our on-vehicle ML inference for ADAS and autonomous driving... ...the car while consistently meeting strict latency, throughput, and reliability targets. If... ...-based Experience with low latency or real time systems Experience with lower...
Senior
Local area
Work from home
Relocation package
Flexible hours
General Motors
Sunnyvale, CA
3 days ago
Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference
$193.3k - $261.5k
...Inferentia and Trainium ML accelerators. This... ...technology You will architect and implement business... ...inference performance for both latency and throughput on such... ...'ll bring expertise in low-level optimization,... ...sharing and mentorship. Our senior members enjoy one-on-...
Senior
Work experience placement
Internship
Local area
Flexible hours
Amazon
Cupertino, CA
17 hours ago
Senior ML Engineer - Real-Time Ad Inventory Forecasting
Netflix, Inc. is seeking an experienced individual to build end-to-end ML model deployment infrastructure for low-latency real-time advertising systems. Responsibilities include handling large volumes of data and productionizing models for campaign effectiveness. Salary...
Senior
Netflix, Inc.
Los Gatos, CA
4 days ago
