Senior SDE: Low-Latency ML/HPC Networking Architect

Itlearn360

Senior Software Development Engineer, Annapurna Labs, Elastic Collectives job at Annapurna Labs (U.S.) Inc., Cupertino, CA.

DESCRIPTION

Annapurna Labs, a crucial part of AWS, is responsible for developing hardware and software components for EC2 infrastructure. Our team focuses on building networking solutions for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS. We are seeking an experienced engineer with low-latency networking or interconnect expertise to optimize customer experience by designing systems that enable scaling network-intensive workloads over thousands of CPUs, GPUs, and TPUs. This role is at the forefront of AI/ML: we spend a good deal of the day optimizing the networking for the latest AI workloads, such as LLMs.

Our ideal candidate will have extensive experience in low-latency networking and collective operations, such as HPC network fabric or machine learning accelerator cluster systems. Also applicable is experience in high-frequency trading networking, high-speed wireless networking, or low-latency interconnects such as PCIe or CXL. Proficiency in C/C++ and a deep understanding of Linux and kernel-level programming are essential. Strong problem-solving skills and the ability to troubleshoot complex networking issues are required, along with excellent communication skills to work effectively in a collaborative team environment.

A day in the life

Working at Annapurna Labs means engaging with a diverse and inclusive team culture that embraces differences and fosters a sense of belonging. You will participate in innovative learning experiences and benefit offerings, such as the CORE and AmazeCon conferences. Your day will involve designing and optimizing networking solutions, collaborating with cross-functional teams, and engaging with customers to gather feedback and continuously improve our offerings.
About the team

Work/Life Balance: Our team places a high value on work-life balance, believing in establishing a flow that energizes both personal and professional life. We offer flexible working hours and encourage you to find a balance that suits you, ensuring long-term happiness and fulfillment. It’s not about the number of hours spent at work or home but about creating a harmonious balance that enhances both aspects of your life.

Mentorship & Career Growth: We are dedicated to supporting new team members. The team has a mix of experience levels and tenures, fostering an environment of knowledge sharing and mentorship. Our commitment to your career growth includes assigning projects that help you develop into a well-rounded professional capable of taking on more complex tasks in the future.

Join us at Annapurna Labs and be part of a team that is shaping the future of networking solutions for ML and HPC workloads on AWS!

BASIC QUALIFICATIONS

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming with at least one software programming language experience
  • 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience
  • 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
  • Experience as a mentor, tech lead or leading an engineering team

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Senior SDE: Low-Latency ML/HPC Networking Architect in Cupertino, CA vacancy
  • $208k - $333.5k

     ...world! The NVIDIA Enterprise Network Architecture team is seeking...  ...based compute, storage, and GPU/HPC clusters. Design high‑...  ...compute workloads and GPU‑dense AI/ML training and inference...  ...dark‑fiber systems to provide low‑latency, loss‑minimal connectivity between... 
    Senior

    NVIDIA

    Santa Clara, CA
    2 days ago
  •  ...effortlessly run large-scale ML applications, without the hassle...  ...About The Role As a Network Architect on the Cluster Architecture Team...  ...fabrics for AI/ML and HPC systems. Identify and resolve...  ...ensuring high resource utilization, low latency, and high-throughput... 
    Suggested

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    1 day ago
  • $152k - $241.5k

     ...We are looking for a Senior Software Engineer to join...  ...continue improving our HPC infrastructure. Our team...  ...push the limits of scale, latency, and reliability. Continuously...  ..., resilient, and low‑latency services....  ...clusters, large-scale AI/ML platforms, or systems managed... 
    Senior

    NVIDIA

    Santa Clara, CA
    17 hours ago
  • NVIDIA Gruppe in Santa Clara is looking for an experienced Networking Solutions Architect to support advanced computing networking solutions for AI/ML and HPC. You will work with leading tech companies to develop solutions that leverage NVIDIA’s cutting-edge technologies... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

    NVIDIA Gruppe is seeking an experienced Solutions Architect in Santa Clara to support accelerated computing networking solutions for AI/ML and HPC. You will develop and demonstrate solutions with major tech companies while addressing customer needs and performance issues... 
    Suggested

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $165k - $220k

     ...About the role: As a Senior Specialist Field...  ...offerings, focusing on AI/ML workloads within high-performance compute (HPC) environments Collaborate...  ...experience as a Solutions Architect, Field Engineer,...  ...facing publications/talks on latency, optimization, or advanced... 
    Senior
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    13 days ago
  •  ...Network Cluster Architect - Data Center Infrastructure Work Locations (2) Submit...  ...Apple's services and AI/ML workloads. We are seeking experienced...  ...factors like bandwidth, latency, scalability, and cost....  ...strategic recommendations to senior leadership. Develop functional... 

    Apple

    Cupertino, CA
    2 days ago
  • $184k - $287.5k

     ...NVIDIA networking designs and manufactures high-performance...  ...) we make powerful ML/AI platforms possible....  ...networking Sr. Solutions Architect at NVIDIA you will have...  ...platforms to run AI and HPC workloads Address sophisticated...  ...languages (from low-level C programming... 
    Senior
    Remote work

    NVIDIA

    Santa Clara, CA
    17 hours ago
  • $184k - $356.5k

    NVIDIA Gruppe, based in Santa Clara, is seeking a highly experienced Sr. Solutions Architect specializing in embedded software engineering to support innovative networking technologies. This role offers significant agency and the opportunity to work closely with both customers... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $224k - $431.25k

    NVIDIA Gruppe in Santa Clara is seeking experienced networking engineers to join the Solutions Architecture team. This role focuses on integrating cutting-edge NVIDIA networking products and requires both technical proficiency and strong customer-facing skills. The ideal... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $181k - $297k

     ...largest professional network, built to create...  ...We are seeking an HPC Network Engineer...  ...high-performance, low-latency Ethernet fabrics for...  ...supporting AI/ML training, inference...  ...traffic. As a Senior Staff Software Engineer...  ...(tech lead, architect, principal/IC leadership... 
    Senior
    For contractors
    Work at office
    Flexible hours

    LinkedIn

    Mountain View, CA
    1 day ago
  • $160k - $225k

     ...life on Mars. SR. SOFTWARE ENGINEER, LOW LATENCY COMPUTING (STARLINK) At SpaceX we’re leveraging...  ..., high-bandwidth satellite-based global network. Participate in and lead architecture...  ...Pay range: Software Engineer /Senior: $160,000.00 - $225,000.00/per year... 
    Senior
    Permanent employment
    Temporary work
    Worldwide
    Weekend work

    SpaceX

    Sunnyvale, CA
    2 days ago
  • A leading technology firm in San Jose is seeking a Senior Mixed-Signal Analog IC Design Engineer to develop cutting-edge low-power solutions for high-performance applications. The ideal candidate will possess a deep understanding of mixed-signal design, a Ph.D. or Master... 
    Senior

    Bitdeer Technologies Group

    San Jose, CA
    2 days ago
  • $118.2k - $185k

     ...Principal Engineer- WiFi RUCKUS Networks specializes in delivering high-performance networking...  ...Intelligence (AI) and Machine Learning (ML) to enhance network performance and...  ...standards, focusing on high performance, low latency, and guaranteed service delivery. Collaborate... 
    Senior
    Local area

    Vistance Networks

    Sunnyvale, CA
    1 day ago
  • TechDigital Group is hiring for a role focused on overseeing network configuration and maintenance, ensuring connectivity across the organization. The position requires managing firewalls, leading architecture design initiatives, and troubleshooting network issues to maintain... 
    Senior

    TechDigital Group

    Sunnyvale, CA
    4 days ago
  • $184k - $287.5k

     ...looking for a brilliant Software & Systems Architect to join the NIC Software/Firmware...  ...diverse use cases. Conduct research in network protocols, new network technologies, networking...  ...and training), storage, cyber security, HPC, and emulation offloads. What We Need... 
    Senior

    NVIDIA

    Santa Clara, CA
    4 days ago
  •  ...# Network Architect Location - Santa Clara. This is a hands-on architecture position...  ...scale DCs inter-connects and fabric for HPC, AI, and GPU computing clusters....  ..., and dark fiber deployments to ensure low latency and high reliability. Partner with... 

    Tranzeal

    Santa Clara, CA
    4 days ago
  •  ...Title: Infrastructure Architect (AI & Data Center) Location: San...  ...GPUs, Virtual Machines, and networking. Develop a comprehensive...  ...objectives while maintaining low-latency performance. 3. Special Projects...  ...infrastructure. ~ AI/ML Ops: Proven expertise in Kubernetes... 

    Trilyon, Inc.

    San Jose, CA
    1 day ago
  •  ...learning users to effortlessly run large-scale ML applications, without the hassle of...  ...We are seeking a highly skilled WAN Network Engineer to design, implement, manage, and...  ...network across leased lines and dark fiber for low latency and 99.999% availability.... 
    Senior

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    2 days ago
  • As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design...  ...we encounter including: compute, networking, and storage design for large scale, high-...  ...automation solutions. Build and maintain AI and ML heterogeneous clusters on-premises and in... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...Libraries team is looking for a senior engineer to join our...  ...kernel generation for AI and HPC, specifically targeting matrix...  ...maintenance overhead through re-architecting. To be successful in your responsibilities...  .... Experience with low level programming using assembly... 
    Senior
    Remote work

    NVIDIA

    Santa Clara, CA
    17 hours ago
  • $152k - $241.5k

     ...enable researchers and engineers to develop the next generation of AI/ML systems. By joining us, you’ll help design solutions that power...  ...heart of this transformation. We are looking for a strong AI & HPC Observability Engineer to build and scale next-generation... 
    Senior

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $248.71k - $292.6k

     ...Engineering : Design and implement scalable, low-latency runtime systems that coordinate thousands...  ...: Work closely with teams across ML compilers, orchestration, cloud infrastructure...  ...services). Deploying and optimizing ML/HPC workloads on GPU clusters (Kubernetes, Slurm... 
    Senior

    I did my part and supported the Regular Toilet

    Palo Alto, CA
    4 days ago
  •  ...infrastructure foundation for our advanced AI/ML research and product development. You'll architect, build, and run our platforms...  ...tuning. Design and build low latency, scalable, and reliable...  ...Experience with high‑performance compute (HPC) schedulers, capacity planning,... 
    Senior

    Tensec

    Palo Alto, CA
    4 days ago
  • $152k - $241.5k

     ...required for strong scaling for HPC and generative AI workload....  ...product management and other architects to narrow down on requirements...  ...architecture, optimize firmware for low latency APIs. Strong knowledge of...  ...Confidential Compute. Experience with ML and multi-variable... 
    Senior
    Remote work

    NVIDIA

    Santa Clara, CA
    2 days ago
  • A leading social media platform in San Jose is seeking a HPC Network Engineer. You will design and operate high-performance computing networks and collaborate with cross-functional teams to innovate network solutions. The ideal candidate has a Bachelor's degree in Computer... 

    TikTok

    San Jose, CA
    2 days ago
  • $181.1k - $318.4k

     ...Senior Software Engineer - Apple Ads At Apple, we focus deeply on our customers' experience...  ...owns a collection of highly available, low latency services central to serving high quality...  ...: Work closely with product managers, architects, and other engineers to deliver high-... 
    Senior
    Relocation

    Apple

    Cupertino, CA
    3 days ago
  • $128.7k - $261.3k

     ...libraries that sit at the heart of our on-vehicle ML inference for ADAS and autonomous driving...  ...the car while consistently meeting strict latency, throughput, and reliability targets. If...  ...-based Experience with low latency or real time systems Experience with lower... 
    Senior
    Local area
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    3 days ago
  • $193.3k - $261.5k

     ...Inferentia and Trainium ML accelerators. This...  ...technology You will architect and implement business...  ...inference performance for both latency and throughput on such...  ...'ll bring expertise in low-level optimization,...  ...sharing and mentorship. Our senior members enjoy one-on-... 
    Senior
    Work experience placement
    Internship
    Local area
    Flexible hours

    Amazon

    Cupertino, CA
    17 hours ago
  • Netflix, Inc. is seeking an experienced individual to build end-to-end ML model deployment infrastructure for low-latency real-time advertising systems. Responsibilities include handling large volumes of data and productionizing models for campaign effectiveness. Salary... 
    Senior

    Netflix, Inc.

    Los Gatos, CA
    4 days ago
