Principal Firmware Engineer - Server Manageability and Observability

$272k - $431.25k

NVIDIA Gruppe

NVIDIA data center systems, such as DGX and HGX, have become core to NVIDIA’s rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We’re looking for a strong technical architect to own the end‑to‑end architecture of these products, at the system software level, covering firmware, kernel drivers, operating systems, and user mode drivers. You will work with component leads internally and engage with industry‑leading cloud service providers on taking these products to market.

What you’ll be doing
  • Serve as the primary technical point of contact for major customers, leading technological discussions, defining KPIs, gathering requirements, and addressing complex technical queries.
  • As a system software architect, lead technical innovation and strategic collaborations with major hyperscalers to architect next‑generation data center products.
  • Align NVIDIA’s roadmap with major customers’ requirements through direct engagement.
  • Develop and drive adoption of new technologies and protocols.
  • Make critical technical decisions in ambiguous situations, mitigating risks through left‑shift strategies.

What we need to see
  • Deep expertise in scalable and performant server system architecture, focusing on SW/HW interfaces.
  • Extensive experience with complex system software for accelerators (GPUs, DPUs, FPGAs).
  • Mastery of system firmware (SBIOS, OpenBMC), embedded systems, and Linux kernel internals.
  • Proficiency in Out‑of‑Band and In‑Band management architectures, device management protocols (MCTP, PLDM, SPDM, RDE) and system management protocols (Redfish, IPMI).
  • Extensive knowledge of networking technologies and protocols, including TCP/IP, Ethernet, InfiniBand, as well as advanced switching and routing concepts.
  • Experience collaborating with platform security experts to define tradeoffs between security and ease of use.
  • Demonstrated success in leading complex, cross‑functional projects to completion, showcasing the ability to influence and achieve results without direct authority in large‑scale, collaborative environments.
  • Demonstrable experience in implementing left‑shift strategy to de‑risk program execution.
  • BS or MS degree in Computer Science, Electrical Engineering or related field (or equivalent experience).
  • 15+ years in the area of system architecture and design.

Ways to stand out from the crowd
  • Knowledge of cloud and cluster level deployment and management systems.
  • Participation and contributions in standards bodies such as OCP and DMTF.
  • Familiarity with NVIDIA HPC programming models and libraries (CUDA, cuDNN, DOCA).
  • Knowledge of enterprise storage architectures and distributed parallel processing paradigms.

Benefits
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD to 431,250 USD. You will also be eligible for equity and benefits.

Application Information
Applications for this job will be accepted at least until May 20, 2026.

Equal Opportunity Employment
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Principal Firmware Engineer - Server Manageability and Observability vacancy in Santa Clara, CA
  • $272k - $431.25k

     ...system software level. Including firmware, kernel drivers, operating...  ...in scalable and performant server system architecture,...  ...in Out-of-Band and In-Band management architectures, device management...  ...Computer Science, Electrical Engineering or related field (or equivalent... 
    Suggested
    Shift work

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $272k - $431.25k

     .... We are looking for expert engineers to come and help design rack...  ...architect to own end to end manageability architecture for these products...  ...you’ll be doing: Drive server management for large...  ...implemented in right way with each firmware and software module Collaborate... 
    Suggested
    Work at office

    NVIDIA

    Santa Clara, CA
    17 hours ago
  • $211.8k - $317.8k

     ...Inc. Job Area: Engineering Group, Engineering Group...  ...production-ready ARM server platforms. By joining...  ...silicon enablement, firmware, OS integrations,...  ...Performance, and Limits Management Software & Firmware...  ...including fleet management, observability, and policy-based... 
    Suggested
    Work experience placement
    Work from home

    Qualcomm

    Santa Clara, CA
    2 days ago
  •  ...design specifications and develop firmware applications for low-power,...  ...for embedded systems. Manage and maintain source code repositories...  ...teams in Digital and beyond. Engineering for device systems spans deep...  ...to products. The Principal Firmware Engineer provides thought... 
    Suggested
    Temporary work
    Local area
    Immediate start
    Remote work

    Brambles

    Santa Clara, CA
    3 days ago
  •  ...Senior / Principal Firmware Engineer Location: Santa Clara, CA Duration: Full-time/Perm Responsible...  ...complex SoC/silicon products for Server, Storage, and/or Networking applications...  ...kits (SDKs) to execute on system management controllers (e.g. BMC). Experience... 
    Suggested
    Permanent employment
    Full time

    InterSources

    Santa Clara, CA
    18 days ago
  • $211.8k - $317.8k

     ...is seeking an experienced ARM Server Power, Performance, and Limits Management Software & Firmware Architect to define the end‑...  ...interfaces suitable for fleet‑level observability and automation. Collaborate...  ...Bachelor’s degree in Engineering, Information Systems, Computer... 

    Qualcomm

    Santa Clara, CA
    4 days ago
  • $211.8k - $317.8k

     ...Qualcomm Technologies, Inc. Job Area: Engineering Group, Engineering Group Software...  ...Summary: As a CPU Performance Management FW Developer, you are responsible for working...  ...solution, and implement embedded firmware, to manage performance of the CPU subsystem... 
    Work experience placement
    Remote work
    Work from home
    Relocation

    Qualcomm

    Santa Clara, CA
    2 days ago
  • $211.8k - $317.8k

     ...technology innovator. This role is a Qualcomm Software Engineer (CPU Performance Management FW Developer) based in Santa Clara, CA or Austin, TX, with...  ...and interface information. Responsibilities Drive the firmware design, implementation and verification in pre‑silicon and... 
    Work experience placement
    Remote work
    Relocation

    Qualcomm

    Santa Clara, CA
    4 days ago
  •  ..., training, and enterprise deployment. As the Business Manager for AI Workstations & Servers, you will own the commercial success, channel strategy,...  ...role partners cross-functionally with product management, engineering, manufacturing, global sales, and channel partners to... 

    Acer

    Santa Clara, CA
    17 hours ago
  •  ...Software/Firmware Engineering Program Manager (EPM) The Software/Firmware EPM leads all engineering activities required for development testing and production release of software and firmware used with Client Labs' products. This is a high-impact position that is directly... 
    Work at office

    InterSources

    Santa Clara, CA
    1 day ago
  • $250k - $312k

     ...Manager, Detection & Response Code Red Partners is partnering with one of the most...  ...technical team responsible for detection engineering, incident response, and security...  ...tooling • Own the detection lifecycle from observability and log ingestion through detection-as-... 
    Shift work

    Code Red Partners

    Sunnyvale, CA
    2 hours ago
  • $272k - $431.25k

     ...Software Engineer We are seeking software engineers to work on next-generation high-speed...  ...a GPU or high-performance computing server will encounter in its lifecycle, by collaborating...  ...and debugging skills Ability to self-manage, show leadership, and have good interpersonal... 

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $224k - $356.5k

     ...improving efficiency, and scaling. As Technical Lead Manager, you will lead the engineering team within NVIDIA’s Dynamo organization. Your responsibility...  ...including operators, Helm charts, and GPU observability tooling (DCGM, dcgm-exporter, PyNVML). Background in... 
    Local area
    Worldwide

    NVIDIA

    Santa Clara, CA
    17 hours ago
  • $160k - $250k

     ...starts with you. About the Role: This is a Technical Engineering Manager role (50% Management / 50% Technical) responsible for owning...  ...- a lightweight sensor installed on client machines that observes system activity and recognizes malicious behavior, paired with... 
    Work experience placement
    Work at office
    Local area
    2 days per week
    3 days per week

    CrowdStrike Holdings, Inc.

    Sunnyvale, CA
    2 days ago
  •  ...Engineering Manager, Inference ML Runtime Sunnyvale CA or Toronto Canada Cerebras Systems builds the world's largest AI chip, 56 times...  ...optimization (latency, throughput, memory efficiency); observability and reliability across the inference stack. Ensure high-... 

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    17 hours ago
  • $206k - $303k

     ...Principal Engineer - Observability CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups... 
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    3 days ago
  •  ...and insights to be collected - and the Firmware team is at the heart of this transformation...  .... In this role, you will lead and manage the definition of architecture and implementation...  ...multiple teams in Digital and beyond. Engineering for Brambles device systems spans deep... 
    For contractors
    Local area
    Remote work

    Brambles

    Santa Clara, CA
    3 days ago
  • $168k - $231k

     ...flexibility to do it in their own way. The Role: As a Principal Firmware Engineer, you will play a critical role in designing, developing,...  ...systems. Project Leadership: Lead firmware projects, managing timelines, resources, and collaboration with hardware and... 
    Immediate start
    Remote work
    Work from home
    Flexible hours

    Logitech

    San Jose, CA
    5 days ago
  • $165k - $267.5k

     ...Job Summary We are seeking a highly motivated Software Engineering Manager to lead and grow development teams working on Cortex, Palo...  ...Preferred Qualifications ~ Previous experience in cybersecurity, observability, or large-scale multi-tenant platforms is a plus.... 
    Remote work

    Palo Alto Networks

    Santa Clara, CA
    2 days ago
  • $200k - $287.5k

     ...redefine the future of how work gets done. Observe by Snowflake is an AI-powered...  ...built on the Snowflake AI Data Cloud and engineered for scale. We ingest and store logs, metrics...  ...Software Engineer for the Observe Data Management team. This team owns the core pipelines... 
    Flexible hours

    Snowflake Computing

    Menlo Park, CA
    4 days ago
  • $230k - $375k

     ...predictably as the program accelerates. As Senior Manager of AV Cloud Capacity & Performance Engineering, you own the team and function responsible for...  ...findings as engineering-grade recommendations, not observational reports. Own strategic vendor and cloud provider... 
    Work experience placement
    Work at office
    Local area
    Work from home
    Flexible hours

    General Motors

    Sunnyvale, CA
    2 days ago
  • $272k

    NVIDIA Gruppe is seeking an expert in server firmware development. This role requires over 15 years of experience, focusing on managing data center health and optimizing firmware...  ...has a strong educational background in engineering and expertise in C/C++, Python, and data... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $147k - $237.5k

     ...stronger relationships, and the kind of precision that drives great outcomes. Job Summary We are looking for a Principal Vulnerability Management Engineer to join the Cortex DevSecOps group and bolster our vulnerability management practices. This role focuses on securing... 
    Full time
    Work at office
    Visa sponsorship
    Work visa

    Palo Alto Networks, Inc.

    Santa Clara, CA
    4 days ago
  •  ...and more connected. We are looking for a technically deep Engineering Manager to lead the AI team at Coram. This team is small, highly capable...  ...Establish strong engineering standards around reliability, observability, and model evaluation What We’re Looking For Several... 
    Shift work

    Coram AI

    Sunnyvale, CA
    17 hours ago
  • $200k - $322k

     ...profound global impact. NVIDIA is seeking a Senior Manager of Site Reliability Engineering to lead and reshape how IT operations function at scale...  ...into an intelligent, automated operating model using observability, AI insights, and orchestration. This leader will apply... 

    NVIDIA

    Santa Clara, CA
    17 hours ago
  • $184k - $287.5k

     ...NVIDIA is seeking a Senior Firmware Engineer to join our CSP Engagements team, focusing on system software for Datacenter...  ...: Design and develop firmware solutions for manageability and observability of data center servers. Actively participate in hardware bring-up... 

    NVIDIA

    Santa Clara, CA
    17 hours ago
  • $190.28k - $285k

     ...Firmware Engineering For Photonic Fabric Products Marvell's semiconductor solutions are the essential building blocks of the data infrastructure...  ...guide their adoption into the product roadmap People management Directly manage a team of 6–9 firmware engineers... 
    Permanent employment
    Internship
    Work from home

    Marvell

    Santa Clara, CA
    3 days ago
  • $206.9k - $279.9k

     ...Description We are seeking a Principal PMT for the Publisher Decision Engine - Ad Server team to own and improve Programmatic Guaranteed deal delivery, which...  ...About the team You'll join a group of product managers, applied scientists, and engineers Amazon's advertising... 
    Local area
    Flexible hours
    Day shift

    Amazon

    Sunnyvale, CA
    2 days ago
  • Tesla, located in Palo Alto, is seeking a Software Engineer for the Battery Management System Team. In this role, you will develop high-quality software, focusing on firmware drivers and real-time software algorithms that enhance vehicle performance and reliability. The... 

    Tesla

    Palo Alto, CA
    4 days ago
  •  ...applications, without the hassle of managing hundreds of GPUs or TPUs....  ...Technical Program Manager (Server and Network Systems) on the...  ...Science, Electrical/Computer Engineering, or equivalent experience....  ...fleet management (provisioning, firmware/BIOS, drivers, field triage).... 

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    2 days ago