
System Software Engineer, Distributed Systems

NVIDIA Gruppe

The VLSI Productivity and Infrastructure team supports 1000+ chip design engineers by building tools and platforms that supercharge their everyday work. Our mission is to make chip designers faster. We build and operate long shelf‑life systems spanning build automation, observability, analytics, automated error detection/remediation, and codebase modernization, with a strong commitment to stability. Our core workflow infrastructure runs as userspace software on bare‑metal Linux hosts (no sudo, no containers). We coordinate shared state and artifacts via NFS, launch long‑running, compute‑heavy workflows on IBM LSF, and provide adjacent services for APIs and observability. This is a high‑ownership environment where you'll often be the expert on what you build.

What you will be doing:
  • Design, build, and deliver core components of our next‑generation productivity platforms
  • Develop reliable userspace infrastructure for long‑running engineering workflows at scale on bare‑metal Linux hosts
  • Build state coordination over NFS (atomicity, idempotency/dedup, partial‑write recovery, without privileged ops)
  • Build and improve orchestration around IBM LSF (submission/tracking, retries/cancel, log capture, fairness/backpressure)
  • Convert legacy codebases into modern powerhouses using incremental migration techniques (e.g., Perl to Go), with stage gates, parity strategies, and strong observability
  • Debug and improve performance and reliability across Linux and Kubernetes, including operational tooling
  • Collaborate with engineering users to turn ambiguous workflows into durable production systems

What we need to see:
  • B.S. CS/EE (or equivalent experience)
  • 5+ years developing and operating production software in Go and/or Python, ideally in large codebases
  • Strong Linux fundamentals: processes, filesystems, permissions, synchronization/locks, concurrency, and debugging
  • Solid distributed‑systems thinking: failures, retries/timeouts, backoff, idempotency, and operational rigor
  • Experience building long‑runtime automation or services on shared compute clusters (batch schedulers, build systems)
  • Ability to translate ambitious, high‑level goals into a safe delivery plan (instrumentation, staged rollout, measurable outcomes)

Ways to stand out from the crowd:
  • Hands‑on experience with shared filesystems at scale (NFS), or coordination patterns on eventually‑consistent storage
  • Experience with batch job scheduling, shared compute fleets, or build systems
  • Track record of incremental modernization (tests, shadow runs, canaries, rollback plans)
  • Experience partitioning/optimizing metadata‑heavy systems and reducing I/O or R/W hot spots
  • Strong incident/debug tactics: clear root‑cause analysis, remediation, and guardrails as well as rapid comprehension and ownership of unfamiliar codebases in any language (including LLM‑generated code) to implement high‑leverage changes

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 22 hours ago
Similar jobs that could be interesting for you
Based on the System Software Engineer, Distributed Systems in Santa Clara, CA vacancy
  • $152k - $241.5k

     ...supports 1000+ chip design engineers by building tools and platforms...  ...and operate long shelf-life systems spanning build automation,...  ...infrastructure runs as userspace software on bare-metal Linux hosts (...  ...role with an emphasis on distributed systems and operational... 
    Suggested

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $140k - $240k

     ...Cerebras Systems builds the world's largest AI chip, 56 times larger...  ..., security-first based engineering. Cerebras cluster involves complex...  ...cluster management software stack - all the way from a bare...  ...leadership/management role in distributed systems security. ~ Proven... 
    Suggested

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    2 days ago
  • $184k - $356.5k

    NVIDIA Gruppe is seeking an Engineering Manager to lead a team solving AI's infrastructure problems with systems-level software. You will guide engineers in building distributed AI systems, balancing project delivery with innovative research. The ideal candidate has over... 
    Suggested

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $200k - $400k

     ...Institute Of Foundation Models Engineer The Institute of Foundation Models (IFM) designs...  ...operates ultra-scale GPU supercomputing systems to train next-generation foundation...  ...effort — driving communication performance, distributed reliability, and cross-layer... 
    Suggested
    Visa sponsorship

    Institute of Foundation Models

    Sunnyvale, CA
    2 days ago
  •  ...Senior Distributed Storage System Engineer This role has been designed as 'Onsite' with an expectation that you will primarily work from an HPE...  ...Definition: Designs, develops, troubleshoots and debugs software programs for software enhancements and new products. Develops... 
    Suggested
    Work at office
    Local area

    Hewlett Packard Enterprise

    Alviso, CA
    3 days ago
  •  ...technology company in Santa Clara seeks a Machine Learning engineer to build and operate a web crawl infrastructure that supports...  ...Rust, Scala, or Go, and has experience in building scalable distributed systems. You will be responsible for ensuring the performance of the... 

    Apple

    Santa Clara, CA
    21 hours ago
  • $105k

     ...Distributed Systems Software Engineer, Python / Go role at Canonical... 
    Full time
    Local area
    Remote work
    Worldwide

    Canonical

    San Jose, CA
    10 days ago
  •  ...NVIDIA Corporation is looking for a Senior System Software Engineer to join the NvSci team and help maintain its leadership in AI. This role involves building next-generation software, enhancing system architecture, and collaborating for performance optimization. Candidates... 

    NVIDIA

    Santa Clara, CA
    22 hours ago
  •  ...Pure Storage, Inc. is seeking a Senior Software Engineer in Santa Clara to lead the digital transformation of their Modern...  ...ideal candidate has over 8 years of experience in systems software, particularly in distributed systems. Join a team that values innovation and... 
    Flexible hours

    Pure Storage

    Santa Clara, CA
    1 day ago
  • $255.85k - $361.2k

    Job Overview We are seeking a Principal Engineer to define and architect the next generation of distributed AI systems across heterogeneous compute platforms, including...  ...'s or equivalent degree in Computer Science, Software Engineering, or related field. 12+ years of experience... 
    Local area
    Shift work

    Intel Corporation

    Santa Clara, CA
    4 days ago
  •  ...Corporation is seeking a Manager of Software Architecture in Santa Clara,...  ...involves leading a team focused on distributed AI communication systems and setting technical direction. Candidates...  ...have at least 8 years of software engineering experience and 3 years of people... 

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $168k - $270.25k

     ...Senior Software Engineer, Distributed Systems - NIM Factory. Locations: US, CA, Santa Clara; US, TX, Remote; US, NY, Remote; US, CA, Remote. Time type: Full time. Posted on: Posted Today. Job requisition... 
    Remote work

    NVIDIA

    Santa Clara, CA
    22 hours ago
  • $181.1k - $318.4k

     ...at massive scale from the live web by a distributed crawl platform you'll help build and operate...  ..., high-impact team responsible for a system that continuously fetches, renders, and...  ...Apple Maps, and more. We're looking for an engineer who doesn't just build distributed... 
    Relocation

    Apple

    Santa Clara, CA
    22 hours ago
  • $168k - $270.25k

     ...advanced programming skills to build distributed and compute systems, backend services, microservices and...  ...or MS in Computer Science, Computer Engineering or related field (or equivalent...  ...experience developing microservices, cloud software and/or tooling roles. Desirable... 

    NVIDIA Gruppe

    Santa Clara, CA
    22 hours ago
  • $181.1k - $318.4k

     ...Senior Systems Framework Engineer, Vision Products Group Sunnyvale, California, United States Software and Services Apple is where individual imaginations gather together, committing...  ...products through the development of distributed systems and frameworks with a broad... 
    Relocation

    Apple

    Sunnyvale, CA
    21 hours ago
  •  ...towards a safer digital future. We’re looking for a Staff Software Engineer to join our Confidential Computing Management team—an engineer...  ..., build, and own core platform services powering secure, distributed systems at scale. This is a high-impact, hands-on technical... 
    H1b
    Worldwide

    Cerebras

    Santa Clara, CA
    21 hours ago
  • $141.91k - $200.34k

     ...Join an enthusiastic team of engineers in Intel's Networking...  ...security, performance, and system management for our customers...  ...create, and enhance tools or software to improve efficiency, optimize...  .../or troubleshooting), Linux distributions, PCIe devices, network management... 
    Local area

    Intel

    Santa Clara, CA
    21 hours ago
  • $175k - $275k

     ...Cerebras Systems builds the world's largest AI chip, 56 times larger...  ...As part of the Embedded Software team, you will help build the...  ...powers the Cerebras Wafer Scale Engine (WSE)-the world's largest AI...  ...systems, platform engineering, and distributed system enablement. As our... 

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    4 days ago
  • $184k - $287.5k

     ...debugging tools that empower NVIDIA engineers to improve perf and power...  ...to join a multifaceted software team with high standards! This...  ...insight in the workload and the system, and empower them to find...  ...like PyTorch and TensorFlow, distributed training and inference. Knowledge... 

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $184k - $287.5k

     ...Senior Software Engineer – GPU Cloud Infrastructure We are looking for a Senior Software Engineer...  ..., operations). Own and document system and software architecture, designs, and...  ...services Significant experience building distributed systems or cloud‑scale services,... 
    Worldwide

    NVIDIA Gruppe

    Santa Clara, CA
    22 hours ago
  • $184k - $287.5k

     ...Overview We are looking for a motivated Senior System Software Engineer to join the Holoscan team. This is an outstanding opportunity to accelerate...  ...software development within NVIDIA. Collaborate with a distributed team to address complex challenges in crafting a powerful... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  •  ...NVIDIA Gruppe is seeking a Senior System Software Engineer in Santa Clara, California, to develop world-class GPU-accelerated AI inference serving...  ...Rust & C++ skills, and a strong understanding of distributed systems. The position offers a competitive base salary, equity... 

    NVIDIA Gruppe

    Santa Clara, CA
    21 hours ago
  • $120k - $130k

     ...Picarro in Santa Clara is seeking a Systems Software Engineer to design and develop robust software systems for scientific instrumentation. The role involves building reliable systems primarily in Python on Linux, ensuring maintainable and testable code. Candidates should... 

    Picarro

    Santa Clara, CA
    22 hours ago
  • $184k - $287.5k

     ...building a scalable and modular software stack that powers advanced driver-assistance systems across a diverse range of...  ...motivated Senior Software Systems Engineer with a strong foundation in software...  ...~ Familiarity with parallel/distributed systems and low-level system profiling... 

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $184k - $287.5k

     ...Autonomous Vehicles Platform team is now looking for a Senior System Software Engineer. Our team builds the NVIDIA DriveWorks SDK with the goal...  ...proven experience developing and debugging multithreaded/distributed applications like multimedia systems, game engines, etc.... 

    NVIDIA Gruppe

    Santa Clara, CA
    21 hours ago
  • $184k - $287.5k

     ...implement next-generation NvSci software to enable seamless cross-...  ...collaborators to improve APIs, simplify system architecture, enhance...  ...with hardware and firmware engineers to optimize performance and improve...  ...in cross-functional, distributed teams. NVIDIA is committed to... 

    NVIDIA Gruppe

    Santa Clara, CA
    22 hours ago
  • $184k - $287.5k

     ...s an exciting time to join the NVIDIA Cloud Native Engineering (NVCNE) group’s backend software team. As a Cloud Platform Software Engineer, you will...  ...with SRE and product teams to troubleshoot complex distributed systems and drive operational excellence. You are expected... 

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $152k - $241.5k

     ...NVIDIA Solutions Engineering team is searching for engineers to help develop and bring...  ...their best work. We are looking for a System Software Engineer with expertise in embedded systems...  ...Be part of an internationally distributed team with locations in US, Europe, APAC... 

    NVIDIA

    Santa Clara, CA
    4 days ago
  •  ...centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and...  ...ROLE AMD is looking for a strategic software engineering lead who is passionate about improving...  ...techniques used to optimize inference like distributed kv-cache, disaggregation, request... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    22 hours ago
  • $168k - $322k

    A leading technology firm is hiring a Senior Software Engineer for Distributed Systems in California. This role involves designing and implementing a factory pipeline for AI models, collaborating with various teams to improve infrastructure, and mentoring team members.... 

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
