System Software Engineer, Distributed Systems

NVIDIA Gruppe

The VLSI Productivity and Infrastructure team supports 1000+ chip design engineers by building tools and platforms that supercharge their everyday work. Our mission is to make chip designers faster. We build and operate long shelf‑life systems spanning build automation, observability, analytics, automated error detection/remediation, and codebase modernization, with a strong commitment to stability.

Our core workflow infrastructure runs as userspace software on bare‑metal Linux hosts (no sudo, no containers). We coordinate shared state and artifacts via NFS, launch long‑running, compute‑heavy workflows on IBM LSF, and provide adjacent services for APIs and observability. This is a high‑ownership environment where you'll often be the expert on what you build.

What you will be doing:
  • Design, build, and deliver core components of our next‑generation productivity platforms
  • Develop reliable userspace infrastructure for long‑running engineering workflows at scale on bare‑metal Linux hosts
  • Build state coordination over NFS (atomicity, idempotency/dedup, partial‑write recovery, without privileged ops)
  • Build and improve orchestration around IBM LSF (submission/tracking, retries/cancel, log capture, fairness/backpressure)
  • Convert legacy codebases into modern powerhouses using incremental migration techniques (e.g., Perl to Go), with stage gates, parity strategies, and strong observability
  • Debug and improve performance and reliability across Linux and Kubernetes, including operational tooling
  • Collaborate with engineering users to turn ambiguous workflows into durable production systems

What we need to see:
  • B.S. CS/EE (or equivalent experience)
  • 5+ years developing and operating production software in Go and/or Python, ideally in large codebases
  • Strong Linux fundamentals: processes, filesystems, permissions, synchronization/locks, concurrency, and debugging
  • Solid distributed‑systems thinking: failures, retries/timeouts, backoff, idempotency, and operational rigor
  • Experience building long‑runtime automation or services on shared compute clusters (batch schedulers, build systems)
  • Ability to translate ambitious, high‑level goals into a safe delivery plan (instrumentation, staged rollout, measurable outcomes)

Ways to stand out from the crowd:
  • Hands‑on experience with shared filesystems at scale (NFS), or coordination patterns on eventually‑consistent storage
  • Experience with batch job scheduling, shared compute fleets, or build systems
  • Track record of incremental modernization (tests, shadow runs, canaries, rollback plans)
  • Experience partitioning/optimizing metadata‑heavy systems and reducing I/O or R/W hot spots
  • Strong incident/debug tactics: clear root‑cause analysis, remediation, and guardrails as well as rapid comprehension and ownership of unfamiliar codebases in any language (including LLM‑generated code) to implement high‑leverage changes

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 1 day ago
Location: Santa Clara, CA