System Software Engineer, Distributed Systems
NVIDIA Gruppe
The VLSI Productivity and Infrastructure team supports 1000+ chip design engineers by building tools and platforms that supercharge their everyday work. Our mission is to make chip designers faster. We build and operate long shelf‑life systems spanning build automation, observability, analytics, automated error detection/remediation, and codebase modernization, with a strong commitment to stability.

Our core workflow infrastructure runs as userspace software on bare‑metal Linux hosts (no sudo, no containers). We coordinate shared state and artifacts via NFS, launch long‑running, compute‑heavy workflows on IBM LSF, and provide adjacent services for APIs and observability. This is a high‑ownership environment where you'll often be the expert on what you build.

What you will be doing:
- Design, build, and deliver core components of our next‑generation productivity platforms
- Develop reliable userspace infrastructure for long‑running engineering workflows at scale on bare‑metal Linux hosts
- Build state coordination over NFS (atomicity, idempotency/dedup, partial‑write recovery, without privileged ops)
- Build and improve orchestration around IBM LSF (submission/tracking, retries/cancel, log capture, fairness/backpressure)
- Convert legacy codebases into modern powerhouses using incremental migration techniques (e.g., Perl to Go), with stage gates, parity strategies, and strong observability
- Debug and improve performance and reliability across Linux and Kubernetes, including operational tooling
- Collaborate with engineering users to turn ambiguous workflows into durable production systems

What we need to see:
- B.S. in CS/EE (or equivalent experience)
- 5+ years developing and operating production software in Go and/or Python, ideally in large codebases
- Strong Linux fundamentals: processes, filesystems, permissions, synchronization/locks, concurrency, and debugging
- Solid distributed‑systems thinking: failures, retries/timeouts, backoff, idempotency, and operational rigor
- Experience building long‑runtime automation or services on shared compute clusters (batch schedulers, build systems)
- Ability to translate ambitious, high‑level goals into a safe delivery plan (instrumentation, staged rollout, measurable outcomes)

Ways to stand out from the crowd:
- Hands‑on experience with shared filesystems at scale (NFS), or coordination patterns on eventually‑consistent storage
- Experience with batch job scheduling, shared compute fleets, or build systems
- Track record of incremental modernization (tests, shadow runs, canaries, rollback plans)
- Experience partitioning/optimizing metadata‑heavy systems and reducing I/O or R/W hot spots
- Strong incident/debug tactics: clear root‑cause analysis, remediation, and guardrails, as well as rapid comprehension and ownership of unfamiliar codebases in any language (including LLM‑generated code) to implement high‑leverage changes

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

