Principal Systems Engineer
$175k - $225k
Nscale
Principal Systems Engineer – GPU Supercluster Bringup
We are building AI infrastructure for frontier-scale workloads. Our platform is designed for high-density, high-performance GPU clusters that push the limits of power, networking, and distributed compute. As a startup, we move fast, operate with ownership, and expect technical leaders to define standards—not just follow them.
The Role
We are hiring a Principal Systems Engineer to architect and lead the bringup of large-scale GPU clusters (hundreds to thousands of GPUs). This is a technical leadership role responsible for defining how we deploy, validate, and scale AI superclusters across sites. You will own the full deployment lifecycle, from rack design and fabric architecture to cluster validation frameworks and production readiness standards. You will set the bar for performance, reliability, and operational excellence. This role combines deep hands-on expertise with system-level thinking and cross-functional leadership.
What You'll Do
End-to-End Supercluster Bringup Ownership
- Define the technical standards for node, rack, and full-cluster bringup.
- Lead large-scale GPU cluster deployments (multi-rack, multi-pod environments).
- Architect high-performance network fabrics (InfiniBand, RoCE, Ethernet) optimized for AI workloads.
- Establish cluster-level acceptance criteria and validation frameworks.
Performance & Fabric Architecture
- Tune and validate NCCL, RDMA, GPUDirect, and collective operations at scale.
- Identify and eliminate performance bottlenecks across hardware, topology, and firmware layers.
- Drive congestion control and fabric optimization strategies.
- Define performance benchmarking methodology for AI training workloads.
Deployment Strategy & Scalability
- Design repeatable deployment models for multi-site expansion.
- Build automation frameworks for provisioning and cluster validation.
- Establish deployment SLAs, quality gates, and operational readiness standards.
- Reduce time-to-capacity while increasing reliability.
Technical Leadership
- Serve as the escalation point for complex bringup and performance issues.
- Mentor senior engineers and shape infrastructure best practices.
- Influence hardware selection, rack topology, and data center design decisions.
- Partner with executive leadership on infrastructure scaling strategy.
What We're Looking For
Required
- 10+ years of experience in large-scale infrastructure or HPC environments.
- Proven experience bringing up large GPU clusters (hundreds+ GPUs).
- Deep expertise in high-speed networking (InfiniBand, RoCE, Ethernet fabrics).
- Strong understanding of server architecture (PCIe, NUMA, memory hierarchy).
- Experience debugging performance issues across compute and network layers.
- Strong automation and systems-level thinking.
Strongly Preferred
- Experience scaling AI training clusters for frontier models.
- Experience with liquid cooling or ultra-high-density deployments.
- Knowledge of distributed storage systems (Lustre, Ceph, NVMe-oF).
- Experience defining infrastructure standards in a fast-growing organization.
What Success Looks Like
- Superclusters are brought online quickly, predictably, and at peak performance.
- Deployment processes scale from first cluster to multi-site expansion.
- Infrastructure becomes a competitive advantage.
- You define the technical blueprint for how we scale AI infrastructure.
The range below reflects the base salary for the position. Actual compensation may vary based on job-related factors such as skill set, experience, education, and location. In addition to base salary, this role may be eligible for bonus, equity, and/or commission programs. Nscale may offer a competitive benefits package including medical, dental, vision, flexible paid time off, parental leave, and retirement plan participation.
Salary Range
$175,000 - $225,000 USD

