Principal Systems Engineer

$175k - $225k

Nscale

Principal Systems Engineer – GPU Supercluster Bringup

We are building AI infrastructure for frontier-scale workloads. Our platform is designed for high-density, high-performance GPU clusters that push the limits of power, networking, and distributed compute. As a startup, we move fast, operate with ownership, and expect technical leaders to define standards—not just follow them.

The Role

We are hiring a Principal Systems Engineer to architect and lead the bringup of large-scale GPU clusters (hundreds to thousands of GPUs). This is a technical leadership role responsible for defining how we deploy, validate, and scale AI superclusters across sites. You will own the full lifecycle of deployment, from rack design and fabric architecture to cluster validation frameworks and production readiness standards. You will set the bar for performance, reliability, and operational excellence. This role combines deep hands-on expertise with system-level thinking and cross-functional leadership.

What You'll Do
End-to-End Supercluster Bringup Ownership
  • Define the technical standards for node, rack, and full-cluster bringup.
  • Lead large-scale GPU cluster deployments (multi-rack, multi-pod environments).
  • Architect high-performance network fabrics (InfiniBand, RoCE, Ethernet) optimized for AI workloads.
  • Establish cluster-level acceptance criteria and validation frameworks.
Performance & Fabric Architecture
  • Tune and validate NCCL, RDMA, GPUDirect, and collective operations at scale.
  • Identify and eliminate performance bottlenecks across hardware, topology, and firmware layers.
  • Drive congestion control and fabric optimization strategies.
  • Define performance benchmarking methodology for AI training workloads.
Deployment Strategy & Scalability
  • Design repeatable deployment models for multi-site expansion.
  • Build automation frameworks for provisioning and cluster validation.
  • Establish deployment SLAs, quality gates, and operational readiness standards.
  • Reduce time-to-capacity while increasing reliability.
Technical Leadership
  • Serve as the escalation point for complex bringup and performance issues.
  • Mentor senior engineers and shape infrastructure best practices.
  • Influence hardware selection, rack topology, and data center design decisions.
  • Partner with executive leadership on infrastructure scaling strategy.
What We're Looking For
Required
  • 10+ years of experience in large-scale infrastructure or HPC environments.
  • Proven experience bringing up large GPU clusters (hundreds+ GPUs).
  • Deep expertise in high-speed networking (InfiniBand, RoCE, Ethernet fabrics).
  • Strong understanding of server architecture (PCIe, NUMA, memory hierarchy).
  • Experience debugging performance issues across compute and network layers.
  • Strong automation and systems-level thinking.
Strongly Preferred
  • Experience scaling AI training clusters for frontier models.
  • Experience with liquid cooling or ultra-high-density deployments.
  • Knowledge of distributed storage systems (Lustre, Ceph, NVMe-oF).
  • Experience defining infrastructure standards in a fast-growing organization.
What Success Looks Like
  • Superclusters are brought online quickly, predictably, and at peak performance.
  • Deployment processes scale from first cluster to multi-site expansion.
  • Infrastructure becomes a competitive advantage.
  • You define the technical blueprint for how we scale AI infrastructure.

The range below reflects the base salary for the position. Actual compensation may vary based on job-related factors such as skill set, experience, education, and location. In addition to base salary, this role may be eligible for bonus, equity, and/or commission programs. Nscale may offer a competitive benefits package including medical, dental, vision, flexible paid time off, parental leave, and retirement plan participation.

Salary Range

$175,000 - $225,000 USD

Vacancy posted 2 days ago