Principal Production Engineer

Crusoe

Job Description

Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack, from electrons to tokens, to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.

We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that — with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.

We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved — people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.

If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.

About This Role:

Crusoe is building the AI factory: a vertically integrated company spanning power generation, purpose-built data centers, and the cloud platform that frontier AI runs on. We are looking for a Principal Engineer on our Production Engineering team, someone who will own the reliability, scalability, and operational excellence of the cloud infrastructure that sits on top of it all: compute, storage, networking, and the platform and tooling that tie it together. The systems you'll be responsible for are the reason that compute translates into usable cloud, and at the rate Crusoe is growing, the scope of this role expands with every quarter. This is a high-ownership, high-autonomy position where you will set technical direction, drive observability and reliability standards across the organization, and be the kind of engineer who makes the people around them meaningfully better. The problems are novel, the scale is real, and the impact is immediate.

What You'll Be Working On:

  • Own the reliability and scalability of Crusoe's cloud infrastructure — compute, storage, and networking — defining SLOs, leading incident response, and driving systemic improvements that reduce toil and raise the bar across the platform

  • Build and mature the observability and tooling layer — from network fabric telemetry and storage health to control plane instrumentation and on-call tooling — so the team can detect, diagnose, and resolve issues faster than customers notice them

  • Drive platform reliability improvements across the full cloud stack, partnering closely with software, hardware, and network engineering teams to influence architecture decisions early, before they become operational debt

  • Act as a trusted advisor to senior leadership, bringing perspective on observability trends and advocating for the right long-term technology investments

  • Set the technical standards for how Crusoe's production engineering organization builds, operates, and scales — defining on-call culture, incident frameworks, and reliability practices that grow with the company

  • Mentor senior and staff engineers, elevate the team's collective technical depth, and be the person others seek out when the problem is genuinely hard

What You'll Bring to the Team:

  • 15+ years of experience in infrastructure, networking, or production engineering — with meaningful time at companies operating at internet scale (cloud providers, CDNs, large-scale social/media platforms, or similar)

  • Strong systems fundamentals: Linux, distributed systems, storage, compute scheduling — you understand the full stack from hardware up

  • Hands-on data center experience: you've done physical infra, understand power and thermal constraints, and can reason about reliability at the facility level, not just the server level

  • The ability to write code — not necessarily full-time, but enough to automate what shouldn't be manual, instrument what isn't observable, and build tooling your team will actually use

  • Excellent analytical and problem-solving skills, including the ability to synthesize ambiguous customer and system signals into clear problem statements and designs

  • Strong incident command: you lead calmly under pressure, communicate clearly during outages, and run blameless retrospectives that actually improve systems

Bonus Points:

  • Deep networking expertise: BGP, OSPF, ECMP, load balancing, and low-latency network design in production — you can debug a routing issue and design a fabric, sometimes in the same incident

  • Experience with HPC infrastructure: GPU cluster operations, job schedulers (Slurm, Kubernetes), high-bandwidth interconnects (InfiniBand, RoCE)

  • Prior principal or staff IC role where you influenced org-level technical strategy, not just project-level execution

  • Exposure to sustainability-focused or energy-constrained compute environments

Benefits:

  • Industry-competitive pay

  • Restricted Stock Units in a fast-growing, well-funded technology company

  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents

  • Employer contributions to HSA accounts

  • Paid Parental Leave

  • Paid life insurance, short-term and long-term disability

  • Teladoc

  • 401(k) with a 100% match up to 4% of salary

  • Generous paid time off and holiday schedule

  • Cell phone reimbursement

  • Tuition reimbursement

  • Subscription to the Calm app

  • MetLife Legal

  • Company-paid commuter benefit: $300 per month

Compensation:

Compensation will be paid in the range of $261,000 - $326,000 + Bonus. Restricted Stock Units are included in all offers. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.

Vacancy posted 13 days ago