Software Engineer, Compute Infrastructure

AI Chopping Block, Inc.

About the Team

We build and scale the Compute foundation that powers frontier AI research and products. Our team delivers reliable, efficient, and cost-effective GPU/CPU capacity across world-scale supercomputing systems, enabling researchers and product teams to move quickly. We operate one of the largest GPU fleets in the world, rapidly bringing new infrastructure online across a wide range of providers, hardware types, and generations while integrating it into a single seamless platform at massive scale. We focus on building an intuitive, low-friction system that helps teams experiment faster, innovate faster, and train some of the world’s largest and most advanced models.

About the Role

We’re looking for engineers to help build and operate the next generation of compute infrastructure powering OpenAI’s frontier research. This is an opportunity to work on the large-scale clusters, high-performance networks, and supercomputing systems that enable some of the most advanced AI workloads in the world.

In this role, you’ll combine distributed systems engineering with hands‑on infrastructure work across some of our largest data centers. You’ll help scale Kubernetes clusters to massive scale, automate bare‑metal bring‑up, and build the software layers that make heterogeneous GPU fleets and multi‑datacenter supercomputing environments easier to operate.

You’ll work where hardware and software meet, in an environment where speed, efficiency, and reliability are critical. That means solving real‑time operational challenges, quickly diagnosing and fixing issues when they arise, and continuously improving automation, resilience, performance, and uptime across the systems that power frontier model training.

In this role, you will

  • Spin up and scale large Kubernetes clusters, including automation for provisioning, bootstrapping, and cluster lifecycle management
  • Build software abstractions that unify multiple clusters and present a seamless interface to training workloads
  • Own node bring‑up from bare metal through firmware upgrades, ensuring fast, repeatable deployment at massive scale
  • Improve operational metrics such as reducing cluster restart times (e.g., from hours to minutes) and accelerating firmware or OS upgrade cycles
  • Integrate networking and hardware health systems to deliver end‑to‑end reliability across servers, switches, and data center infrastructure
  • Develop monitoring and observability systems to detect issues early and keep clusters stable under extreme load

You might thrive in this role if you

  • Have experience operating large‑scale compute fleets and enjoy bringing diverse hardware across providers, generations, and environments into one reliable platform
  • Care deeply about infrastructure efficiency and know how to maximize utilization so every GPU and CPU delivers meaningful work
  • Bring a strong bias for operational excellence, balancing speed with long‑term quality and building systems that improve consistently over time
  • Focus on solving root causes rather than symptoms, and build trust by eliminating recurring pain points for users
  • Have experience improving training performance, reducing bottlenecks, and helping workloads run faster and more cost‑effectively at scale
  • Enjoy pushing the limits of scale, from increasing concurrent workloads to enabling larger and more ambitious single‑cluster jobs
  • Build intuitive platforms and tooling that empower researchers, product teams, and operators to self‑serve with minimal manual support
  • Are comfortable working in fast‑moving environments where ownership, reliability, and continuous improvement are essential

Qualifications

  • Experience as an infrastructure, systems, or distributed systems engineer in large‑scale or high‑availability environments
  • Strong knowledge of Kubernetes internals, cluster scaling patterns, and containerized workloads
  • Proficiency in compute infrastructure concepts (compute, networking, storage, security) and in automating cluster or data center operations
  • Bonus: background with GPU workloads, firmware management, or high‑performance computing

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates.

For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Software Engineer, Compute Infrastructure in San Francisco, CA vacancy
  • $164.2k - $205.2k

    Senior Software Engineer, Compute Infrastructure RDQ427R175 Overview At Databricks, we are passionate about helping data teams solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs... 
    Suggested
    Local area

    Databricks Inc.

    San Francisco, CA
    2 days ago
  •  ...democratize access to cutting‑edge AI infrastructure previously reserved for...  ...a global marketplace for AI compute—powering AGI with the same...  ...an Infrastructure Product Engineer, you will play a pivotal...  ...environments. ~ Advanced software engineering skills; capable... 
    Suggested
    Full time
    Remote work

    Andromeda

    San Francisco, CA
    4 days ago
  • $148.1k - $250k

     ...how work gets done. Airtable’s infrastructure is evolving to meet the needs of our fast growing engineering org. We are looking for...  ...usage, and vertical scaling. Compute: The compute pod builds and manages...  ...projects. We are looking for software engineers experienced in... 
    Suggested
    For contractors
    Work at office
    Remote work
    Relocation
    Flexible hours

    GrabJobs

    San Francisco, CA
    3 days ago
  •  ...just as language models are changing how engineers write code. Our vision is a design...  ...engineer obsessed with building systems and infrastructure that are as simple as possible while...  ...flexible and resilient. You will build the compute and infrastructure systems underpin our... 
    Suggested
    Flexible hours

    Chai Discovery

    San Francisco, CA
    1 day ago
  • $166k - $225k

     ...the world's best data and AI infrastructure platform so our customers...  ...their business. Founded by engineers - and customer obsessed - we...  ...getting started. As a Senior Software Engineer on the...  ...for: ~ BS (or higher) in Computer Science, or a related field... 
    Suggested
    Local area
    Worldwide
    Flexible hours

    Databricks

    San Francisco, CA
    3 days ago
  •  ...experienced, creative, and versatile engineer who is eager to tackle the...  ..., and scaling the core infrastructure and systems powering...  ...~3+ years of experience in software development, with a strong background...  ..., with a degree in Computer Science or a related field.... 
    Work from home

    LIGHTFIELD INC

    San Francisco, CA
    5 days ago
  • $190k - $221k

     ...Senior Software Engineer, Infrastructure TRM Labs is a blockchain intelligence company committed to fighting crime and creating a safer world....  ...re looking for: ~ Bachelor's degree (or equivalent) in Computer Science or related field ~3+ years of experience in a... 
    Remote work

    TRM Labs

    San Francisco, CA
    4 days ago
  •  ...is building the foundational infrastructure for the next generation of...  ...development. We provide AI engineers and data scientists with lightning...  ...AI. The Role As a Software Engineer on our...  ...~ Bachelor's degree in Computer Science or a related field,... 
    Work at office
    Work from home
    1 day per week

    Runloop AI, Inc

    San Francisco, CA
    4 days ago
  • $100k - $300k

     ...Senior Software Engineer, Infrastructure Pittsburgh, San Francisco, Bengaluru Company Overview At Skild AI, we are building the world's...  ...Preferred Qualifications BS, MS or higher degree in Computer Science, Robotics, Engineering or a related field, or equivalent... 
    Work experience placement

    Skild AI

    San Francisco, CA
    1 day ago
  •  ...Software Engineer, Infrastructure Serval is building an AI platform to automate complex IT workflows for modern enterprises. As a Software Engineer...  .... Profile and optimize system performance, including compute, storage, networking, and database layers. Implement... 

    Serval

    San Francisco, CA
    4 days ago
  •  ...so many problems that modern engineering teams face are ultimately about...  ...Role We're looking for a Software Engineer to ship the product features and infrastructure that make Scanner feel effortless...  ...leverages highly burstable compute provided by serverless components... 
    Immediate start
    Flexible hours

    Scanner

    San Francisco, CA
    4 days ago
  • $170k - $216k

     ...states. The Simulation Infrastructure team creates reliable, scalable...  ...evaluate the Waymo Driver's software stack at a massive scale. We...  ...range of customers Software Engineers, Product, Data Science,...  ...You have: ~ B.Sc. in Computer Science, or a related field,... 
    Full time
    Remote work

    Waymo

    San Francisco, CA
    4 days ago
  •  ...updates on our news and engineering blogs and join us as we enable...  ...design and scale the core infrastructure that powers machine learning...  ...plus: ~ Bachelor’s degree in Computer Science, Statistics, Applied...  ...experience. ~3+ years of software engineering experience... 
    Work experience placement
    Work at office
    Local area
    Remote work
    Work from home
    Home office
    Flexible hours

    Whatnot

    San Francisco, CA
    1 day ago
  • $140k - $260k

     ...Infrastructure Engineer Profound is on a mission to help companies understand and control their AI presence. As an Infrastructure Engineer...  ..., cost-effective, and able to handle explosive traffic and compute demands. You will work closely with engineers across product... 
    Work at office
    Visa sponsorship

    Profound

    San Francisco, CA
    1 day ago
  • $255k - $405k

     ...to-week. Supporting that pace requires infrastructure that can handle real production constraints...  ...operational lessons as defaults, so engineers don't need to rediscover failure modes,...  ...offer of employment: protect computer hardware entrusted to you from theft, loss... 
    Contract work
    Shift work

    OpenAI

    San Francisco, CA
    1 day ago
  • $209k - $240k

     ...think and execute. About the Product Infrastructure Team The Product Infrastructure...  ...classes of problems up-front for product engineers. Solve hard technical challenges such...  ...by Google). You've heard of computing pioneers like Ada Lovelace, Douglas Engelbart... 
    Local area

    Notion Labs, Inc

    San Francisco, CA
    1 day ago
  •  ...Exa Infrastructure Engineer Exa is building a search engine from scratch to serve every AI agent...  ...in rust to search over it. If you like compute, we also own a $5M H200 GPU cluster (and...  ...machines Design GPU scheduling software so we max out our cluster utilization... 
    H1b

    Exa Labs

    San Francisco, CA
    1 day ago
  •  ...About the Role We are hiring Software Engineers focused on AI Infrastructure to build the systems that enable frontier multimodal AI to operate reliably...  ...platform usability. Qualifications Degree in Computer Science, Engineering, or comparable combination of... 
    Internship
    Immediate start

    SpreeAI

    San Francisco, CA
    4 days ago
  •  ...on a mission to harness the power of computer vision to transform the way transit systems...  ...real-world challenges. The Infrastructure Engineering team is crucial to the overall success...  ...Standards: Establish and enforce elite software engineering and DevOps standards,... 
    Shift work

    Hayden AI

    San Francisco, CA
    5 days ago
  • $180k - $210k

     ...Software Engineer, Infrastructure AcuityMD is a software and data platform that accelerates access to medical technologies. We help MedTech companies...  ...help design, build, and operate the platform primitives—compute, networking, storage, CI/CD, and developer tooling—that power... 
    For contractors
    Work at office
    Remote work
    Work from home
    Home office
    Flexible hours

    GrabJobs

    San Francisco, CA
    4 days ago
  • $140k - $225k

     ...assembling a diverse, world-class team-engineers, designers, researchers, and...  ...The Role As the Senior Software Engineer, Tooling and Development Infrastructure, you will play a critical role in...  ...Qualifications ~ Bachelor's degree in Computer Science, Engineering, or a... 
    Full time
    Temporary work
    Local area
    Flexible hours

    HP IQ

    San Francisco, CA
    3 days ago
  •  ...month and growing. You'll own reliability, performance, and security for multi-tenant compute. What You'll Do Design and operate secure, multi-tenant container infrastructure with fast startup and smart autoscaling. Ship cloud deployments (Helm/Terraform) with... 
    Remote work

    Julius

    San Francisco, CA
    3 days ago
  • $190k - $250k

     ...The next step is to speak to Jack . Job Title: Software Engineer (Infrastructure) Salary: $190K – $250K + Equity Company Description...  ...to support rapidly expanding global usage and compute-heavy AI workloads. Lead technical deployments for large... 

    Jack and Jill AI

    San Francisco, CA
    1 day ago
  • $230k

     ...The Fleet team at OpenAI supports the computing environment that powers our cutting-...  ...growth. About the role As a software engineer on the Fleet High Performance Computing...  ...and efficiency of our supercomputing infrastructure. Our team empowers strong... 

    OpenAI

    San Francisco, CA
    1 day ago
  •  ...years of experience in Infra Software/DevOps • Strong with Docker,...  ...(like AWS) • Degree in computer science (or similar field),...  ...Misc: • Experience working in Infrastructure / DevOps at one or more of the...  ...customer-facing role • Solutions engineering background is good as long... 
    Work experience placement

    Tranzeal

    San Francisco, CA
    5 days ago
  • $150k - $250k

     ...As a founding member of our engineering team, you will have a direct...  ...isn't just "models." It's the software layer that turns:...  ...build the backend systems and infrastructure that power the factory of the...  ...infrastructure across multiple compute environments You will build... 
    Full time
    Contract work

    Foundry Robotics Inc

    San Francisco, CA
    3 days ago
  • $160k - $220k

     ...Fidelity, and employs a team of 450 engineers and entrepreneurs. Astranis designs,...  ...in Northern California, USA. Senior Software Engineer - Infrastructure As a Senior Software Engineer on...  ...Science in a related discipline (e.g. Computer Science, Information Technology) Proficiency... 
    Permanent employment
    Remote work
    Flexible hours

    Astranis

    San Francisco, CA
    6 days ago
  • $215k - $265k

     ...Software Engineer Opportunity Apollo Research's mission is to reduce the risks from scheming...  ...Build and maintain Apollo's cloud infrastructure . This means IaC, networking, environment...  ...training infrastructure, or research compute Demonstrated interest in AI safety... 
    Full time
    Work experience placement
    Work at office
    Immediate start
    Visa sponsorship
    Flexible hours

    Apollo Research

    San Francisco, CA
    4 days ago
  • $325k

     ...About the Role We are seeking a Cloud Infrastructure Engineer to help design and evolve the platforms that power OpenAI's products....  ...the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return... 

    OpenAI

    San Francisco, CA
    1 day ago
  • $141.9k - $190.3k

     ...Senior Software Engineer Technology is at the heart of Disney's past, present, and future....  ...Software Engineer on the Build Tooling Infrastructure team at Disney Entertainment & ESPN Product...  ...and tooling ~ BA/BS degree in Computer Science or equivalent technical experience... 

    The Walt Disney Studios

    San Francisco, CA
    1 day ago
