Senior Systems Engineer - AI Infrastructure
$150k - $230k
Dormont Manufacturing Co
About Clockwork Systems

Clockwork.io – Software Driven Fabrics to increase GPU cluster utilization

Clockwork Systems was founded by Stanford researchers and veteran systems engineers who share a vision for redefining the foundations of distributed computing. As AI workloads grow increasingly complex, traditional infrastructure struggles to meet the demands of performance, reliability, and precise coordination. Clockwork is pioneering a software-driven approach to AI fabrics by delivering cross-stack observability to catch and quickly resolve problems, workload fault tolerance to keep jobs running through failures, and performance acceleration that dynamically routes and paces traffic to avoid congestion. To learn more, visit

About the Role

We’re building infrastructure for fault-tolerant, high-performance distributed GPU training. You’ll work at the intersection of GPU systems, high-speed networking, and distributed coordination, designing and implementing systems that run at scale.

This is a systems building role. You’ll dig into internals, understand why things break under pressure, and design solutions that handle the messy reality of distributed systems.

What You’ll Do

- Design and implement low-level systems software for GPU clusters
- Work with internals of frameworks like PyTorch, NCCL, CUDA runtime: not as a user, but modifying and extending them
- Build components that make large-scale GPU training more reliable and efficient
- Debug complex distributed/concurrent systems where failures are subtle and non-deterministic
- Own systems end-to-end: from design through production

What We’re Looking For

Required: Systems building experience

You’ve designed and built complex systems, not just deployed or operated them. Examples:

- Kernel subsystems, device drivers, or OS-level components
- Distributed storage, databases, or coordination systems
- Runtimes, profilers, or performance tooling
- Network stacks, protocols, or high-performance I/O systems
- Large-scale infrastructure at the systems layer

Core technical skills:

- Strong C/C++ in systems contexts (not just application code)
- Deep understanding of concurrency, memory models, and failure modes
- Experience reasoning about distributed system behavior: consistency, ordering, partial failures
- Comfortable reading and modifying large, unfamiliar codebases

Nice to have:

- GPU programming (CUDA) or GPU systems experience
- High-performance networking (RDMA, InfiniBand)
- ML framework or runtime internals
- Cluster scheduling or orchestration systems

We believe strong systems engineers pick up domain-specific tools quickly. We value your ability to reason about complex systems over checkbox familiarity with our specific stack.

Senior Expectations

- Lead design of significant system components
- Navigate ambiguity and define technical direction
- Mentor engineers and raise team capabilities
- 8+ years building systems software

Enjoy

- Challenging projects
- A friendly and inclusive workplace culture
- Competitive compensation
- A great benefits package
- Catered lunch

Compensation for this position will vary based on the skills and experience you bring, as well as internal equity considerations. For candidates hired at the posted level, the expected base salary range is $150,000 - $230,000. The offered compensation package may also include stock options or other equity awards, subject to Clockwork’s equity program and applicable approvals.

Clockwork Systems is an equal opportunity employer. We are committed to building world-class teams by welcoming bright, passionate individuals from all backgrounds. All qualified applicants will receive consideration for employment without regard to race, color, ancestry, religion, age, sex, sexual orientation, gender identity or expression, national origin, disability, or protected veteran status. We believe diversity drives innovation, and we grow stronger together.

