Lead Software Engineer, AI Infrastructure
$146.88k - $220.32k
Ai2
Persons in these roles are expected to work from our offices in Seattle. On-site requirements vary based on position and team. If you have questions about on-site work arrangements for this role, please ask your recruiter.
Our base salary range is $146,880 - $220,320, and in addition we have generous bonus plans to provide a competitive compensation package.
Who You Are
You are a visionary leader who occupies the space between high-level software orchestration and low-level system performance. You are motivated by the idea that world-class infrastructure should be a catalyst for public good, not a proprietary secret. You understand that in the world of frontier AI, the "software" and the "hardware" are a single, inseparable organism. You are as comfortable designing a distributed scheduling algorithm in Go as you are debugging a NCCL timeout or optimizing an InfiniBand fabric.
You lead by example, blending the rigor of a Lead Software Engineer with the pragmatic, hands-on urgency of an HPC operator. Not only do you build systems, but you also ensure they thrive under the immense pressure of training world-class AI models.
Who We Are
While much of the AI industry has moved behind closed APIs, proprietary datasets, and "black box" infrastructure, Ai2 remains a lighthouse for Open Science. Founded by the late Paul Allen, we are a non-profit research institute dedicated to building AI for the common good.
We don't have a stock price to defend or a walled garden to protect. Instead, we have a mission: to provide the global research community with the transparent, high-performance foundations they need to achieve humanity-enriching breakthroughs.
What makes us different:
- Radical Transparency: We don't just release model weights; we release the data, the training code, and the infrastructure insights. We believe the "how" is just as important as the "what."
- Mission over Margin: Our "bottom line" is scientific impact. This gives us the unique freedom to prioritize technical elegance, long-term stability, and open-source contributions over quarterly profit targets.
- The Best of Both Worlds: We operate at the pace and scale of a world-class tech startup but with the intellectual soul of a research lab.
- The Beaker Ecosystem: We build and operate systems like Beaker to coordinate the simultaneous training of frontier models (like OLMo) across massive GPU clusters. Our job is to ensure that the next great AI breakthrough isn't stalled by a resource bottleneck or a proprietary gatekeeper.
Your Next Challenge
At Ai2, we believe that the most important AI breakthroughs should be transparent and accessible. Your challenge is to build the infrastructure that makes this possible. You will bridge the gap between our researchers, our orchestration platform (Beaker), and our GPU clusters.
You will be a technical lead responsible for ensuring that when a researcher submits a job, the software schedules it intelligently and the hardware executes it flawlessly. This involves:
- Designing for Scale: Architecting the next generation of our orchestration layer to ensure that the highest value workloads receive GPU time.
- Operational Excellence: Moving our HPC operations from manual intervention to high-level automation.
- Performance Engineering: Working directly with researchers to squeeze every bit of performance out of our GPU-accelerated computing environment.
Your Responsibilities
- Strategic Leadership: Develop the roadmap for managing large-scale HPC systems, including the deployment of compute, networking, and storage in partnership with leadership.
- Full-Stack Ownership: Lead the design and delivery of critical systems that span the entire stack—from the Beaker job scheduler to the execution runtime.
- System Automation: Build innovative tooling and software-defined infrastructure to accelerate researcher velocity and automate cluster health management.
- Performance Optimization: Conduct root-cause analysis on complex distributed system failures and implement optimizations for distributed workloads.
- Mentorship & Culture: Foster a high-performance culture by reviewing code/design docs, mentoring team members, and driving process improvements across the organization.
- Evangelism: Represent Ai2’s infrastructure work across internal research teams.
What You’ll Need
- 10+ years of professional experience developing business-critical software and operating large-scale compute infrastructure. Proficiency in Go and/or Python preferred.
- Bachelor’s degree in a related field; a relevant advanced degree may substitute for equivalent years of technical work experience.
- Deep Linux Expertise: Expert-level knowledge of Linux internals and container runtimes like Docker.
- Distributed Systems Mastery: A proven track record of designing, debugging, and optimizing high-scale distributed systems and databases.
- HPC Foundations: Applied experience with workload schedulers (like Kubernetes or Slurm) and high-performance networking (NCCL and InfiniBand).
- Cloud & Hardware Hybridity: Familiarity with the nuances of on-prem GPU cluster management and cloud infrastructure (GCP, AWS).
- Communication: Exceptional writing skills and the ability to drive consensus across diverse groups of researchers and engineers.
- A principled approach to engineering: you care about how systems are built and are excited by the unique constraints and freedoms of a non-profit research environment.
Bonus Qualifications
- Prior experience training or fine-tuning frontier AI models.
- Deep systems administration expertise or "Site Reliability Engineering" (SRE) background in an HPC context.
- Experience contributing to open-source infrastructure or orchestration projects.
- Familiarity with on-prem storage systems like WEKA and Ceph.
Physical Demands and Work Environment:
The physical demands described here are representative of those that must be met by a team member to successfully perform the essential functions of this position. Reasonable accommodations may be made to enable individuals with disabilities to perform the functions.
- Must be able to remain in a stationary position for long periods of time.
- The ability to communicate information and ideas so others will understand. Must be able to exchange accurate information in these situations.
- The ability to observe details at close range.
- Can work under deadlines.
A Little More About Ai2:
Ai2 is a Seattle-based non-profit AI research institute founded in 2014 by the late Paul Allen. Our mission is building breakthrough AI to solve the world’s biggest problems. We develop foundational AI research and innovation to deliver real-world impact through large-scale open models, data, robotics, conservation, and beyond.
In addition to Ai2’s core mission, we also aim to contribute to humanity through our treatment of each member of the Ai2 Team. Some highlights are:
- We are a learning organization – because everything Ai2 does is ground-breaking, we are learning every day. Similarly, through weekly Ai2 Academy lectures, a wide variety of world-class AI experts as guest speakers, and our commitment to your personal on-going education, Ai2 is a place where you will have opportunities to continue learning alongside your coworkers.
- We value diversity - We seek to hire, support, and promote people from all genders, ethnicities, and all levels of experience regardless of age. We particularly encourage applications from women, non-binary individuals, people of color, members of the LGBTQA+ community, and people with disabilities of any kind.
- We value inclusion - We understand the value that people's individual experiences and perspectives can bring to an organization, and we are building a culture in which all voices are heard, respected and considered.
- We emphasize a healthy work/life balance – we believe our team members are happiest and most productive when their work/life balance is optimized. While we value powerful research results which drive our mission forward, we also value dinner with family, weekend time, and vacation time. We offer generous paid vacation and sick leave as well as family leave.
- We are collaborative and transparent – we consider ourselves a team, all moving with a common purpose. We are quick to cheer our successes, and even quicker to share and jointly problem solve our failures.
- We are in Seattle – and our office is on the water! We have mountains, we have lakes, we have four seasons, we bike to work, we have a vibrant theater scene, and we have so much else. We even have kayaks for you to paddle right outside our front door. We welcome interest from applicants from outside of the United States.
- We are friendly – chances are you will like every one of the 200+ (and growing) people who work here. We do.
Ai2 is proud to be an Equal Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. You may view the related Know Your Rights compliance poster and the Pay Transparency Nondiscrimination Provision by clicking on their corresponding links.
This employer participates in E-Verify and will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S. If E-Verify cannot confirm that you are authorized to work, this employer is required to give you written instructions and an opportunity to contact the Department of Homeland Security (DHS) or Social Security Administration (SSA) so you can begin to resolve the issue before the employer can take any action against you, including terminating your employment. Employers can only use E-Verify once you have accepted a job offer and completed the Form I-9.
We are committed to providing reasonable accommodations to employees and applicants with disabilities to the full extent required by the Americans with Disabilities Act (ADA). If you feel you need a reasonable accommodation pursuant to the ADA, you are encouraged to contact us.
Benefits:
- Team members and their families are covered by medical, dental, vision, and an employee assistance program.
- Team members are able to enroll in our health savings account plan, our healthcare reimbursement arrangement plan, and our health care and dependent care flexible spending account plans.
- Team members are able to enroll in our company’s 401k plan.
- Team members will receive $125 per month to assist with commuting or internet expenses and will also receive $200 per month for fitness and wellbeing expenses.
- Team members will also receive up to ten sick days per year, up to seven personal days per year, up to 20 vacation days per year and twelve paid holidays throughout the calendar year.
- Team members will be able to receive annual bonuses and can participate in the long-term incentive plan.
Note: This job description in no way states or implies that these are the only duties to be performed by the team member(s) of this position. Team members will be required to follow any other job-related instructions and to perform any other job-related duties requested by any person authorized to give instructions or assignments. All duties and responsibilities are essential functions and requirements and are subject to possible modification to reasonably accommodate individuals with disabilities. To perform this job successfully, the team member(s) will possess the skills, aptitudes, and abilities to perform each duty proficiently. Some requirements may exclude individuals who pose a direct threat or significant risk to the health or safety of themselves or others. The requirements listed in this document are the minimum levels of knowledge, skills, or abilities. This document does not create an employment contract, implied or otherwise, other than an at-will relationship.

