Staff AI Accelerator Reliability Lead
Google is seeking a Staff Technical Lead to oversee the reliability, availability, and serviceability of a next-generation AI accelerator system. This role involves defining the reliability strategy and collaborating with engineering teams through the full product lifecycle. The ideal candidate will have extensive experience in reliability engineering and technical leadership, ensuring high standards in product design and validation. This position, based in Sunnyvale, California, offers a competitive salary range and benefits. Google
$262k - $365k
Google Inc. is seeking a Senior Staff Software Engineer, specializing in Site Reliability Engineering. This role involves leading projects, engaging through the entire lifecycle of services, and ensuring systems remain reliable and efficient. Candidates should have 8 years...
- Google Inc. is seeking a Staff Site Reliability Engineer in Sunnyvale, CA, who will manage the technical roadmap for the Evergreen platform and optimize... ...services, leveraging large-scale architecture knowledge and innovative AI-driven solutions. Google Inc.
$183k - $271k
A leading tech firm in Sunnyvale, CA is seeking a Lead Engineer for Silicon and Software Integration within Google Cloud. This role involves shaping the future of AI/ML hardware acceleration, leading the integration of custom silicon solutions, and owning validation processes...
$215k - $263k
EarnIn is seeking a Staff Analyst in Mountain View to partner with the CMO and drive user growth through analytics. The role involves building AI-driven workflows, collaborating with cross-functional teams, and developing data-driven insights. The ideal candidate will have...
Work at office, 2 days per week
$176.1k - $308.2k
...meaningful work. Today, ServiceNow is the AI control tower for business reinvention. Our... ...management and data-driven optimization to lead programs that bring accountability and... ...discipline. About the role We are looking for a Staff FinOps AI Governance Lead to drive financial...
Work at office, Immediate start, Remote work, Flexible hours
- EngineersOfAI is seeking a highly accomplished GPU Architect to lead the next generation of AI accelerators and multi-GPU cluster architecture. This significant... ...performance modeling, manufacturing techniques, and reliability engineering. You will ensure performance efficiency...
$193k - $234k
Crusoe is on a mission to accelerate the abundance of energy... ...vertically integrated AI infrastructure company... ...team is seeking a Staff Technical Program Manager... ...contributor role, you will lead our most complex... ...will ensure we deliver a reliable, scalable platform where...
Temporary work
- ...EXPLORATION TECHNOLOGIES CORP is seeking a Global Supply Manager to manage sourcing strategies and supplier relationships for GPUs and AI accelerators. This role involves collaborating closely with engineers and managing high-performance supply chains to drive innovation. The...
$180k - $275k
Figure is an AI robotics company developing autonomous general-purpose humanoid... ...San Jose, CA. We are looking for a Staff AI Inference & Acceleration Engineer to join the Platform... ...while meeting the strict latency and reliability demands of a real-time autonomous system...
Full time
- Crusoe in Sunnyvale, California is searching for a Technical Program Manager to drive the Managed Inference platform for AI workloads. The role focuses on program delivery and team alignment across various functions while managing the product lifecycle. The ideal candidate...
- Intuit Inc. is hiring a Staff Product Manager located in Mountain View, California. In this role, you will define AI-first experiences across various Intuit products such as TurboTax and QuickBooks. You’ll be focusing on identifying opportunities to create innovative AI...
$180k - $220k
Crusoe is on a mission to accelerate the abundance of energy... ...vertically integrated AI infrastructure company... .... About the Role The Staff Enterprise Technology Administrator... ...the system scales reliably as the company grows.... ...as ET‑side technical lead for the active Workday...
Temporary work
- d-Matrix inc. in Santa Clara, California seeks a Senior Staff SI/PI Engineer to lead the SI/PI strategy for high-performance AI compute platforms. This role involves end-to-end modeling, simulation, and validation of complex multi-chip modules, ensuring electrical integrity...
$272k - $336k
...with Simulation, ML Research (AI Foundations), Perception, and... ...loop, establishing a seamless, reliable path for model training, offline... ...Program Delivery: Lead complex, cross-functional programs... ...and advanced technologies to accelerate developer velocity Resource...
Full time, Remote work
- A global data and AI company is seeking a Senior Staff Technical Program Manager to lead Reliability initiatives within product engineering teams. This role requires over 10 years of experience in managing cloud infrastructure programs and driving improvements in reliability...
Local area
- General Motors is seeking a Staff Technical Program Manager to lead their autonomous driving platform initiatives... ...model training and operational reliability. The ideal candidate will possess over... ...background in ML operations and AI infrastructure. You will work collaboratively...
$128k - $201.25k
...transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a... ...tapping into the unlimited potential of AI to define the next era of computing. An... ...actions to reduce financial risk Lead audits and ensure alignment of on-hand and...
Contract work, Shift work
- Apple Inc. is seeking a Senior Site Reliability Engineer based in Cupertino, California, to drive reliability standards across the Apple Data Platform. You will mentor engineers and ensure that large-scale infrastructures run reliably and efficiently. With a focus on technical...
$168k - $264.5k
Join our LPU team as Lead Systems Quality and Reliability Engineer. What you'll be doing: You will own, build, and manage the RMA and FA debug and root-cause analysis for existing and new Nvidia AI/ML products. You will conduct tests and root-cause analysis. Other responsibilities...
- NVIDIA Gruppe is hiring a Lead Systems Quality and Reliability Engineer in Santa Clara, California. In this role, you will manage RMA and FA debug analysis for AI/ML products, conduct tests, and collaborate with various engineering teams. The ideal candidate has a BS/MS...
$200k
...building the next generation of compute platforms for Physical AI. As AI moves beyond the datacenter into robots, autonomous... ...computing platform. Role Overview We are looking for an Accelerator Runtime Lead to own the execution runtime for Velaura’s AI accelerator....
Flexible hours
$151.6k - $245.3k
Palo Alto Networks, Inc. seeks a Principal Site Reliability Engineer in Santa Clara, CA. The role involves driving SRE and DevOps initiatives... ...will have 7+ years of relevant experience, proficiency in AI productivity tools, and expertise in cloud-native application development...
- ...point in how Talent Acquisition operates. AI agents and automation are no longer... ...transition right. The Principal Applied AI Lead is the person who bridges the gap between... ...metrics for each agent — accuracy, timeliness, reliability, process outcomes — and run regular...
- Google is seeking an experienced professional in Sunnyvale, California, to lead static timing analysis for innovative TPU technology. The role involves driving cutting-edge AI/ML hardware acceleration efforts, ensuring timely sign-off of complex ASICs, and collaborating...
$126k - $204.5k
...Grafana. The ideal candidate should have over 5 years of experience, strong skills in cloud technologies, and a passion for high reliability. Compensation ranges from $126,000 to $204,500 annually, depending on experience and qualifications. Palo Alto Networks...
- NVIDIA Gruppe in Santa Clara is on the lookout for a visionary leader to drive innovation in accelerated computing. You'll lead software design decisions, mentor a world-class team, and collaborate closely with company leadership to implement strategic initiatives. The...
$300 per month
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability. Be a part of the AI revolution with sustainable...
Full time, Temporary work, Shift work
$168k - $258.75k
NVIDIA Corporation, located in Santa Clara, is seeking a seasoned professional to lead manufacturing support for Data Center AI accelerator boards. The candidate will be responsible for planning and coordinating manufacturing progress while collaborating closely with engineering...
- We are looking for a Senior AI Agentic Lead to drive one of our most strategic enterprise client engagements — building and deploying agentic... ...across engineering and other use cases — e.g. PR review and acceleration, test generation, incident correlation & RCA, knowledge...
- NVIDIA is hiring a Senior Integrated Marketing Manager for Semiconductor in Santa Clara. This role involves leading global campaigns to showcase how AI and accelerated computing transform semiconductor design and manufacturing. The ideal candidate will have over 10 years...
