
System Debug Engineer Manager, Cloud AI Infrastructure

$192k - $278k

Google Inc.

Location: Kirkland, WA, USA; Austin, TX, USA

In accordance with Washington state law, we are highlighting our comprehensive benefits package, which is available to all eligible US based employees. Benefits for this role include:
  • Health, dental, vision, life, disability insurance
  • Retirement Benefits: 401(k) with company match
  • Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
  • Sick Time: 40 hours/year (increased to 69 hours/year for Seattle) including 5 discretionary sick days per instance
  • Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
  • Baby Bonding Leave: 18 weeks
  • Holidays: 13 paid days per year

Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Kirkland, WA, USA; Austin, TX, USA.

Minimum qualifications:
  • Bachelor's degree in Computer Science or IT-related field, or equivalent practical experience.
  • 8 years of experience with system design.
  • 5 years of experience managing or leading a team.
  • 5 years of experience with managing technical work, engineering strategy, and roadmaps.
  • 5 years of experience with hardware debug (silicon debug, platform debug, IO interface, memory analysis).
  • 3 years of experience with organizational design.

Preferred qualifications:
  • 5 years of experience working with vendors or customers.
  • 3 years of experience with leadership development and career growth of employees.
  • 3 years of experience in analyzing and troubleshooting distributed systems.
  • 2 years of CPU, dGPU, or TPU debug or validation experience.
  • Understanding of memory and high-speed IO technologies.

About the job

Systems Development Engineering (SDE) at Google is a role where you manage services and systems at scale. SDEs creatively put their engineering discipline to use automating the mundane and reducing toil.
We don’t just write code to fix bugs, but emphasize the development of tools and solutions that fix classes of problems. We know it’s hard to control what you can’t measure, so we focus on observability: instrumenting first, then turning data into knowledge, and finally knowledge into action. We know that the operational efficiency of Google systems, services, virtual compute environments and the operating systems that power them impacts the environment, not just the bottom line. We know that working together we can do more, and that community matters.

Google brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow. Together we engineer and build the infrastructure, tools, access and telemetry for systems that enable orchestration of Google-scale services. Come build things that matter.

As a part of the Google Cloud Support team, you will ensure customers maximize their investment. As a Systems Debug Engineer, you will be a trusted advisor driving hardware understanding and issue resolution. You will troubleshoot platform challenges, providing expert solutions that enable innovation. You will represent the customer and collaborate with engineering and product teams to drive continuous improvement across global cloud products and services.

The US base salary range for this full-time position is $192,000-$278,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.
Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities
  • Drive technical team performance across on-call activities and system management by delivering leadership, mentorship, and career development while collaborating with primary responders to address system issues.
  • Debug platform hardware, silicon, and AI/ML workloads to drive root-cause resolution, develop permanent infrastructure improvements, and build tools for faster diagnosis through troubleshooting and reproduction.
  • Collaborate cross-functionally with Product, Quality, and Engineering teams to enhance product outcomes, and engage with Site Reliability Engineering (SRE) teams to ensure high-quality production and reliability.
  • Resolve customer challenges on AI/ML infrastructure through effective diagnosis, resolution, and the implementation of investigation tools to increase productivity for critical reported issues.
  • Serve as a consultant and subject matter expert for internal stakeholders to resolve deployment and operational obstacles across AI infrastructure environments daily.

Google is proud to be an equal opportunity and affirmative action employer. We are committed to building a workforce that is representative of the users we serve, creating a culture of belonging, and providing an equal employment opportunity regardless of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), expecting or parents-to-be, criminal histories consistent with legal requirements, or any other basis protected by law. See also Google's EEO Policy, Know your rights: workplace discrimination is illegal, Belonging at Google, and How we hire.
Google is a global company and, in order to facilitate efficient collaboration and communication globally, English proficiency is a requirement for all roles unless stated otherwise in the job posting.

To all recruitment agencies: Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees, or any other organization location. Google is not responsible for any fees related to unsolicited resumes.

Vacancy posted 2 days ago
