Senior AI Performance Architect Remote-ready, GPU Scale

Advanced Micro Devices Inc

The Role

As a Principal Engineer, you will spearhead the next generation of AI infrastructure by defining GPU architecture specifications that enable massive model training at scale. Your expertise will drive 2-3x performance gains in both training and inference pipelines through innovative system design and optimization. You will champion the adoption of cutting‑edge techniques across the engineering organization, from efficient attention mechanisms to advanced parallelization strategies. By establishing comprehensive best practices for distributed ML systems, you will create a framework that enables seamless scaling from single‑GPU to thousand‑GPU deployments.

The Person

You have a deep understanding of GPU microarchitecture, memory hierarchies, and their impact on large‑scale ML workloads. You are passionate about software engineering and have the leadership skills to drive complex issues to resolution. You communicate effectively and work well with teams across AMD.

Key Responsibilities
  • Lead performance modeling and optimization for multi‑trillion parameter LLM training and inference, covering dense and Mixture of Experts (MoE) models across multiple modalities (text, vision, speech)
  • Model and optimize novel parallelization strategies across tensor, pipeline, context, expert, and data parallel dimensions
  • Architect memory‑efficient training systems utilizing techniques like structured pruning, quantization (MX formats), continuous batching/chunked prefill, speculative decoding
  • Incorporate and extend SOTA models such as GPT‑4, reasoning models (DeepSeek‑R1), and multi‑modal architectures
  • Collaborate with internal and external stakeholders and ML researchers to disseminate results and iterate at a rapid pace
Required Experience
  • Extensive senior‑level experience optimizing large‑scale ML systems and GPU architectures
  • Deep expertise in CUDA programming, GPU memory hierarchies, and hardware‑specific optimizations
  • Proven track record of architecting large‑scale distributed training systems
  • Expert knowledge of transformer architectures, attention mechanisms, and model parallelism techniques
Preferred Experience
  • PyTorch, CUDA, TensorRT, OpenAI Triton
  • Distributed systems: Ray, Megatron‑LM
  • Performance analysis tools: Nsight Compute, nvprof, PyTorch Profiler
  • KV cache optimization, Flash Attention, Mixture of Experts
  • High‑speed networking: InfiniBand, RDMA, NVLink
Academic Credentials
  • Bachelor’s, MS, or PhD in Computer Science/Engineering, or equivalent industry experience
Location

Austin, TX or Santa Clara, CA strongly preferred; remote is a possibility for the right candidate.

Visa Policy

This role is not eligible for visa sponsorship.

Benefits

Benefits offered are described in AMD benefits at a glance.

EEO Statement

AMD and its subsidiaries are equal‑opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third‑party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AI Screening Policy

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.

Posting Status

This posting is for an existing vacancy.

Vacancy posted 2 days ago