Principal Software Engineer
Advanced Micro Devices, Inc.
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next‑generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.
THE ROLE:
As a Principal AI Infrastructure Solution Engineer, you will partner with AMD’s AI software teams and customers to enable large‑scale LLM training and inference on AMD Instinct GPUs. You will design and validate production‑ready Kubernetes architectures and translate inference frameworks such as vLLM and SGLang into deployable customer solutions. Your work will accelerate customer time‑to‑production and strengthen AMD’s leadership in AI infrastructure.
THE PERSON:
You are a solution‑oriented AI infrastructure engineer with strong expertise in GPU‑accelerated computing and large‑scale AI deployments. You excel at translating complex technologies into customer‑ready solutions and delivering production‑grade Kubernetes‑based inference and training systems. You bring hands‑on experience with Kubernetes‑native distributed training, including scheduling, topology‑aware GPU placement, and operating resilient, high‑performance AI workloads at scale.
KEY RESPONSIBILITIES:
- Design and deliver reference architectures for LLM training and inference on AMD GPUs, from single‑node to multi‑datacenter deployments using Kubernetes and SLURM.
- Architect and validate Kubernetes‑based distributed training stacks for large‑scale LLM workloads on AMD GPUs.
- Define and implement gang scheduling and topology‑aware GPU placement for multi‑node training workloads.
- Enable Kubernetes‑native training controllers including Kubeflow Training Operator, MPI Operator, Volcano, and Kueue.
- Partner with enterprise customers and cloud providers to deploy and optimize production AMD GPU clusters for distributed inference and multi‑tenant workloads.
- Implement and validate GPU orchestration using Kubernetes GPU Operator, device plugins, metrics exporters, and SLURM controllers.
- Benchmark and optimize LLM inference frameworks (vLLM, SGLang) on AMD hardware, producing customer‑ready performance playbooks.
- Develop repeatable benchmarks for Kubernetes‑based distributed training, covering scaling efficiency, step time, communication, and checkpointing.
- Create tuning guides for RCCL/NCCL‑equivalent communication, CPU/GPU affinity, interconnect utilization, and workload‑specific optimizations.
- Serve as the feedback loop between customers and AMD engineering, translating requirements into validated performance improvements.
PREFERRED EXPERIENCE:
- Deployed and operated large‑scale GPU clusters for production AI training and inference
- Deep expertise in Kubernetes GPU orchestration (operators, device plugins, scheduling, multi‑tenancy, observability)
- Hands‑on experience with distributed training on Kubernetes (Kubeflow, MPI Operator, Volcano, Kueue, Ray)
- Strong knowledge of gang scheduling, elastic jobs, quotas, priority, and shared GPU environments
- Tuned Kubernetes networking and storage for AI workloads (high‑performance CNI, RDMA where applicable, scalable checkpointing)
- Implemented ML observability for training (GPU/comms metrics, step‑time analysis, SLO‑driven ops)
- Experience in AI/ML infrastructure, solution architecture, and production GPU deployments
- Proven success enabling customers through complex AI platform deployments and migrations
- Strong background working across engineering and customer‑facing roles
- Understanding of AI accelerator architectures and inference optimization techniques
- Experience operationalizing Kubernetes‑based distributed training at scale
- Open‑source contributions or AI infrastructure community engagement (plus)
LOCATION:
Santa Clara, CA, or open to discussing other locations. This role is not eligible for visa sponsorship.
Benefits offered are described: AMD benefits at a glance. AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee‑based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third‑party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process. AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here. This posting is for an existing vacancy.


