Principal Software Engineer (AI Inference / Distributed Systems)
Advanced Micro Devices, Inc.
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
AMD is looking for a strategic software engineering lead who is passionate about improving the performance of key applications and benchmarks. You will be a member of a core team of incredibly talented industry specialists and will work with the very latest hardware and software technology.
THE PERSON:
The ideal candidate is passionate about software engineering and has the leadership skills to drive sophisticated issues to resolution. They communicate effectively and work well with different teams across AMD.
KEY RESPONSIBILITIES:
- Develop techniques for optimizing scale-up and scale-out inference.
- Develop methods and tooling to utilize dynamic resources in service of inference.
- Support the proliferation of the ROCm ecosystem.
PREFERRED EXPERIENCE:
- Expertise in the Kubernetes (K8s) ecosystem, especially as it pertains to large-scale inference.
- Operational experience with at least one of SGLang or vLLM, and with frameworks such as KServe or llm-d. Experience running inference as a service can be substituted in lieu of experience with frameworks such as KServe or llm-d.
- Expertise with techniques used to optimize inference, such as distributed KV cache, disaggregation, and request scheduling.
- Ability to write high-quality code with a keen attention to detail. Preferred languages are Go and Python.
- Experience with modern concurrent programming.
- Effective communicator with keen attention to detail.
- Prior experience roadmapping deeply technical areas is highly valuable.
ACADEMIC CREDENTIALS:
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
This role is not eligible for visa sponsorship.
#LI-G11
#LI-HYBRID
Benefits offered are described in AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's "Responsible AI Policy" is available here.
This posting is for an existing vacancy.
$272k - $425.5k


