Senior Software Development Engineer - SGLang and Inference Stack

Advanced Micro Devices, Inc.

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

As a core member of the team, you will play a pivotal role in optimizing and developing deep learning frameworks for AMD GPUs. Your work will be instrumental in enhancing GPU kernel performance, accelerating deep learning models, and enabling RL training and SOTA LLM and multimodal inference at scale across multi-GPU and multi-node systems. You will collaborate across internal GPU software teams and engage with open-source communities to integrate and optimize cutting-edge compiler technologies and drive upstream contributions that benefit AMD’s AI software ecosystem.

THE PERSON:

Skilled engineer with strong technical and analytical expertise in GPGPU C++, Triton, TileLang or DSL development within Linux environments. The ideal candidate will thrive in both collaborative team settings and independent work, with the ability to define goals, manage development efforts, and deliver high‑quality solutions. Strong problem‑solving skills, a proactive approach, and a keen understanding of software engineering best practices are essential.

KEY RESPONSIBILITIES:

Optimize Deep Learning Frameworks: Enhance performance of frameworks like TensorFlow, PyTorch, and SGLang on AMD GPUs via upstream contributions in open‑source repositories.
Develop and Optimize Deep Learning Models: Profile, analyze, modify, and tune large‑scale training and inference models for optimal performance on AMD hardware. Provide day‑0 support for SOTA models such as DeepSeek 3.2 and Kimi K2.5.
GPU Kernel Development: Design, implement, and optimize high‑performance GPU kernels using HIP, Triton, TileLang or other DSLs for AI operator efficiency.
Collaborate with GPU Library and Compiler Teams: Work closely with internal compiler and GPU math library teams to integrate, optimize and align kernel‑level optimizations with full‑stack performance goals. Initiate and assist with codegen optimizations at different levels.
Contribute to SGLang Development: Support optimization, feature development, and scaling of the SGLang framework across AMD GPU platforms for LLM, multimodal serving and RL training.
Distributed System Optimization: Tune and scale performance across both multi‑GPU (scale‑up) and multi‑node (scale‑out) environments, including inference parallelism, prefill‑decode disaggregation, Wide‑EP and collective communication strategies.
Graph Compiler Integration: Integrate and optimize runtime execution through graph compilers such as XLA, TorchDynamo, or custom pipelines.
Open‑Source Collaboration: Partner with external maintainers to understand framework needs, propose optimizations, and upstream contributions effectively.
Apply Engineering Best Practices: Leverage modern software engineering practices in debugging, profiling, test‑driven development, and CI/CD integration.

PREFERRED EXPERIENCE:

Strong Programming Skills: Proficient in C++ and/or Python (PyTorch, Triton, TileLang), with demonstrated ability to code, debug, profile, and optimize performance‑critical code.
SGLang and LLM Optimization: Hands‑on experience with SGLang or similar LLM inference frameworks is highly preferred.
Compiler and GPU Architecture Knowledge: Background in compiler design or familiarity with technologies like LLVM, MLIR, or ROCm is a plus.
Heterogeneous System Workloads: Experience running and scaling workloads on large‑scale, heterogeneous clusters (CPU + GPU) using distributed training or inference strategies.
AI Framework Integration: Experience contributing to or integrating optimizations into deep learning frameworks such as PyTorch, SGLang, vLLM, Slime, and VeRL.
GPGPU Computing: Working knowledge of HIP, CUDA, Triton, TileLang or other GPU programming models; experience with GCN/CDNA architecture preferred.

ACADEMIC CREDENTIALS:

Bachelor’s and/or Master’s Degree in Computer Science, Computer Engineering, Electrical Engineering, Physics or a related field.


Benefits offered are described in "AMD benefits at a glance." AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee‑based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third‑party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process. AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here. This posting is for an existing vacancy.

Vacancy posted 1 day ago
