AI Inference Infra Tech Lead - Cloud GPU & Scale

$208.8k

ByteDance

A leading tech company in San Jose is looking for a Tech Lead Software Engineer specializing in AI Inference Infrastructure. This role entails designing container-based management systems and collaborating across teams to develop state-of-the-art inference solutions. Candidates should have significant experience in ML infrastructure and orchestration technologies like Docker and Kubernetes. This position offers an attractive salary range of $208,800 - $438,000 annually, alongside comprehensive benefits.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the AI Inference Infra Tech Lead - Cloud GPU & Scale in San Jose, CA vacancy

AI/ML Technical Leader - Language Model Inference & AI Ops
$212.3k - $275.8k
...Join Cisco's CX AI Incubation Team as... ...Experiences, across cloud and on-prem environments... ...to large multi-GPU servers, including... ...work on cutting-edge inference optimization - speculative... ...models at scale. What You'll Do... ...On-Prem, Edge & Infra Hands-on experience...
Cloud
Full time
Temporary work
Local area
Flexible hours
3 days per week
Cisco
San Jose, CA
6 days ago
Principal AI and ML Infra Software Engineer, GPU Clusters
$272k - $431.25k
...We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at NVIDIA to join... ...bottlenecks and lead initiatives to systematically... ...processing, model training, and inference pipelines.* Proficiency in... ...well as familiarity with cloud computing platforms (e.g.,...
Suggested
NVIDIA
Santa Clara, CA
2 days ago
Senior Software Engineer, AI Infra Compute
$212.8k - $387.6k
...building large-scale and highly available cloud infrastructure,... ...infrastructure or AI infrastructure.... ...the areas below: GPU Infra (GPU cluster management... ...frameworks, Inference engines (vLLM,... ...industry-leading public-cloud platforms... ...rapidly growing tech company. By constantly...
Cloud
Temporary work
Local area
ByteDance
San Jose, CA
1 day ago
Tech Lead, Data & Inference Engineer
...Tech Lead, Data & Inference Engineer Sunnyvale, California, United States About... ...how business brands scale demand generation and account... ...specialized vertical in Applied AI, Machine Learning, and Data... ...Exposure to Kubernetes and cloud infrastructure (AWS, GCP, or...
Cloud
Full time
Catalyst Labs, LLC
Sunnyvale, CA
4 days ago
Senior AI Inference Systems Engineer: GPU-Optimized, Cloud
$184k - $356.5k
NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves... ...high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming experience...
Cloud
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior Staff Software Architect, GPU Uber Tech Leads
$262k - $365k
Senior Staff Software Architect, GPU Uber Tech Leads Google... ...information at massive scale, and extend well beyond web search... ...that power Google’s AI and HPC infrastructure. Your... ...critical Google services and Cloud. Your work is fundamental to...
Cloud
Full time
Google Inc.
Sunnyvale, CA
2 days ago
GPU Software Architecture Engineer, Graphics, Games, & ML
$181.1k - $318.4k
...GPU Software Architecture Engineer, Graphics... ...engineer to lead server-side ML acceleration... ...on Private Cloud Compute that enables... ...Intelligence at unprecedented scale. It will involve... ...understanding of inference workload... ...define the future of AI experiences delivered...
Cloud
Relocation
Apple
Cupertino, CA
3 days ago
Staff Software Engineer, Inference Cloud
...builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the... ...to deliver industry-leading training and inference speeds and empowers machine... ...over 10 times faster than GPU-based hyperscale cloud inference services. This...
Cloud
CEREBRAS SYSTEMS INC.
Sunnyvale, CA
5 days ago
Cloud Acceleration Engineer - DPU & AI Infra
$156k - $387.6k
...Volcano Engine Public Cloud. Our mission is... ...for cloud and AI computing. Our... ...acceleration - GPU virtualization and... ...wave of cloud-scale computing. Responsibilities... ...training and inference. - Drive end-to-... ...people. We lead with curiosity,... ...rapidly growing tech company. By...
Cloud
Temporary work
Local area
ByteDance
San Jose, CA
3 days ago
Senior Systems Software Engineer - GPU Performance at Scale
$184k - $287.5k
...unlimited potential of AI to define the next era of... ...computing. An era in which our GPU acts as the brains of... ...on GPU Performance at Scale. At NVIDIA, this role is... ...and develop new, leading solutions. Engage with HPC... ...Experience with modern cloud and container-based enterprise...
Cloud
Remote work
NVIDIA
Santa Clara, CA
3 days ago
AI Inference Performance Engineer Scale LLMs & GPU Clusters
$124k - $195.5k
...NVIDIA Corporation is seeking an AI Inference Performance Engineer - New College Grad 2026 in Santa Clara. This role involves optimizing AI inference benchmarks using NVIDIA’s accelerators and working with various teams on performance enhancements. Applicants should have...
NVIDIA
Santa Clara, CA
2 days ago
Cloud Data Center Ops Lead for AI/GPU Infra (Equity)
...Corporation is seeking a Principal Developer Operations Lead in Santa Clara, CA, to drive the global scale expansion of AI infrastructure. You will be responsible for... ...strategic capacity planning across a growing AI/GPU infrastructure portfolio. Applying your extensive...
Cloud
NVIDIA
Santa Clara, CA
2 days ago
Senior Systems Software Engineer - GPU Performance at Scale
Senior Systems Software Engineer - GPU Performance at Scale We are looking for a dedicated... ...will drive innovation in AI and GPU computing. What You’ll Be Doing Lead the implementation of performance... ...(CUDA). Experience with modern cloud and container‑based enterprise...
Cloud
NVIDIA Corporation
Santa Clara, CA
5 days ago
Staff AI/ML Fullstack Engineer - AV ML Infra
$218.8k - $335.3k
...the team: The AV ML Infra team at GM builds... ...unique demands of AI and ML innovation,... ...includes: AI Validation & Inference: Ensures robust... ...performance by running large-scale simulation... ...inference across cloud and on‑prem compute... ...infrastructure. You will lead technically complex...
Cloud
Flexible hours
General Motors
Sunnyvale, CA
3 days ago
Staff AI/ML Fullstack Engineer - AV ML Infra
...the team The AV ML Infra team builds end‑to‑... ...products to support AI and ML innovation... ...includes: AI Validation & Inference: Ensures robust... ...by running large‑scale simulation workloads... ...inference across cloud and on‑prem compute... ...Project Ownership: Lead projects from inception...
Cloud
Israelvcforum
Sunnyvale, CA
2 days ago
Principal AI/ML Engineer, AV ML Infra
$275.8k - $340.5k
...team: The AV ML Infra team at GM builds ML... ...unique demands of AI and ML innovation,... ...AI Validation & Inference: Ensures robust model... ...performance by running large-scale simulation... ...and inference across cloud and on-prem compute... .../ML Engineer will lead a growing...
Cloud
Local area
Remote work
Work from home
Relocation
Relocation package
Flexible hours
General Motors
Sunnyvale, CA
2 days ago
Tech Lead Software Engineer - AI Compute Infrastructure
$244.8k
...About the Team The Inference Infrastructure... ...plane for large-scale LLM inference.... ...computing across multi-cloud and global... ...of cloud-native, GPU-optimized... ...developers to bring AI workloads from research... ...people. We lead with curiosity,... ...rapidly growing tech company. By constantly...
Cloud
Temporary work
Local area
ByteDance
San Jose, CA
1 day ago
Senior Software Engineer II, Inference
$165k - $242k
...Senior Software Engineer II, Inference role at CoreWeave.... ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by... ...innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and... ...cost-per-token analytics, GPU resource isolation)....
Cloud
Permanent employment
Temporary work
Casual work
Work at office
Remote work
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
2 days ago
Senior Software Development Engineer - LLM Inference Framework
...computing experiences-from AI and data centers, to PCs... ...member of the LLM inference framework team, you will... ...multi-node inference at scale. Your work will directly... ...strategic partners, and cloud providers) and will be upstreamed... ...systems, and GPU runtime and kernel backends...
Cloud
Advanced Micro Devices, Inc.
Santa Clara, CA
4 days ago
Software Engineer Graduate (Inference Infrastructure) - 2026 Start (PHD)
$136.8k - $259.2k
...Software Engineer Graduate (Inference Infrastructure) - 2026... ...control plane for large-scale LLM inference. We are... ...computing across multi-cloud and global datacenters.... ...external developers to bring AI workloads from research... ..., scheduling, and GPU acceleration. Responsibilities...
Cloud
Temporary work
Pangleglobal
San Jose, CA
2 days ago
Senior Software Engineer II, Inference
$165k - $242k
...Senior Software Engineer II, Inference Sunnyvale, CA /... ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by... ...innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and... ...cost-per-token analytics, GPU resource isolation)....
Cloud
Permanent employment
Temporary work
Casual work
Work at office
Remote work
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
6 hours ago
Senior Software Engineer I, Inference
$139k - $204k
...Senior Software Engineer I, Inference Sunnyvale, CA /... ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by... ...innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global... ...-per-token analytics, GPU resource isolation)....
Cloud
Permanent employment
Temporary work
Casual work
Work at office
Remote work
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
6 hours ago
Principal Software Engineer - Large-Scale LLM Memory and Storage Systems
$272k - $425.5k
...Software Engineer – Large-Scale LLM Memory and... ...throughput, low-latency inference framework for serving generative AI and reasoning models... ...Dynamo orchestrates GPU shards, routes... ...remote file/object/cloud storage to support large... ...integrations with leading LLM serving engines...
Cloud
Local area
Remote work
NVIDIA
Santa Clara, CA
2 days ago
Senior AI Platform Engineer - Scale LLM Infra
$168k - $322k
...NVIDIA Gruppe is seeking a Senior AI Platform Engineer to improve engineering efficiency and data security through AI-powered products. The role involves working with Cloud and AI/ML teams to build and scale infrastructure and shape the technological future of the organization...
Cloud
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior Software Engineer, AI Inference Systems
$184k - $287.5k
...engineers to join us and build AI inference systems that serve large-scale models with extreme... ...inference stacks, optimize GPU kernels and compilers, drive... ..., multi-node, and multi-cloud environments. You’ll collaborate... ...to the industry‑leading MLPerf Inference benchmarking...
Cloud
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Staff Software Engineer, Inference Cloud
...builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the... ...to deliver industry-leading training and inference speeds and empowers machine... ...over 10 times faster than GPU-based hyperscale cloud inference services. About...
Cloud
Cerebras
Sunnyvale, CA
2 days ago
Senior Staff Engineer — AI Inference & Cloud Infra
$230k - $250k
...is seeking a Sr. Member of Technical Staff in Sunnyvale, CA. This role involves designing resilient software features for cloud-based AI inference, leveraging AWS tools and services. Candidates should have a Master’s degree in Computer Science and experience with containerization...
Cloud
Cerebras Systems
Sunnyvale, CA
5 days ago
Tech Engagement Lead - Model Builder
$184k - $287.5k
...influential Generative AI Technical Engagement Lead to evangelize for,... ...This includes NVIDIA GPU architectures, DGX systems... ...NeMo frameworks, and inference libraries like... ...findings from large-scale model training and inference... ...on-premise and cloud infrastructures. Possess...
Cloud
NVIDIA
Santa Clara, CA
5 days ago
Principal Software Engineer - Rack Scale Systems Infrastructure
$272k - $431.25k
...unlimited potential of AI to define the next era... ...computing. An era in which our GPU acts as the brains of... ..., as a Principal Rack Scale Systems Infrastructure... ...NVIDIA, partners, and leading cloud and enterprise clients... ..., firmware, and infra management as one operational...
Cloud
Shift work
NVIDIA
Santa Clara, CA
2 days ago
Principal Tech Lead Manager - Embodied AI Evaluation Foundations
$296.3k
...Foundations team is a part of the Scaling Foundations team in Embodied AI and is responsible for... ...pipelines on modern cloud / GPU infrastructure, with... ...observability and cost efficiency. Lead development of... ...Consumption/Mining/Quality and Infra Foundations to turn...
Cloud
Local area
Flexible hours
General Motors
Sunnyvale, CA
1 day ago
