AI Inference Infra Tech Lead - Cloud GPU & Scale

$208.8k

ByteDance

A leading tech company in San Jose is looking for a Tech Lead Software Engineer specializing in AI Inference Infrastructure. This role entails designing container-based management systems and collaborating across teams to develop state-of-the-art inference solutions. Candidates should have significant experience in ML infrastructure and orchestration technologies like Docker and Kubernetes. This position offers an attractive salary range of $208,800 - $438,000 annually, alongside comprehensive benefits.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the AI Inference Infra Tech Lead - Cloud GPU & Scale in San Jose, CA vacancy
  • $212.3k - $275.8k

     ...Join Cisco's CX AI Incubation Team as...  ...Experiences, across cloud and on-prem environments...  ...to large multi-GPU servers, including...  ...work on cutting-edge inference optimization - speculative...  ...models at scale. What You'll Do...  ...On-Prem, Edge & Infra Hands-on experience... 
    Cloud
    Full time
    Temporary work
    Local area
    Flexible hours
    3 days per week

    Cisco

    San Jose, CA
    6 days ago
  • $272k - $431.25k

     ...We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at NVIDIA to join...  ...bottlenecks and lead initiatives to systematically...  ...processing, model training, and inference pipelines.* Proficiency in...  ...well as familiarity with cloud computing platforms (e.g.,... 
    Suggested

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $212.8k - $387.6k

     ...building large-scale and highly available cloud infrastructure,...  ...infrastructure or AI infrastructure....  ...the areas below: GPU Infra (GPU cluster management...  ...frameworks, Inference engines (vLLM,...  ...industry-leading public-cloud platforms...  ...rapidly growing tech company. By constantly... 
    Cloud
    Temporary work
    Local area

    ByteDance

    San Jose, CA
    1 day ago
  •  ...Tech Lead, Data & Inference Engineer Sunnyvale, California, United States About...  ...how business brands scale demand generation and account...  ...specialized vertical in Applied AI, Machine Learning, and Data...  ...Exposure to Kubernetes and cloud infrastructure (AWS, GCP, or... 
    Cloud
    Full time

    Catalyst Labs, LLC

    Sunnyvale, CA
    4 days ago
  • $184k - $356.5k

    NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves...  ...high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming experience... 
    Cloud

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $262k - $365k

    Senior Staff Software Architect, GPU Uber Tech Leads Google...  ...information at massive scale, and extend well beyond web search...  ...that power Google’s AI and HPC infrastructure. Your...  ...critical Google services and Cloud. Your work is fundamental to... 
    Cloud
    Full time

    Google Inc.

    Sunnyvale, CA
    2 days ago
  • $181.1k - $318.4k

     ...GPU Software Architecture Engineer, Graphics...  ...engineer to lead server-side ML acceleration...  ...on Private Cloud Compute that enables...  ...Intelligence at unprecedented scale. It will involve...  ...understanding of inference workload...  ...define the future of AI experiences delivered... 
    Cloud
    Relocation

    Apple

    Cupertino, CA
    3 days ago
  •  ...builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the...  ...to deliver industry-leading training and inference speeds and empowers machine...  ...over 10 times faster than GPU-based hyperscale cloud inference services. This... 
    Cloud

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    5 days ago
  • $156k - $387.6k

     ...Volcano Engine Public Cloud. Our mission is...  ...for cloud and AI computing. Our...  ...acceleration - GPU virtualization and...  ...wave of cloud-scale computing. Responsibilities...  ...training and inference. - Drive end-to-...  ...people. We lead with curiosity,...  ...rapidly growing tech company. By... 
    Cloud
    Temporary work
    Local area

    ByteDance

    San Jose, CA
    3 days ago
  • $184k - $287.5k

     ...unlimited potential of AI to define the next era of...  ...computing. An era in which our GPU acts as the brains of...  ...on GPU Performance at Scale. At NVIDIA, this role is...  ...and develop new, leading solutions. Engage with HPC...  ...Experience with modern cloud and container-based enterprise... 
    Cloud
    Remote work

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $124k - $195.5k

     ...NVIDIA Corporation is seeking an AI Inference Performance Engineer - New College Grad 2026 in Santa Clara. This role involves optimizing AI inference benchmarks using NVIDIA’s accelerators and working with various teams on performance enhancements. Applicants should have... 

    NVIDIA

    Santa Clara, CA
    2 days ago
  •  ...Corporation is seeking a Principal Developer Operations Lead in Santa Clara, CA, to drive the global scale expansion of AI infrastructure. You will be responsible for...  ...strategic capacity planning across a growing AI/GPU infrastructure portfolio. Applying your extensive... 
    Cloud

    NVIDIA

    Santa Clara, CA
    2 days ago
  • Senior Systems Software Engineer - GPU Performance at Scale We are looking for a dedicated...  ...will drive innovation in AI and GPU computing. What You’ll Be Doing Lead the implementation of performance...  ...(CUDA). Experience with modern cloud and container‑based enterprise... 
    Cloud

    NVIDIA Corporation

    Santa Clara, CA
    5 days ago
  • $218.8k - $335.3k

     ...the team: The AV ML Infra team at GM builds...  ...unique demands of AI and ML innovation,...  ...includes: AI Validation & Inference: Ensures robust...  ...performance by running large-scale simulation...  ...inference across cloud and on‑prem compute...  ...infrastructure. You will lead technically complex... 
    Cloud
    Flexible hours

    General Motors

    Sunnyvale, CA
    3 days ago
  •  ...the team The AV ML Infra team builds end‑to‑...  ...products to support AI and ML innovation...  ...includes: AI Validation & Inference: Ensures robust...  ...by running large‑scale simulation workloads...  ...inference across cloud and on‑prem compute...  ...Project Ownership: Lead projects from inception... 
    Cloud

    Israelvcforum

    Sunnyvale, CA
    2 days ago
  • $275.8k - $340.5k

     ...team: The AV ML Infra team at GM builds ML...  ...unique demands of AI and ML innovation,...  ...AI Validation & Inference: Ensures robust model...  ...performance by running large-scale simulation...  ...and inference across cloud and on-prem compute...  .../ML Engineer will lead a growing... 
    Cloud
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    2 days ago
  • $244.8k

     ...About the Team The Inference Infrastructure...  ...plane for large-scale LLM inference....  ...computing across multi-cloud and global...  ...of cloud-native, GPU-optimized...  ...developers to bring AI workloads from research...  ...people. We lead with curiosity,...  ...rapidly growing tech company. By constantly... 
    Cloud
    Temporary work
    Local area

    ByteDance

    San Jose, CA
    1 day ago
  • $165k - $242k

     ...Senior Software Engineer II, Inference role at CoreWeave....  ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by...  ...innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and...  ...cost-per-token analytics, GPU resource isolation).... 
    Cloud
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours
    Shift work

    CoreWeave

    Sunnyvale, CA
    2 days ago
  •  ...computing experiences - from AI and data centers, to PCs...  ...member of the LLM inference framework team, you will...  ...multi-node inference at scale. Your work will directly...  ...strategic partners, and cloud providers) and will be upstreamed...  ...systems, and GPU runtime and kernel backends... 
    Cloud

    Advanced Micro Devices , Inc.

    Santa Clara, CA
    4 days ago
  • $136.8k - $259.2k

     ...Software Engineer Graduate (Inference Infrastructure) - 2026...  ...control plane for large-scale LLM inference. We are...  ...computing across multi-cloud and global datacenters....  ...external developers to bring AI workloads from research...  ..., scheduling, and GPU acceleration. Responsibilities... 
    Cloud
    Temporary work

    Pangleglobal

    San Jose, CA
    2 days ago
  • $165k - $242k

     ...Senior Software Engineer II, Inference Sunnyvale, CA /...  ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by...  ...innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and...  ...cost-per-token analytics, GPU resource isolation).... 
    Cloud
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours
    Shift work

    CoreWeave

    Sunnyvale, CA
    6 hours ago
  • $139k - $204k

     ...Senior Software Engineer I, Inference Sunnyvale, CA /...  ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by...  ...innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global...  ...-per-token analytics, GPU resource isolation).... 
    Cloud
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours
    Shift work

    CoreWeave

    Sunnyvale, CA
    6 hours ago
  • $272k - $425.5k

     ...Software Engineer – Large-Scale LLM Memory and...  ...throughput, low-latency inference framework for serving generative AI and reasoning models...  ...Dynamo orchestrates GPU shards, routes...  ...remote file/object/cloud storage to support large...  ...integrations with leading LLM serving engines... 
    Cloud
    Local area
    Remote work

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $168k - $322k

     ...NVIDIA Gruppe is seeking a Senior AI Platform Engineer to improve engineering efficiency and data security through AI-powered products. The role involves working with Cloud and AI/ML teams to build and scale infrastructure and shape the technological future of the organization... 
    Cloud

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

     ...engineers to join us and build AI inference systems that serve large-scale models with extreme...  ...inference stacks, optimize GPU kernels and compilers, drive...  ..., multi-node, and multi-cloud environments. You’ll collaborate...  ...to the industry‑leading MLPerf Inference benchmarking... 
    Cloud

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the...  ...to deliver industry-leading training and inference speeds and empowers machine...  ...over 10 times faster than GPU-based hyperscale cloud inference services. About... 
    Cloud

    Cerebras

    Sunnyvale, CA
    2 days ago
  • $230k - $250k

     ...is seeking a Sr. Member of Technical Staff in Sunnyvale, CA. This role involves designing resilient software features for cloud-based AI inference, leveraging AWS tools and services. Candidates should have a Master’s degree in Computer Science and experience with containerization... 
    Cloud

    Cerebras Systems

    Sunnyvale, CA
    5 days ago
  • $184k - $287.5k

     ...influential Generative AI Technical Engagement Lead to evangelize for,...  ...This includes NVIDIA GPU architectures, DGX systems...  ...NeMo frameworks, and inference libraries like...  ...findings from large-scale model training and inference...  ...on-premise and cloud infrastructures. Possess... 
    Cloud

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $272k - $431.25k

     ...unlimited potential of AI to define the next era...  ...computing. An era in which our GPU acts as the brains of...  ..., as a Principal Rack Scale Systems Infrastructure...  ...NVIDIA, partners, and leading cloud and enterprise clients...  ..., firmware, and infra management as one operational... 
    Cloud
    Shift work

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $296.3k

     ...Foundations team is a part of the Scaling Foundations team in Embodied AI and is responsible for...  ...pipelines on modern cloud / GPU infrastructure, with...  ...observability and cost efficiency. Lead development of...  ...Consumption/Mining/Quality and Infra Foundations to turn... 
    Cloud
    Local area
    Flexible hours

    General Motors

    Sunnyvale, CA
    1 day ago
