AI Infrastructure Engineer: Scalable GPU Inference, On-Site

Spellbrush

An innovative studio is seeking an AI Infrastructure Engineer to enhance their ML infrastructure for groundbreaking anime games. This role involves designing and implementing cutting-edge inference architectures to support various platforms. As part of a small, agile team, you will work closely with top AI researchers, driving the evolution of generative AI in the gaming industry. If you are passionate about anime aesthetics and enjoy a fast-paced environment, this opportunity will allow you to contribute to a creative movement that impacts millions of users worldwide.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the AI Infrastructure Engineer: Scalable GPU Inference, On-Site in San Francisco, CA vacancy
  • An innovative AI company is seeking a Software Engineer to develop infrastructure that supports AI training and inference workflows. This role requires strong object-...  ...ideal candidate will work on scalable model serving, optimize multi-GPU infrastructure, and enhance system... 
    Suggested

    SpreeAI

    San Francisco, CA
    3 days ago
  •  ...AI Infrastructure Engineer Spellbrush, the world's leading generative AI studio...  ...and run our next-generation inference architecture for running all...  ...excellent understanding of GPU's handling large workloads...  ...person teams, and prefer on-site collaboration in either our... 
    Website
    Work experience placement
    Work at office
    Visa sponsorship

    Spellbrush

    San Francisco, CA
    10 hours ago
  • $220k

    Perplexity is looking for an engineer to join their team in San Francisco. You will work on building and operating the inference engine, supporting new models, migrating GPU kernels, and developing a Rust-based serving runtime. The ideal candidate has 3+ years of experience... 
    Suggested

    Perplexity

    San Francisco, CA
    2 days ago
  • Requirements Deep experience with GPU programming and...  ...professional software engineering experience with meaningful work on ML inference or high-performance...  ...models in our inference infrastructure, from weight loading, request...  ...incidents Perplexity AI
    Suggested

    Perplexity AI

    San Francisco, CA
    1 day ago
  •  ...Senior HPC & GPU Infrastructure Engineer Sciforium is an AI infrastructure company developing next-generation multimodal...  ...staff, hardware vendors, and on-site technicians for repairs, RMA...  ..., model serving optimizations, or inference systems. Hands-on experience... 
    Website
    Flexible hours

    Sciforium

    San Francisco, CA
    3 days ago
  • $325k

     ...About the Team Our Inference team brings OpenAI's...  ...state-of-the-art AI models, allowing them...  ...Role We're hiring engineers to scale and...  ...OpenAI's inference infrastructure across emerging GPU platforms. You'll work...  ...correctness, performance, and scalability of model execution... 

    OpenAI

    San Francisco, CA
    2 days ago
  • Qualifications CUDA + GPU inference optimization vLLM, SGLang, or TensorRT-LLM experience KV caching...  ...Company Luminal (YC S25) builds an AI compiler and serving stack that makes models...  ...ready with one line. Role Founding, on site in downtown SF. Ship low latency, high throughput... 
    Website

    SupportFinity™

    San Francisco, CA
    3 days ago
  • $230k - $265k

     ...Parafin is seeking a Software Engineer to lead the evolution of their ML Platform, ensuring robust and scalable systems for data scientists. The role requires 5+ years of...  ...platform functionalities, enhance real-time inference processes, and collaborate across teams to ensure... 
    Remote work

    Parafin Inc

    San Francisco, CA
    3 days ago
  • $350k

     ...A leading AI research organization seeks an Infrastructure Research Engineer in San Francisco to optimize and scale systems powering large AI models. This role emphasizes enhancing inference speed, reliability, and cost-effectiveness. Ideal candidates possess a Bachelor... 
    Visa sponsorship

    Thinking Machines Lab Inc.

    San Francisco, CA
    2 days ago
  • $229.9k - $262.4k

    Senior Lead AI Engineer (Gen AI Platform Services) Overview...  ...in technology infrastructure and world-class talent...  ...product experiences and scalable, high-performance AI...  ...large language model inference, similarity search,...  ...available through this site. Capital One... 
    Website
    Full time
    Part time
    Local area

    Capital One

    San Francisco, CA
    2 days ago
  • $230k - $360k

     ...Staff AI Infrastructure Engineer A new class of intelligence is emerging, systems...  ...rapidly scaling 10k+ GPU fleets, pushing utilization...  ...knowledge into repeatable, scalable reliability for the entire...  ...unnecessary Scaling Training & Inference Define how... 
    Immediate start

    Luma AI

    San Francisco, CA
    3 days ago
  • A leading AI fashion-tech company is seeking a Software Engineer Intern to focus on building infrastructure for AI systems. This role involves designing scalable models, developing APIs, and optimizing for performance and reliability. An ideal candidate will have a strong... 
    Internship
    Immediate start

    SpreeAI

    San Francisco, CA
    3 days ago
  •  ...) About Virtue AI Virtue AI sets...  ...As an AI infra Engineer, you will own the...  ...optimize the LLM inference pipeline; build necessary...  ...to align infrastructure and inference behavior...  ...understanding of GPU behavior (memory...  ...Availability, scalability, fault isolation... 
    Remote work

    Virtue AI

    San Francisco, CA
    3 days ago
  •  ...AI Infra Engineer We are looking for an AI Infra engineer...  ...primarily on AWS. As an AI Infrastructure Engineer, you will be...  ...closely with our Inference and Research teams to...  ...deploy, and maintain scalable Kubernetes clusters...  ...Experience managing GPU clusters and optimizing... 

    Perplexity AI

    San Francisco, CA
    5 days ago
  •  ...jobr.pro is seeking a Senior Software Engineer to join its AI Platform team in San Francisco. In this role, you will help design and build scalable infrastructure to transform AI product development and enhance agent performance. The ideal candidate will possess a strong... 

    Jobr

    San Francisco, CA
    3 days ago
  • The AI Infrastructure team at Zensors builds the engine that powers our visual sensing platform. We...  ...accelerate the training and inference of computer vision...  ...stream to enable massive scalability of our SaaS product. Data...  ...Deep understanding of GPU hardware performance, including... 

    Zensors

    San Francisco, CA
    1 day ago
  • Senior Infrastructure Engineer - Bland As a Senior Infrastructure...  ...to the design of scalable architecture by...  ...and real-time inference serving across...  ...industries. Lead - AI/ML Stack...  ...deployments. Work with Site Reliability Engineering...  ...workloads with GPU support,... 
    Website
    Temporary work

    AI Chopping Block, Inc.

    San Francisco, CA
    3 days ago
  • As an AI Engineer at Eloquent AI, you will be at the forefront of building...  ...front‑end experiences and scalable back‑end systems that power...  ...Strong knowledge of cloud infrastructure (AWS, GCP, or Azure) and...  ...Full time Location Type On‑site Department At Eloquent AI,... 
    Website
    Full time

    Eloquent AI, Inc.

    San Francisco, CA
    10 hours ago
  •  ...time. As the leading AI Time platform for professional...  ...development, and engineering—innovative, humble,...  ...already process millions of inferences per day, but to keep...  ...designing AI infrastructure, not just models (REQUIRED...  ...in‑person company off‑sites, in unique locations,... 
    Website
    Work at office
    Remote work
    Relocation package
    2 days per week

    Laurel Property Services

    San Francisco, CA
    4 days ago
  • $160k - $250k

     ...Senior Backend Engineer, Inference Platform San Francisco...  ...the Role Together AI is building the Inference...  ...efficiency, scalability, and stability of complex...  ...~ Familiarity with GPU software stacks (CUDA...  ...the next generation AI infrastructure. Compensation We... 
    Full time
    Local area

    Together AI

    San Francisco, CA
    10 hours ago
  • A tech company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and...  ...with core systems that power their GPU optimization platform. Candidates should... 

    Wafer

    San Francisco, CA
    1 day ago
  • $250k - $380k

     ...OpenAI’s LLM training and inference infrastructure that powers frontier...  ...execution across vast GPU/accelerator fleets. By...  ...We are looking for an engineer to design and...  ...APIs, modular code, and scalable abstractions, while recognizing...  ...OpenAI OpenAI is an AI research and... 
    Full time
    Work at office
    Local area
    Relocation package
    Flexible hours

    Slope

    San Francisco, CA
    1 day ago
  • $175k - $225k

     ...Our team is led by veteran operators and engineers, alumni of Sonos, Paypal, Tesla, Apple,...  .... The Role We're looking for an AI Inference Engineer who lives at the boundary of high...  ...CUDA kernels and perform low-level GPU tuning to maximize throughput and minimize... 
    Local area
    Remote work

    Sauron

    San Francisco, CA
    4 days ago
  • Fathom is seeking a Model Performance Engineer in San Francisco to optimize the...  ..., and reliability of its model inference stack while building fine-tuning infrastructure. The ideal candidate will have...  ...of meetings, ensuring efficient GPU utilization, and debugging production... 

    Fathom

    San Francisco, CA
    10 hours ago
  •  ...Meet Eloquent AI At Eloquent AI, we...  ...class talent in AI, engineering, and product as we...  ...Engineer, AIOps & Infrastructure at Eloquent AI,...  ...building, and optimizing scalable, high-performance...  ..., optimizing GPU workloads, and ensuring...  ...serving, and inference optimizations.... 

    Eloquent AI

    San Francisco, CA
    2 days ago
  • Hamilton Barnes Associates Limited is looking for a Senior Storage Engineer to support large-scale AI infrastructure in San Francisco. This role involves designing scalable storage solutions for high-performance GPU platforms. The ideal candidate has extensive experience in... 
    Remote job

    Hamilton Barnes Associates Limited

    San Francisco, CA
    3 days ago
  • A tech company focused on AI is seeking a Site Reliability Engineer to ensure the reliability and performance of its GPU marketplace. This role involves maintaining service level objectives, managing capacity, and implementing secure systems. The ideal candidate has strong... 
    Website

    Hyperbolic Labs

    San Francisco, CA
    3 days ago
  • $180k - $300k

     ...to help them hire. AI Engineer (Full-stack)...  ...San Francisco, CA (On-site) Company Stage of...  ...client is building the infrastructure layer that teaches AI...  ...production scale Create scalable vector search infrastructure...  ...datasets Build inference serving systems... 
    Website
    H1b
    Work at office
    Remote work
    Visa sponsorship

    Recruiting from Scratch

    San Francisco, CA
    2 days ago
  •  ...Role We are hiring Software Engineers focused on AI Infrastructure to build the systems that...  ...backend engineering - including GPU orchestration, large-scale inference systems, performance optimization...  ...You will work on: Scalable model serving and inference pipelines... 
    Internship
    Immediate start

    SpreeAI

    San Francisco, CA
    3 days ago
  • As vCluster’s AI Infrastructure Specialist, you will work directly with customers...  ...journey: from bare metal GPU nodes through to a...  ...happen. As an AI Infrastructure Engineer, your role will include: Lead...  ...: Experience with inference serving, GPU scheduling, and... 
    Flexible hours

    vCluster Labs

    San Francisco, CA
    10 hours ago
