AI Infrastructure Engineer: Scalable GPU Inference, On-Site

Spellbrush

An innovative studio is seeking an AI Infrastructure Engineer to enhance their ML infrastructure for groundbreaking anime games. This role involves designing and implementing cutting-edge inference architectures to support various platforms. As part of a small, agile team, you will work closely with top AI researchers, driving the evolution of generative AI in the gaming industry. If you are passionate about anime aesthetics and enjoy a fast-paced environment, this opportunity will allow you to contribute to a creative movement that impacts millions of users worldwide.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the AI Infrastructure Engineer: Scalable GPU Inference, On-Site in San Francisco, CA vacancy

AI Infrastructure Engineer — Scalable Training & Inference
An innovative AI company is seeking a Software Engineer to develop infrastructure that supports AI training and inference workflows. This role requires strong object-... ...ideal candidate will work on scalable model serving, optimize multi-GPU infrastructure, and enhance system...
Suggested
SpreeAI
San Francisco, CA
3 days ago
AI Infrastructure Engineer
...AI Infrastructure Engineer Spellbrush, the world's leading generative AI studio... ...and run our next-generation inference architecture for running all... ...excellent understanding of GPU's handling large workloads... ...person teams, and prefer on-site collaboration in either our...
Website
Work experience placement
Work at office
Visa sponsorship
Spellbrush
San Francisco, CA
10 hours ago
Senior AI Inference Engineer - GPU, Rust & CUDA
$220k
Perplexity is looking for an engineer to join their team in San Francisco. You will work on building and operating the inference engine, supporting new models, migrating GPU kernels, and developing a Rust-based serving runtime. The ideal candidate has 3+ years of experience...
Suggested
Perplexity
San Francisco, CA
2 days ago
AI Inference Engineer — High-Performance GPU Systems
Requirements Deep experience with GPU programming and... ...professional software engineering experience with meaningful work on ML inference or high-performance... ...models in our inference infrastructure, from weight loading, request... ...incidents Perplexity AI
Suggested
Perplexity AI
San Francisco, CA
1 day ago
Senior HPC & GPU Infrastructure Engineer
...Senior HPC & GPU Infrastructure Engineer Sciforium is an AI infrastructure company developing next-generation multimodal... ...staff, hardware vendors, and on-site technicians for repairs, RMA... ..., model serving optimizations, or inference systems. Hands-on experience...
Website
Flexible hours
Sciforium
San Francisco, CA
3 days ago
Software Engineer, Inference - AMD GPU Enablement
$325k
...About the Team Our Inference team brings OpenAI's... ...state-of-the-art AI models, allowing them... ...Role We're hiring engineers to scale and... ...OpenAI's inference infrastructure across emerging GPU platforms. You'll work... ...correctness, performance, and scalability of model execution...
OpenAI
San Francisco, CA
2 days ago
Cloud Inference Engineer
Qualifications CUDA + GPU inference optimization vLLM, SGLang, or TensorRT-LLM experience KV caching... ...Company Luminal (YC S25) builds an AI compiler and serving stack that makes models... ...ready with one line. Role Founding, on site in downtown SF. Ship low latency, high throughput...
Website
SupportFinity™
San Francisco, CA
3 days ago
Senior ML Platform Engineer - Remote, Scalable Inference
$230k - $265k
...Parafin is seeking a Software Engineer to lead the evolution of their ML Platform, ensuring robust and scalable systems for data scientists. The role requires 5+ years of... ...platform functionalities, enhance real-time inference processes, and collaborate across teams to ensure...
Remote work
Parafin Inc
San Francisco, CA
3 days ago
AI Inference Infrastructure Engineer
$350k
...A leading AI research organization seeks an Infrastructure Research Engineer in San Francisco to optimize and scale systems powering large AI models. This role emphasizes enhancing inference speed, reliability, and cost-effectiveness. Ideal candidates possess a Bachelor...
Visa sponsorship
Thinking Machines Lab Inc.
San Francisco, CA
2 days ago
Senior Lead AI Engineer (Gen AI Platform Services)
$229.9k - $262.4k
Senior Lead AI Engineer (Gen AI Platform Services) Overview... ...in technology infrastructure and world-class talent... ...product experiences and scalable, high-performance AI... ...large language model inference, similarity search,... ...available through this site. Capital One...
Website
Full time
Part time
Local area
Capital One
San Francisco, CA
2 days ago
Staff AI Infrastructure Engineer
$230k - $360k
...Staff AI Infrastructure Engineer A new class of intelligence is emerging, systems... ...rapidly scaling 10k+ GPU fleets, pushing utilization... ...knowledge into repeatable, scalable reliability for the entire... ...unnecessary Scaling Training & Inference Define how...
Immediate start
Luma AI
San Francisco, CA
3 days ago
AI Infrastructure Engineer Intern — Training & Inference
A leading AI fashion-tech company is seeking a Software Engineer Intern to focus on building infrastructure for AI systems. This role involves designing scalable models, developing APIs, and optimizing for performance and reliability. An ideal candidate will have a strong...
Internship
Immediate start
SpreeAI
San Francisco, CA
3 days ago
AI Infrastructure Engineering (Cloud, DevOps)
...) About Virtue AI Virtue AI sets... ...As an AI infra Engineer, you will own the... ...optimize the LLM inference pipeline; build necessary... ...to align infrastructure and inference behavior... ...understanding of GPU behavior (memory... ...Availability, scalability, fault isolation...
Remote work
Virtue AI
San Francisco, CA
3 days ago
Member of Technical Staff (AI Infrastructure Engineer)
...AI Infra Engineer We are looking for an AI Infra engineer... ...primarily on AWS. As an AI Infrastructure Engineer, you will be... ...closely with our Inference and Research teams to... ...deploy, and maintain scalable Kubernetes clusters... ...Experience managing GPU clusters and optimizing...
Perplexity AI
San Francisco, CA
5 days ago
Senior AI Platform Engineer - Scalable AI Infra
...jobr.pro is seeking a Senior Software Engineer to join its AI Platform team in San Francisco. In this role, you will help design and build scalable infrastructure to transform AI product development and enhance agent performance. The ideal candidate will possess a strong...
Jobr
San Francisco, CA
3 days ago
AI/ML Infrastructure Engineer
The AI Infrastructure team at Zensors builds the engine that powers our visual sensing platform. We... ...accelerate the training and inference of computer vision... ...stream to enable massive scalability of our SaaS product. Data... ...Deep understanding of GPU hardware performance, including...
Zensors
San Francisco, CA
1 day ago
AI DevOps Engineer Jobs
Senior Infrastructure Engineer - Bland As a Senior Infrastructure... ...to the design of scalable architecture by... ...and real-time inference serving across... ...industries. Lead - AI/ML Stack... ...deployments. Work with Site Reliability Engineering... ...workloads with GPU support,...
Website
Temporary work
AI Chopping Block, Inc.
San Francisco, CA
3 days ago
AI Engineer, Platform
As an AI Engineer at Eloquent AI, you will be at the forefront of building... ...front‑end experiences and scalable back‑end systems that power... ...Strong knowledge of cloud infrastructure (AWS, GCP, or Azure) and... ...Full time Location Type On‑site Department At Eloquent AI,...
Website
Full time
Eloquent AI, Inc.
San Francisco, CA
10 hours ago
Staff AI Platform Engineer
...time. As the leading AI Time platform for professional... ...development, and engineering—innovative, humble,... ...already process millions of inferences per day, but to keep... ...designing AI infrastructure, not just models (REQUIRED... ...in‑person company off‑sites, in unique locations,...
Website
Work at office
Remote work
Relocation package
2 days per week
Laurel Property Services
San Francisco, CA
4 days ago
Senior Backend Engineer, Inference Platform
$160k - $250k
...Senior Backend Engineer, Inference Platform San Francisco... ...the Role Together AI is building the Inference... ...efficiency, scalability, and stability of complex... ...~ Familiarity with GPU software stacks (CUDA... ...the next generation AI infrastructure. Compensation We...
Full time
Local area
Together AI
San Francisco, CA
10 hours ago
Technical Staff Lead, AI Inference & GPU Infra
A tech company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and... ...with core systems that power their GPU optimization platform. Candidates should...
Wafer
San Francisco, CA
1 day ago
Data Infrastructure Engineer — GPU-Scale Datasets & APIs
$250k - $380k
...OpenAI’s LLM training and inference infrastructure that powers frontier... ...execution across vast GPU/accelerator fleets. By... ...We are looking for an engineer to design and... ...APIs, modular code, and scalable abstractions, while recognizing... ...OpenAI OpenAI is an AI research and...
Full time
Work at office
Local area
Relocation package
Flexible hours
Slope
San Francisco, CA
1 day ago
AI Inference Engineer
$175k - $225k
...Our team is led by veteran operators and engineers, alumni of Sonos, PayPal, Tesla, Apple,... .... The Role We're looking for an AI Inference Engineer who lives at the boundary of high... ...CUDA kernels and perform low-level GPU tuning to maximize throughput and minimize...
Local area
Remote work
Sauron
San Francisco, CA
4 days ago
AI Inference Performance Engineer
Fathom is seeking a Model Performance Engineer in San Francisco to optimize the... ..., and reliability of its model inference stack while building fine-tuning infrastructure. The ideal candidate will have... ...of meetings, ensuring efficient GPU utilization, and debugging production...
Fathom
San Francisco, CA
10 hours ago
AI Engineer, AIOps & Infrastructure
...Meet Eloquent AI At Eloquent AI, we... ...class talent in AI, engineering, and product as we... ...Engineer, AIOps & Infrastructure at Eloquent AI,... ...building, and optimizing scalable, high-performance... ..., optimizing GPU workloads, and ensuring... ...serving, and inference optimizations....
Eloquent AI
San Francisco, CA
2 days ago
Senior AI Storage Engineer - Remote GPU HPC Infra
Hamilton Barnes Associates Limited is looking for a Senior Storage Engineer to support large-scale AI infrastructure in San Francisco. This role involves designing scalable storage solutions for high-performance GPU platforms. The ideal candidate has extensive experience in...
Remote job
Hamilton Barnes Associates Limited
San Francisco, CA
3 days ago
Senior Site Reliability Engineer - AI Cloud & GPU Infra
A tech company focused on AI is seeking a Site Reliability Engineer to ensure the reliability and performance of its GPU marketplace. This role involves maintaining service level objectives, managing capacity, and implementing secure systems. The ideal candidate has strong...
Website
Hyperbolic Labs
San Francisco, CA
3 days ago
AI Engineer (Full-stack)
$180k - $300k
...to help them hire. AI Engineer (Full-stack)... ...San Francisco, CA (On-site) Company Stage of... ...client is building the infrastructure layer that teaches AI... ...production scale Create scalable vector search infrastructure... ...datasets Build inference serving systems...
Website
H1b
Work at office
Remote work
Visa sponsorship
Recruiting from Scratch
San Francisco, CA
2 days ago
Software Engineer Intern (AI Infrastructure / Training / Inference)
...Role We are hiring Software Engineers focused on AI Infrastructure to build the systems that... ...backend engineering - including GPU orchestration, large-scale inference systems, performance optimization... ...You will work on: Scalable model serving and inference pipelines...
Internship
Immediate start
SpreeAI
San Francisco, CA
3 days ago
AI Infrastructure Engineer
As vCluster’s AI Infrastructure Specialist, you will work directly with customers... ...journey: from bare metal GPU nodes through to a... ...happen. As an AI Infrastructure Engineer, your role will include: Lead... ...: Experience with inference serving, GPU scheduling, and...
Flexible hours
vCluster Labs
San Francisco, CA
10 hours ago
