AI Inference Infrastructure Engineer

$350k

Thinking Machines Lab Inc.

A leading AI research organization seeks an Infrastructure Research Engineer in San Francisco to optimize and scale systems powering large AI models. This role emphasizes enhancing inference speed, reliability, and cost-effectiveness. Ideal candidates possess a Bachelor's in CS/Engineering, experience with deep learning frameworks, and collaborative skills in diverse teams. Competitive compensation between $350,000 and $475,000 USD is offered along with generous benefits including unlimited PTO and visa sponsorship.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the AI Inference Infrastructure Engineer in San Francisco, CA vacancy

GPU Networking Engineer - RDMA & Distributed Inference
...A cutting-edge AI infrastructure company in San Francisco seeks an experienced network engineer to optimize high-performance networking protocols for AI models. The ideal... ...will integrate RDMA and InfiniBand into the inference stack, ensuring efficient communication and...
Suggested
Baseten
San Francisco, CA
2 days ago
Staff ML Infrastructure Engineer: Scale Training & Inference
$300k - $430k
...is the leading conversational AI platform empowering every... ...team. About the Team The ML Infrastructure team builds the systems that... ...the routing layer that manages inference across multiple providers. We... ...hiring a Staff ML Infrastructure Engineer to own the platforms powering...
Suggested
Work at office
Decagon
San Francisco, CA
2 days ago
Machine Learning Infrastructure Engineer - Model Inference
$179k - $248k
...Machine Learning Infrastructure Engineer Join to apply for the Machine Learning Infrastructure... ...deeper understanding in healthcare. Our AI‑powered platform was purpose‑built for... ...scalable Kubernetes clusters for AI model inference and training Develop, optimize, and...
Suggested
Hourly pay
Full time
Flexible hours
Abridge
San Francisco, CA
2 days ago
ML Inference Infrastructure Engineer
A dynamic AI company is seeking an Infrastructure Software Engineer in San Francisco to build and maintain components of an ML inference platform. The successful candidate will develop infrastructure components using Python and Go, manage Kubernetes deployments, and enhance...
Suggested
Baseten
San Francisco, CA
3 days ago
ML Infrastructure Engineer - Model Inference & Scale
...A healthcare technology firm in San Francisco is seeking an ML Infrastructure Engineer, Model Inference to build and optimize AI-driven solutions. You will design scalable Kubernetes clusters, enhance ML model serving infrastructure, and collaborate with cross-functional...
Suggested
Abridge
San Francisco, CA
3 days ago
Cloud-Scale AI Inference Architect
...Francisco to enable enterprises to implement AI workloads effectively. The role involves... ...deployment architectures, solving AI inference challenges, and collaborating closely... ...candidates will have 3+ years in cloud infrastructure or DevOps, strong skills in Kubernetes,...
Flexible hours
FriendliAI
San Francisco, CA
2 days ago
AI Infrastructure Engineer Scalable Training & Inference
...An innovative AI company is seeking a Software Engineer to develop infrastructure that supports AI training and inference workflows. This role requires strong object-oriented programming skills and a solid foundation in data structures and algorithms. The ideal candidate...
SpreeAI
San Francisco, CA
2 days ago
AI Infrastructure Engineer: Scalable GPU Inference, On-Site
...An innovative studio is seeking an AI Infrastructure Engineer to enhance their ML infrastructure for groundbreaking anime games. This role involves designing and implementing cutting-edge inference architectures to support various platforms. As part of a small, agile...
Worldwide
Spellbrush
San Francisco, CA
2 days ago
Cloud Inference Engineer
...Qualifications CUDA + GPU inference optimization vLLM, SGLang, or TensorRT-LLM experience KV caching, paged attention, batching, token streaming... ...plus) No degree required Company Luminal (YC S25) builds an AI compiler and serving stack that makes models 10x faster and...
SupportFinity
San Francisco, CA
2 days ago
Staff + Sr. Software Engineer, Cloud Inference
$320k
...Staff + Sr. Software Engineer, Cloud Inference San Francisco, CA About Anthropic Anthropic... ...reliable, interpretable, and steerable AI systems. We want AI to be safe and... ...build, and own backend services and infrastructure that serve Claude across multiple CSPs...
Work at office
Visa sponsorship
Flexible hours
Anthropic
San Francisco, CA
4 days ago
Senior Backend Engineer, Inference Platform
$160k - $250k
...Senior Backend Engineer, Inference Platform San Francisco About the Role Together AI is building the Inference Platform that brings the most advanced generative... ...journey in building the next generation AI infrastructure. Compensation We offer competitive...
Full time
Local area
Together AI
San Francisco, CA
7 hours ago
Senior Backend Engineer, Inference Platform Low Latency
$160k - $250k
...A pioneering AI company in San Francisco is seeking a Senior Backend Engineer for their Inference Platform. The role involves optimizing latency, developing auto-scaling systems, and collaborating with ML researchers to scale architectures. Ideal candidates will have...
Together
San Francisco, CA
2 days ago
Staff Engineer, Scalable AI Inference Infrastructure
$200k - $400k
A leading AI technology company located in San Francisco is seeking an infrastructure engineer to build distributed systems for their AI inference engine. The role involves designing systems that ensure minimal latency and maximum reliability. Candidates should have a...
Visa sponsorship
Inferact
San Francisco, CA
4 days ago
Inference Performance TPM: AI Infrastructure Lead
A leading AI research organization in San Francisco is seeking a Technical Program Manager for Inference to bridge their systems with the broader organization. This role involves driving strategic initiatives across inference performance, coordinating launches, and ensuring...
Anthropic
San Francisco, CA
2 days ago
Software Engineer Intern (AI Infrastructure / Training / Inference)
...Software Engineer Intern (AI Infrastructure / Training / Inference) About the Role We are hiring Software Engineers focused on AI Infrastructure to build the systems that enable frontier multimodal AI to operate reliably at production scale. This role exists because modern...
Internship
Immediate start
SpreeAI
San Francisco, CA
2 days ago
Distributed Systems Engineer, Data & Inference Platform
...About Us Most AI is frozen in place - it doesn't adapt... ...into useful intelligence - the inference services that serve LLMs at scale... ...both. Researchers and ML engineers will hand you workloads that... ...Experience operating Kubernetes-based infrastructure, including custom operators...
Flexible hours
Adaption
San Francisco, CA
13 days ago
Staff Software Engineer, Inference Infrastructure
...Location Type Hybrid Department Inference Model Serving Who are we? Our... ...enterprises who are building AI systems to power magical... ...Cohere is a team of researchers, engineers, designers, and more, who are... ...running production infrastructure at a large scale Experience designing...
Full time
Work experience placement
Work at office
Remote work
Flexible hours
Jaide Health
San Francisco, CA
3 days ago
Founding Cloud Inference Engineer (Low-Latency AI Serving)
...A pioneering AI technology firm in San Francisco is seeking a founding member to optimize and serve models on Luminal Cloud. The role involves deploying models with advanced optimization techniques, conducting performance reviews, and enhancing scheduling processes. Ideal...
SupportFinity
San Francisco, CA
3 days ago
Infrastructure Engineer
$165k - $200k
...even played with ChatGPT or AI products early on), and prefer... ...You'll Do As a member of our infrastructure team, you'll be at the heart... ...—acting as an infrastructure engineer one moment, and a developer,... ...availability machine learning inference service. Collaborating with customer...
Second job
Remote work
Work from home
Relocation package
Flexible hours
Roboflow
San Francisco, CA
2 days ago
Senior Infrastructure Engineer
$120k - $200k
...Senior Infrastructure Engineer At Bland.com, our goal is to empower enterprises to make AI-phone agents at scale. Based out of San Francisco, we're a quickly growing team... ...handle real-time voice processing, scale ML inference, and integrate with enterprise telephony...
Work at office
Night shift
Bland AI
San Francisco, CA
7 hours ago
Senior HPC & GPU Infrastructure Engineer
...Senior HPC & GPU Infrastructure Engineer Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary... ...Exposure to vLLM, model serving optimizations, or inference systems. Hands-on experience with configuration...
Flexible hours
Sciforium
San Francisco, CA
3 days ago
Infrastructure Engineer
...Tamarind Bio We enable any scientist to access AI-powered drug discovery. Thousands of scientists... ...released daily. About the Role We're looking for two Infrastructure Engineers to lead the scaling of our machine learning inference system. You'll be responsible for architecting...
Relocation
Tamarind Bio
San Francisco, CA
4 days ago
Infrastructure Engineer for AI Drug Design Platform
...Discovery Chai Discovery builds frontier AI models to design molecules and... ...About the role We are hiring an engineer obsessed with building systems and infrastructure that are as simple as possible... ...our product surface, model inference, and evaluation suite. You’ll work...
Full time
Work at office
Flexible hours
Menlo Ventures
San Francisco, CA
3 days ago
Infrastructure Engineer
...Infrastructure Engineer ENGINEERING | San Francisco, New York City | On-site | Full-time The Role We... ...the backbone of our enterprise-grade AI data platform. You’ll design systems that... ...to optimize infrastructure for LLM inference and training workloads and building agent...
Full time
Work at office
F2 AI
San Francisco, CA
2 days ago
Founding Infrastructure Engineer
$200k - $260k
...Rebuild Matterhaul's infrastructure and core systems from zero — AWS,... ...pipeline choices that the rest of engineering will build on for years.... ...Matterhaul is building the AI-native operating system for... ...: vector stores, GPU‑backed inference, embedding pipelines, prompt...
Full time
Work at office
Local area
Matterhaul Inc.
San Francisco, CA
2 days ago
Senior GPU Infrastructure Engineer
...Hyperbolic Labs is on a mission to democratize AI by breaking down the barriers to... ...an innovative GPU marketplace and AI inference service that promise affordability and... ...the Role We're seeking a Senior Infrastructure Engineer to help build and scale Hyperbolic's...
Remote work
Hyperbolic Labs
San Francisco, CA
1 day ago
Infrastructure Engineer
$130k - $240k
...Maxana is seeking an experienced Infrastructure Engineer for a confidential client — a fast-growing AI company. In this role you will build and maintain the platform layer supporting large-scale ML training, inference, and deployment. This is a high-impact role at the...
Flexible hours
Maxana
San Francisco, CA
3 days ago
Senior Infrastructure Engineer - AI
$150k - $200k
...Senior Infrastructure Engineer Location: On-site, San Francisco, CA (3 days/week in office) Salary: $150k – $200k + equity Industry: AI, Cloud Infrastructure What You’ll Drive Join a fast... ...impact platform reliability, ML inference performance, and the future of enterprise...
Work at office
3 days per week
Open Select
San Francisco, CA
3 days ago
Infrastructure Engineer (Storage)
$180k - $200k
...Infrastructure Engineer (Storage) Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing... ...need for experimentation, training, and production inference, with security, observability, and control built in....
Remote work
Work from home
Flexible hours
Lightning AI
San Francisco, CA
4 days ago
Infrastructure Engineer (Observability)
...Infrastructure Engineer (Observability) Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing... ...need for experimentation, training, and production inference, with security, observability, and control built in...
Work from home
Flexible hours
Lightning AI
San Francisco, CA
3 days ago
