Edge Transformer Inference Tech Lead

OpenAI

A leading AI research firm in San Francisco is seeking a Technical Lead to join its Future of Computing Research team. The role is hybrid and involves evaluating silicon platforms and optimizing model architectures. Ideal candidates have expertise in evaluating workloads on accelerators, an understanding of transformer models, and experience leading teams focused on performance-critical software. The position offers relocation assistance and centers on deploying cutting-edge AI technology responsibly and effectively.

Vacancy posted 4 days ago

Similar jobs that could be interesting for you
Based on the Edge Transformer Inference Tech Lead in San Francisco, CA vacancy

Edge Inference Developer Tooling Founder
$250k
...Edge AI is a production requirement across automotive, robotics... ...models are doing in the field. Inference latency, memory pressure,... ...the resources needed to build transformational companies. We've launched 17... ...Ability to attract, hire and lead world‑class teams. Demonstrated...
Suggested
Forum Ventures
San Francisco, CA
4 days ago
Research Intern, Inference (Fall 2026)
$58 - $63 per hour
...Research Intern, Inference (Fall 2026) San Francisco About The... ...critical intersection of cutting-edge model architectures, high-... ...Python Familiarity with Transformer architectures and recent developments... ...systems. Publications at leading conferences in machine...
Transformer
Hourly pay
Internship
Together AI
San Francisco, CA
18 hours ago
ML Inference Engineer San Francisco · Engineering · Full Time →
We're looking for an ML Inference Engineer with deep expertise in high-performance ML engineering... ..., and shaping Reactor's competitive edge in ultra-low-latency, high-throughput... ...hardware (NVIDIA) Strong understanding of transformer architectures and modern ML optimization...
Transformer
Full time
Visa sponsorship
Relocation package
Reactor
San Francisco, CA
18 hours ago
Member of Technical Staff (AI Inference Engineer)
$220k
We build and run the inference engine behind every Perplexity query and deploy dozens of model architectures at scale with tight latency... ...Of Real Work The Team Does New models support. Support transformer-based retrieval, text-generation, and multimodal models in our...
Transformer
Perplexity
San Francisco, CA
18 hours ago
Founding Engineer, ML Inference
...'re looking for a Founding Engineer, ML Inference with deep expertise in high-performance... ...performance, and shaping the competitive edge in ultra-low-latency, high-throughput environments... ...as needed Strong understanding of transformer architectures and modern ML model...
Transformer
Relocation
Visa sponsorship
Relocation package
Reactor
San Francisco, CA
1 day ago
Inference Engineer
...team to build and ship cutting edge models and experiences. We're funded by leading investors at Index Ventures and... ...About the Role We're hiring an Inference Engineer to advance our mission... ...cutting edge foundation models using Transformers, SSMs and hybrid models. Work...
Transformer
Work at office
Visa sponsorship
Flexible hours
Cartesia, Inc.
San Francisco, CA
3 days ago
Founding Software Engineer, Robot Learning
...fast data flywheel across multiple data modalities Deep debug failure modes in transformer and diffusion policy field deployments Optimize policies for real-time (~10 Hz) inference on edge hardware What you bring Experience deploying robot policies on hardware. No preference...
Transformer
Temporary work
Kovari
San Francisco, CA
4 days ago
Member of Technical Staff - Edge Inference Engineer
...exceptional people to help us get there. The Opportunity Our Edge Inference team compiles Liquid Foundation Models into optimized machine... ...device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures...
Liquid AI
San Francisco, CA
2 days ago
Onboard AV Software Engineer
...and low-level systems engineering. Beyond inference, you'll profile and optimize the entire... ...with ML model architectures (transformers, CNNs) and the ability to reason about computational... ...autonomous vehicles, robotics, or IoT/edge devices Deep knowledge of CUDA, TensorRT...
Transformer
Local area
Humble Robotics
San Francisco, CA
18 hours ago
Software Engineer - GPU Kernels
...Engineer Baseten powers mission-critical inference for the world's most dynamic AI... ...at the frontier of AI to bring cutting-edge models into production. We're growing quickly... ...Nice to Have: Experience with Transformer models and attention optimization (e.g.,...
Transformer
Flexible hours
BaseTen
San Francisco, CA
1 day ago
Software Engineer- BIS (Baseten Inference Stack)
...BASETEN Baseten powers mission-critical inference for the world's most dynamic AI companies... ...operating at the frontier of AI to bring cutting-edge models into production. We're growing... ...cutting‑edge LLM models with industry-leading performance, scalability, reliability, and...
Flexible hours
The Consensus
San Francisco, CA
18 hours ago
LLM Inference Frameworks and Optimization Engineer
...state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize... ...inference. Optimization Techniques: Deep understanding of Transformer architectures and LLM/VLM/Diffusion model optimization. Knowledge...
Transformer
Gravity Engineering Services Pvt Ltd.
San Francisco, CA
2 days ago
Member of Technical Staff, Inference
$200k - $400k
About The Role We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving... ...science, engineering, or similar. Deep understanding of transformer architectures and their variants. Strong programming skills in...
Transformer
Remote work
Visa sponsorship
Shift work
Inferact
San Francisco, CA
3 days ago
Member of Technical Staff, Inference
Member of Technical Staff — ML Systems & Inference Employment Type: Full-time Workplace: On... ...or low-level optimization. We work with leading AI labs, hyperscalers, and AI-native... ...architectures, attention mechanisms, and transformer inference behavior Experience with batching...
Transformer
Full time
Acceler8 Talent
San Francisco, CA
2 days ago
Staff+ Software Engineer, Inference Runtime
$405k
About the role Anthropic's Inference organization serves Claude to millions... ...Engineer to be a technical lead for Inference Runtime: the... ...their own specialization, and edge cases stitch back into the core... ...scheduling environments Prior tech lead experience on a developer...
Work at office
Visa sponsorship
Flexible hours
jobr.pro
San Francisco, CA
4 days ago
Software Engineer Intern (AI Infrastructure / Training / Inference)
...Engineer Intern (AI Infrastructure / Training / Inference) About the Role We are hiring Software... ...and brand at the forefront of fashion‑tech innovation. Your design work will... ...love of design, luxury fashion, and cutting‑edge tech, you’ll have the freedom to do it here...
Internship
Immediate start
SpreeAI
San Francisco, CA
18 hours ago
Software Engineer - ML Infrastructure
...machine learning systems that power real-time perception and inference across our edge-cloud platform. This role owns the training, deployment,... ...architectures and their deployment tradeoffs (YOLO, transformers, CNNs, real-time detection/tracking). Hands-on experience...
Transformer
Specter Services LLC
San Francisco, CA
3 days ago
CPU Storage Tech Lead
$342k
...infrastructure execution-translating cutting‑edge compute roadmaps into scalable,... ...We are seeking a CPU & Storage Technical Lead to define and drive the server compute and... ...storage systems are optimized for training, inference, and supporting services. You will work...
Local area
OpenAI
San Francisco, CA
4 days ago
Senior Software Engineer - ML/CV Infrastructure
...cloud deployment — ensuring our cutting-edge computer vision and multi-modal AI systems... ...optimize model serving for low-latency inference at scale. You'll work closely with our... ...learning models such as auto-regressive transformers and familiarity with inference optimization...
Transformer
Work at office
3 days per week
Claryo
San Francisco, CA
1 day ago
ML Software Engineer
...Vision encoders, etc.) onto edge devices, especially mobile NPUs... ..., memory, power/thermal), lead model-side optimization strategy... ...with at least one of: LLM inference optimization (quantization, attention... ...understanding across transformers / conformers / diffusion-vocoders...
Transformer
Full time
CAPSA
San Francisco, CA
3 days ago
Staff Software Engineer, Inference Infrastructure
...Montreal Employment Type Full time Location Type Hybrid Department Inference Model Serving Who are we? Our mission is to scale intelligence... ...and work environment Work closely with a team on the cutting edge of AI research Weekly lunch stipend, in-office lunches & snacks...
Full time
Work experience placement
Work at office
Remote work
Flexible hours
Jaide Health
San Francisco, CA
3 days ago
Tech Lead: Model Serving & Inference Performance
Dormont Manufacturing Co is looking for a Tech Lead / Staff-Level Engineer in San Francisco, CA. In this role, you will drive improvements in model serving efficiency, mentor junior engineers, and enhance system performance to support multimodal AI applications. The ideal...
Relocation package
Dormont Manufacturing Co
San Francisco, CA
18 hours ago
Technical Staff Lead, AI Inference & GPU Infra
A tech company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates...
Wafer
San Francisco, CA
3 days ago
Lead Kubernetes & GitOps Engineer for GPU Inference
A cutting-edge AI infrastructure startup is seeking a Kubernetes DevOps Engineer to join their innovative team in San Francisco. The role involves building and maintaining production-grade Kubernetes clusters across various environments, focusing on high-performance GPU...
Jack & Jill/External ATS
San Francisco, CA
3 days ago
Technical Lead Manager (Physical AI)
$248.8k - $311k
...automation. Role Overview As the Technical Lead Manager (TLM) for the Physical AI team of... ...you will bridge the gap between cutting‑edge Machine Learning research and physical... ...proficiency in PyTorch, with deep knowledge of Transformer architectures, Attention mechanisms, and...
Transformer
Full time
aijoblist
San Francisco, CA
2 days ago
Applied AI Inference Engineer
ABOUT BASETEN Baseten powers mission‑critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge... ...companies operating at the frontier of AI to bring cutting‑edge models into production. We're growing quickly and recently...
Work experience placement
Flexible hours
Baseten
San Francisco, CA
18 hours ago
AI Infrastructure Engineer — Scalable Training & Inference
...Engineer to develop infrastructure that supports AI training and inference workflows. This role requires strong object-oriented... ...infrastructure, and enhance system reliability. Join a cutting-edge team revolutionizing the fashion and e-commerce landscape.
SpreeAI
San Francisco, CA
18 hours ago
Lead AI Engineer (FM Hosting, LLM Inference)
$197.3k - $225.1k
...Lead AI Engineer (FM Hosting, LLM Inference) Overview At Capital One, we are creating responsible and reliable AI systems, changing banking for... ...infrastructure. At Capital One, you will help bring the transformative power of emerging AI capabilities to reimagine how...
Full time
Part time
Local area
Capital One
San Francisco, CA
1 day ago
Engineering Manager, Model Inference
The Role Our generative AI-powered products are transforming the practice of medicine—and the inference systems that power them need to be fast, reliable, and world... ...-class. We’re looking for an Engineering Manager to lead and grow our Model Inference team. The Inference...
Transformer
Hourly pay
Full time
Flexible hours
AI Chopping Block, Inc.
San Francisco, CA
18 hours ago
LLM Post-Training Researcher: Distillation & Efficiency
$250k - $350k
A cutting-edge AI research startup in San Francisco is seeking a talented individual... ...training using PyTorch, strong knowledge of transformer architectures, and familiarity with... ...compensation between $250,000 - $350,000 plus equity and benefits.
Transformer
Inference
San Francisco, CA
18 hours ago
