Engineering Manager (AI Inference)

Perplexity

About the Role

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.

You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes. You will help architect and scale the deployment of machine learning models behind Perplexity's Comet, Sonar, Search, and Deep Research products.

Why Perplexity?

Build SOTA systems that are the fastest in the industry with cutting-edge technology
High-impact work on a smaller team with significant ownership and autonomy
Opportunity to build 0-to-1 infrastructure from scratch rather than maintaining legacy systems
Work on the full spectrum: reducing cost, scaling traffic, and pushing the boundaries of inference
Direct influence on technical roadmap and team culture at a rapidly growing company

Responsibilities

Lead and grow a high-performing team of AI inference engineers
Develop APIs for AI inference used by both internal and external customers
Architect and scale our inference infrastructure for reliability and efficiency
Benchmark and eliminate bottlenecks throughout our inference stack
Drive large sparse/MoE model inference at rack scale, including sharding strategies for massive models
Push the frontier by building inference systems that support sparse attention, disaggregated prefill/decode serving, and similar techniques
Improve the reliability and observability of our systems and lead incident response
Own technical decisions around batching, throughput, latency, and GPU utilization
Partner with ML research teams on model optimization and deployment
Recruit, mentor, and develop engineering talent
Establish team processes, engineering standards, and operational excellence

Qualifications

5+ years of engineering experience with 2+ years in a technical leadership or management role
Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers
Experience with inference optimizations: batching, quantization, kernel fusion, FlashAttention
Familiarity with GPU characteristics, roofline models, and performance analysis
Experience deploying reliable, distributed, real-time systems at scale
Track record of building and leading high-performing engineering teams
Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
Strong technical communication and cross-functional collaboration skills

Nice to Have

Experience with CUDA, Triton, or custom kernel development
Background in training infrastructure and RL workloads
Experience with Kubernetes and container orchestration at scale
Published work or contributions to inference optimization research

Vacancy posted 6 hours ago

