Engineering Manager (AI Inference)

Perplexity

About the Role

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.

You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes. You will help architect and scale the deployment of machine learning models behind Perplexity's Comet, Sonar, Search, and Deep Research products.

Why Perplexity?
  • Build SOTA systems that are the fastest in the industry with cutting-edge technology
  • High-impact work on a smaller team with significant ownership and autonomy
  • Opportunity to build 0-to-1 infrastructure from scratch rather than maintaining legacy systems
  • Work on the full spectrum: reducing cost, scaling traffic, and pushing the boundaries of inference
  • Direct influence on technical roadmap and team culture at a rapidly growing company

Responsibilities
  • Lead and grow a high-performing team of AI inference engineers
  • Develop APIs for AI inference used by both internal and external customers
  • Architect and scale our inference infrastructure for reliability and efficiency
  • Benchmark and eliminate bottlenecks throughout our inference stack
  • Drive large sparse/MoE model inference at rack scale, including sharding strategies for massive models
  • Push the frontier by building inference systems that support techniques such as sparse attention and disaggregated prefill/decode serving
  • Improve the reliability and observability of our systems and lead incident response
  • Own technical decisions around batching, throughput, latency, and GPU utilization
  • Partner with ML research teams on model optimization and deployment
  • Recruit, mentor, and develop engineering talent
  • Establish team processes, engineering standards, and operational excellence

Qualifications
  • 5+ years of engineering experience with 2+ years in a technical leadership or management role
  • Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
  • Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers
  • Experience with inference optimizations: batching, quantization, kernel fusion, FlashAttention
  • Familiarity with GPU characteristics, roofline models, and performance analysis
  • Experience deploying reliable, distributed, real-time systems at scale
  • Track record of building and leading high-performing engineering teams
  • Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
  • Strong technical communication and cross-functional collaboration skills

Nice to Have
  • Experience with CUDA, Triton, or custom kernel development
  • Background in training infrastructure and RL workloads
  • Experience with Kubernetes and container orchestration at scale
  • Published work or contributions to inference optimization research

Vacancy posted 6 hours ago