Applied AI Inference Engineer

Baseten

ABOUT BASETEN

Baseten powers mission‑critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting‑edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform that engineers turn to when shipping AI products.

THE ROLE

As an Applied AI Inference Engineer at Baseten, you will partner directly with customers to architect, build, and deploy high‑scale production AI applications on Baseten’s platform. You’ll own the journey with customers from initial exploration to production deployment, translating ambiguous business goals into reliable, observable services with clear quality, latency, and cost outcomes. This role is a great fit for entrepreneurial engineers who want a front‑row view into how modern companies adopt AI at scale and who enjoy working across product, software development, performance engineering, and customer‑facing implementations. To be clear, this is an engineering role with hands‑on coding and software development that also includes aspects of product management, technical customer success, and pre‑sales solution engineering mixed in.

EXAMPLE INITIATIVES

Take a look at these blog posts written by members of our Forward Deployed Engineering team:
  • Forward Deployed Engineering on the frontier of AI
  • The fastest, most accurate Whisper transcription
  • Deploy production‑ready model servers from Docker images
  • Deploy custom ComfyUI workflows as APIs

RESPONSIBILITIES

  • Develop and maintain software systems and product features using one or more general‑purpose programming languages in a production‑level environment, with a preference for Python due to its relevance in ML projects.
  • Drive customer impact by designing, implementing, and deploying Baseten solutions end‑to‑end (problem framing → evaluation → production deployment → monitoring). This involves working with customers’ engineering teams at every stage of the customer journey, including sales, implementation, and expansion.
  • Deliver with velocity: turn vague objectives into clear specs and well‑defined PoCs so we can rapidly ship well‑tested services and outcomes for our customers.
  • Optimize and enhance AI/ML projects, contributing to the continuous improvement of our technical stack. This includes developing features and PRDs with other engineering and product orgs.
  • Own products and customer projects end‑to‑end, functioning as engineer, project manager, and product manager, with a focus on user empathy, project specification, and end‑to‑end execution.
  • Navigate ambiguity and exercise good judgment on tradeoffs and tools needed to solve problems, avoiding unnecessary complexity.
  • Demonstrate pride, ownership, and accountability for your work, expecting the same from your teammates.

REQUIREMENTS

  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field.
  • 1+ years of professional work experience in a fast‑paced, high‑growth environment.
  • Demonstrated experience with one or more general‑purpose programming languages in a production‑level environment, with a strong preference for Python.
  • Familiarity with AI/ML pipelines and the lifecycle of ML model development and deployment.
  • Strong communication skills, particularly on complex technical topics.
  • Experience in building or optimizing AI/ML projects is highly valued.

BENEFITS

  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employee and dependents.
  • Flexible PTO policy, including a company‑wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
  • Paid parental leave.
  • Fertility and family‑building stipend through Carrot.
  • Company‑facilitated 401(k).
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).

Vacancy posted 10 hours ago
