Applied AI Inference Engineer

Baseten

Baseten powers mission-critical inference for the world's most dynamic AI companies. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. Join us and help build the platform that engineers rely on to ship AI products.

As an Applied AI Inference Engineer at Baseten, you will partner directly with customers to architect, build, and deploy high-scale production AI applications on Baseten's platform. You'll own the journey with customers from initial exploration to production deployment, translating ambiguous business goals into reliable, observable services with clear quality, latency, and cost outcomes.

This role is a great fit for entrepreneurial engineers who want a front-row view into how modern companies adopt AI at scale and who enjoy working across product, software development, performance engineering, and customer-facing implementations.

To be clear, this is an engineering role built on hands-on coding and software development. It also includes aspects of product management, technical customer success, and pre-sales solution engineering.

Example Initiatives
  • Forward Deployed Engineering on the frontier of AI
  • The fastest, most accurate Whisper transcription
  • Deploy production-ready model servers from Docker images
  • Deploy custom ComfyUI workflows as APIs

Responsibilities
  • Develop and maintain software systems and product features using one or more general-purpose programming languages in a production-level environment, with a preference for Python due to its relevance in ML projects.
  • Drive customer impact by designing, implementing, and deploying Baseten solutions end-to-end (problem framing → evaluation → production deployment → monitoring). This involves working with customers' engineering teams at every stage of the customer journey, including sales, implementation, and expansion.
  • Deliver with velocity: turn vague objectives into clear specs and well-defined PoCs so we can rapidly ship well-tested services and outcomes for our customers.
  • Optimize and enhance AI/ML projects, contributing to the continuous improvement of our technical stack. This includes developing features and PRDs with other engineering and product orgs.
  • Own products and customer projects end-to-end, functioning as engineer, project manager, and product manager, with a focus on user empathy, project specification, and end-to-end execution.
  • Navigate ambiguity and exercise good judgment on tradeoffs and tools needed to solve problems, avoiding unnecessary complexity.
  • Demonstrate pride, ownership, and accountability for your work, expecting the same from your teammates.

Requirements
  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or a related field.
  • 1+ years of professional work experience in a fast-paced, high-growth environment.
  • Demonstrated experience with one or more general-purpose programming languages in a production-level environment, with a strong preference for Python.
  • Familiarity with AI/ML pipelines and the lifecycle of ML model development and deployment.
  • Strong communication skills, particularly on complex technical topics.
  • Experience in building or optimizing AI/ML projects is highly valued.

Benefits
  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
  • Paid parental leave
  • Fertility and family-building stipend through Carrot
  • Company-facilitated 401(k)
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law.
