Software Engineer - BIS (Baseten Inference Stack)
Baseten
Software Engineer - Inference Stack
Baseten powers mission-critical inference for the world's most dynamic AI companies. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. Join us and help build the platform engineers turn to to ship AI products.
Baseten's Inference Stack team builds the distributed runtime that powers large-scale LLM inference across our platform. We operate at the intersection of distributed systems, model performance, infrastructure, and developer experience. We enable customers to deploy and operate cutting-edge LLMs with industry-leading performance, scalability, reliability, and ease of use.
As a Software Engineer on the Inference Stack team, you'll work across the stack: from the developer experience customers use to deploy models, through the libraries used for features like tool calling and reasoning, all the way down to the systems we use to orchestrate deployments in Kubernetes and route traffic efficiently. This is an ideal role for engineers who enjoy owning systems in production, solving hard integration problems, and making complex infrastructure simple and reliable for users.
Example Initiatives:
- Blog Posts
Responsibilities:
- Develop infrastructure and orchestration systems for deploying and managing large-scale distributed LLM inference
- Work across the stack, from customer-facing features to low-level infrastructure components
- Build platform capabilities related to routing, autoscaling, scheduling, observability, and runtime management
- Improve the reliability, scalability, and usability of our inference stack
- Collaborate closely with Model Performance engineers to make new inference optimizations broadly available to customers and easy to configure
- Help define best practices around testing, release automation, benchmarking, and operational excellence
- Debug complex production systems spanning Kubernetes, distributed runtimes, networking, and GPU workloads
- Make thoughtful engineering tradeoffs balancing performance, reliability, operational simplicity, and developer experience
- Own projects end-to-end: from architecture and implementation through deployment, monitoring, and iteration based on customer feedback
Requirements:
- Bachelor's, Master's, or Ph.D. in Computer Science, Engineering, or a related field
- Strong background in distributed systems, backend infrastructure, or platform engineering
- Experience building and operating production systems where reliability, latency, and scale are first-class concerns
- Strong sense of developer experience: you think about how systems are used, not just how they work
- Motivated and willing to learn new languages, frameworks, and systems as needed
- Ability to debug complex systems across multiple layers of the stack
- Genuine interest in inference engineering. You don't need hands-on experience, but you should be willing to learn
- Excellent communication and collaboration skills
Bonus:
- Experience with Kubernetes, including concepts like operators and custom resources
- Prior work on Dynamo, vLLM, SGLang, TensorRT-LLM, or similar inference frameworks
- Experience with distributed scheduling, autoscaling, or service orchestration
- Experience operating GPU workloads in production
- Familiarity with observability tooling, CI/CD systems, or release automation
- Experience contributing to open-source infrastructure or ML systems
Benefits:
- Competitive compensation, including meaningful equity
- 100% coverage of medical, dental, and vision insurance for employees and dependents
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
- Paid parental leave
- Fertility and family-building stipend through Carrot
- Company-facilitated 401(k)
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities
Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.
At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.
We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law.