Software Engineer - Model Products
Baseten
ABOUT BASETEN
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.
THE ROLE:
Baseten's Model Performance (MP) team is responsible for ensuring the models running on our platform are fast, reliable, and cost-efficient. As part of this team, you'll focus on Model APIs, the infrastructure powering our hosted API endpoints for the latest open-source models. This work spans distributed systems, model serving, and developer experience. You'll join a small, high-impact team operating at the intersection of product, model performance, and infra, helping to define how developers interact with AI models at scale.
RESPONSIBILITIES:
- Design, build, and operate the Model APIs surface with focus on advanced inference capabilities: structured outputs (JSON mode, grammar-constrained generation), tool/function calling and multi-modal serving
- Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, implement custom CUDA operators, tune memory allocation patterns for maximum throughput and optimize communication patterns across multi-GPU setups
- Productionize performance improvements across runtimes with deep understanding of their internals: speculative decoding implementations, guided generation for structured outputs, custom scheduling and routing algorithms for high-performance serving
- Build comprehensive benchmarking frameworks that measure real-world performance across different model architectures, batch sizes, sequence lengths, and hardware configurations
- Productionize performance improvements across runtimes (e.g., TensorRT, TensorRT-LLM): speculative decoding, quantization, batching, and KV-cache reuse.
- Instrument deep observability (metrics, traces, logs) and build repeatable benchmarks to measure speed, reliability, and quality.
- Implement platform fundamentals: API versioning, validation, usage metering, quotas, and authentication.
- Collaborate closely with other teams to deliver robust, developer-friendly model serving experiences.
REQUIREMENTS:
- 3+ years of experience building and operating distributed systems or large-scale APIs.
- Proven track record of owning low-latency, reliable backend services (rate limiting, auth, quotas, metering, migrations).
- Infra instincts with performance sensibilities: profiling, tracing, capacity planning, and SLO management.
- Comfortable debugging complex systems, from runtime internals to GPU execution traces.
- Strong written communication; able to produce clear design docs and collaborate across functions.
NICE TO HAVE:
- Experience with LLM runtimes (vLLM, SGLang, TensorRT-LLM) or contributions to open-source inference engines (vLLM, TensorRT-LLM, SGLang, TGI).
- Knowledge of Kubernetes, service meshes, API gateways, or distributed scheduling.
- Background in developer-facing infrastructure or open-source APIs.
- We value infra-leaning generalists who bring strong engineering fundamentals and curiosity. ML experience is a plus, but not required.
BENEFITS:
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
- Paid parental leave
- Fertility and family-building stipend through Carrot
- Company-facilitated 401(k)
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you. At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Vacancy posted 4 days ago

