Software Engineer (Model Performance)
Baseten
ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to when they ship AI products.

THE ROLE

Are you passionate about advancing the application of artificial intelligence? We are looking for a Software Engineer focused on ML performance to join our team. This role is ideal for someone who thrives in a fast-paced startup environment and is eager to make significant contributions to the field of LLM inference. If you are a backend engineer who thrives on making things faster and is excited about open-source ML models, we look forward to your application.

EXAMPLE INITIATIVES

You'll get to work on these types of projects as part of our Model Performance team:

- Baseten Embeddings Inference: The fastest embeddings solution available
- The Baseten Inference Stack
- Driving model performance optimization

RESPONSIBILITIES

- Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.
- Deep dive into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues.
- Apply and scale optimization techniques across a wide range of ML models, particularly large language models.
- Collaborate with a diverse team to design and implement innovative solutions.
- Own projects from idea to production.

REQUIREMENTS

- Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or a related field.
- Experience with one or more general-purpose programming languages, such as Python or C++.
- Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).
- Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
- Demonstrated interest and experience in LLMs.
- Deep understanding of GPU architecture.

Bonus:

- Proficiency in enhancing the performance of software systems, particularly in the context of large language models (LLMs).
- Experience with CUDA or similar technologies.
- Deep understanding of software engineering principles and a proven track record of developing and deploying AI/ML inference solutions.
- Experience with Docker and Kubernetes.

BENEFITS

- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
- Paid parental leave.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.


