Software Engineer - Model Performance
Baseten
Software Engineer Focused On ML Performance
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.
Are you passionate about advancing the application of artificial intelligence? We are looking for a Software Engineer focused on ML performance to join our team. This role is ideal for someone who thrives in a fast-paced startup environment and is eager to make significant contributions to the field of LLM inference. If you are a backend engineer who thrives on making things faster and is excited about open-source ML models, we look forward to your application.
You'll get to work on these types of projects as part of our Model Performance team:
- Baseten Embeddings Inference: The fastest embeddings solution available
- The Baseten Inference Stack
- Driving model performance optimization
Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.
Dive deep into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues.
Apply and scale optimization techniques across a wide range of ML models, particularly large language models.
Collaborate with a diverse team to design and implement innovative solutions.
Own projects from idea to production.
Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field.
Experience with one or more general-purpose programming languages, such as Python or C++.
Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).
Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
Demonstrated interest and experience in LLMs.
Deep understanding of GPU architecture.
Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents
Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
Paid parental leave
Fertility and family-building stipend through Carrot
Company-facilitated 401(k)
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.
We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).


