Software Engineer - Model Performance

Baseten

Software Engineer Focused On ML Performance

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to when shipping AI products.

Are you passionate about advancing the application of artificial intelligence? We are looking for a Software Engineer focused on ML performance to join our team. This role is ideal for someone who thrives in a fast-paced startup environment and is eager to make significant contributions to LLM inference. If you are a backend engineer who loves making things faster and is excited about open-source ML models, we look forward to your application.

You'll get to work on these types of projects as part of our Model Performance team:

Baseten Embeddings Inference: The fastest embeddings solution available
The Baseten Inference Stack
Driving model performance optimization

Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.

Dive deep into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues.

Apply and scale optimization techniques across a wide range of ML models, particularly large language models.

Collaborate with a diverse team to design and implement innovative solutions.

Own projects from idea to production.

Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field.

Experience with one or more general-purpose programming languages, such as Python or C++.

Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).

Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.

Demonstrated interest and experience in LLMs.

Deep understanding of GPU architecture.

Competitive compensation, including meaningful equity.

100% coverage of medical, dental, and vision insurance for employees and dependents

Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)

Paid parental leave

Fertility and family-building stipend through Carrot

Company-facilitated 401(k)

Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).

Vacancy posted 3 days ago
