Senior Software Engineer - Model Performance
$220k - $320k
Inference
Help us make inference blazingly fast. If you love squeezing every last drop of performance out of GPUs, diving deep into CUDA kernels, and turning optimization techniques into production systems, we'd love to meet you.

About Inference.net

Inference.net trains and hosts specialized language models for companies that need frontier-quality AI at a fraction of the cost. The models we train match GPT-5 accuracy but are smaller, faster, and up to 90% cheaper. Our platform handles everything end-to-end: distillation, training, evaluation, and planet-scale hosting.

We are a well-funded ten-person team of engineers who work in-person in downtown San Francisco on difficult, high-impact engineering problems. Everyone on the team has been writing code for over 10 years, and has founded and run their own software companies. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard, and deeply enjoy the work that we do. Most of us are in the office 4 days a week in SF; hybrid works for Bay Area candidates.

About the Role

You will be responsible for making our inference stack as fast and efficient as possible. Your work spans from implementing known optimization techniques to experimenting with novel approaches, always with the goal of serving models faster and cheaper at scale.

Your north star is inference performance: latency, throughput, cost efficiency, and how quickly we can bring new model architectures into production. You'll work across the full inference stack, from CUDA kernels to serving frameworks, to find and eliminate bottlenecks.

This role reports directly to the founding team. You'll have the autonomy, a large compute budget, and technical support to push the limits of what's possible in model serving.

Key Responsibilities

- Implement and productionize optimization techniques including quantization, speculative decoding, KV cache optimization, continuous batching, and LoRA serving
- Deep dive into inference frameworks (vLLM, SGLang, TensorRT-LLM) and underlying libraries to debug and improve performance
- Profile and optimize CUDA kernels and GPU utilization across our serving infrastructure
- Add support for new model architectures, ensuring they meet our performance standards before going to production
- Experiment with novel inference techniques and bring successful approaches into production
- Build tooling and benchmarks to measure and track inference performance across our fleet
- Collaborate with applied ML engineers to ensure trained models can be served efficiently

Requirements

- 2+ years of experience in ML systems, inference optimization, or GPU programming
- Strong proficiency in Python and familiarity with C++
- Hands-on experience with LLM inference frameworks (vLLM, SGLang, TensorRT-LLM, or similar)
- Deep understanding of GPU architecture and experience profiling GPU workloads
- Familiarity with LLM optimization techniques (quantization, speculative decoding, continuous batching, KV cache management)
- Experience with PyTorch and understanding of how models execute on hardware
- Track record of measurably improving system performance

Nice-to-Have

- Experience with CUDA programming
- Familiarity with serving non-LLM models (TTS, vision, embeddings)
- Experience with distributed inference and multi-GPU serving
- Contributions to open-source inference frameworks
- Experience with Docker and Kubernetes

You don't need to tick every box. Curiosity and the ability to learn quickly matter more.

Compensation

We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $220,000 - $320,000, plus equity and benefits, depending on experience.

Equal Opportunity

Inference.net is an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.

If you're excited about making AI inference faster for everyone, we'd love to hear from you. Please send your resume and GitHub to the email address shown on click.appcast.io and/or apply on Ashby.