Software Engineer, Model Performance Systems
Baseten
ABOUT BASETEN
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.

THE OPPORTUNITY
We are looking for early-career Software Engineers to join our team. This is a specialized role sitting at the intersection of high-performance computing (HPC) and Large Language Model (LLM) engineering. You will be responsible for building the automated "speedometer and diagnostic" suite for our next-generation AI infrastructure.

In this role, you won't just be using models; you will be tearing them apart to see how they run on the metal. You will build tools that measure GPU FLOPS, stress-test InfiniBand clusters, and define the benchmarks that ensure our systems are production-ready.

RESPONSIBILITIES
- Performance Benchmarking: Run and automate standard LLM quality benchmarks (GSM8K, MMLU) alongside custom performance suites for specific workloads (e.g., long-context window, KV cache reuse).
- Infrastructure Validation: Create automated acceptance tests for new GPU clusters across x86 and ARM systems, measuring GPU memory bandwidth, networking throughput, and multi-node networking performance.
- Model Dev Experience: Develop and maintain internal GPU-enabled development environments (similar to GitHub Codespaces). You will ensure the team has seamless, high-performance "dev machines" optimized for model experimentation.
- Tool Development: Build and contribute to tools such as InferenceMAX and genai-bench to automate model evaluation and optimization.
- Deep Hardware Profiling: Use PyTorch Profiler and NVIDIA Nsight Systems to collect performance profiles, identify bottlenecks, and debug the NVIDIA compute/networking stack.
- Monitoring & Observability: Develop real-time dashboards and alerts to monitor system health, model startup times, and runtime performance.
- Continuous Integration: Automate performance testing via CI/CD pipelines to catch regressions in model setups before they hit production.
- Optimization Automation: Build tools to find the "Pareto frontier," identifying the best configurations (latency vs. cost vs. quality) for a given model and workload.
- A Love for Systems & Hardware: You aren't just interested in the AI; you want to understand GPU memory subsystems, InfiniBand, and how data moves across a cluster.
- An Automation Mindset: You believe that if a task has to be done twice, it should be scripted. You have a passion for stress testing and fuzz testing to find the "breaking point" of a system.
- Mathematical Curiosity: A desire to understand the underlying math of Transformers and how it translates into FLOPs and memory requirements.
- Interest in Optimization: You are excited to learn about (or already play with) quantization, speculative decoding, disaggregated serving, and kernel-level optimizations.
- Technical Toolkit: Familiarity with Python and an eagerness to master the NVIDIA software stack. Familiarity with C++ is a plus.
- Direct Impact: Your tools will be the gatekeeper for what defines "good" performance for our customers.
- Deep Learning (Literally): You will gain world-class expertise in GPU orchestration and LLM inference that few engineers in the industry possess.
- High Ownership: You will join a small team of early-career engineers led by experts, with the autonomy to build tools from scratch and contribute to open-source projects.
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
- Paid parental leave
- Fertility and family-building stipend through Carrot
- Company-facilitated 401(k)
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Vacancy posted 4 days ago

