Software Engineer, Model Performance Systems
Baseten
ABOUT BASETEN
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.

THE OPPORTUNITY
We are looking for early-career Software Engineers to join our team. This is a specialized role sitting at the intersection of high-performance computing (HPC) and Large Language Model (LLM) engineering. You will be responsible for building the automated "speedometer and diagnostic" suite for our next-generation AI infrastructure. In this role, you won't just be using models; you will be tearing them apart to see how they run on the metal. You will build tools that measure GPU FLOPS, stress-test InfiniBand clusters, and define the benchmarks that ensure our systems are production-ready.

RESPONSIBILITIES
- Performance Benchmarking: Run and automate standard LLM quality benchmarks (GSM8K, MMLU) alongside custom performance suites for specific workloads (e.g., long-context window, KV cache reuse).
- Infrastructure Validation: Create automated acceptance tests for new GPU clusters across x86 and ARM systems, measuring GPU memory bandwidth, networking throughput, and multi-node networking performance.
- Model Dev Experience: Develop and maintain internal GPU-enabled development environments (similar to GitHub Codespaces). You will ensure the team has seamless, high-performance "dev machines" optimized for model experimentation.
- Tool Development: Build and contribute to tools such as InferenceMAX and genai-bench to automate model evaluation and optimization.
- Deep Hardware Profiling: Use PyTorch Profiler and NVIDIA Nsight Systems to collect performance profiles, identify bottlenecks, and debug the NVIDIA compute/networking stack.
- Monitoring & Observability: Develop real-time dashboards and alerts to monitor system health, model startup times, and runtime performance.
- Continuous Integration: Automate performance testing via CI/CD pipelines to catch regressions in model setups before they hit production.
- Optimization Automation: Build tools to find the "Pareto frontier", identifying the best configurations (latency vs. cost vs. quality) for a given model and workload.
- A Love for Systems & Hardware: You aren't just interested in the AI; you want to understand GPU memory subsystems, InfiniBand, and how data moves across a cluster.
- An Automation Mindset: You believe that if a task has to be done twice, it should be scripted. You have a passion for stress testing and fuzz testing to find the "breaking point" of a system.
- Mathematical Curiosity: A desire to understand the underlying math of Transformers and how it translates into FLOPs and memory requirements.
- Interest in Optimization: You are excited to learn about (or already play with) quantization, speculative decoding, disaggregated serving, and kernel-level optimizations.
- Technical Toolkit: Familiarity with Python, and an eagerness to master the NVIDIA software stack. C++ familiarity is good to have.
- Direct Impact: Your tools will be the gatekeeper for what defines "good" performance for our customers.
- Deep Learning (Literally): You will gain world-class expertise in GPU orchestration and LLM inference that few engineers in the industry possess.
- High Ownership: As part of a small team of freshers led by experts, you will have the autonomy to build tools from scratch and contribute to open-source projects.
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
- Paid parental leave
- Fertility and family-building stipend through Carrot
- Company-facilitated 401(k)
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you. At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Vacancy posted 19 hours ago



