
Software Engineer, Model Performance Systems

Baseten

ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.

THE OPPORTUNITY

We are looking for early-career Software Engineers to join our team. This is a specialized role sitting at the intersection of high-performance computing (HPC) and Large Language Model (LLM) engineering. You will be responsible for building the automated "speedometer and diagnostic" suite for our next-generation AI infrastructure.

In this role, you won't just be using models; you will be tearing them apart to see how they run on the metal. You will build tools that measure GPU FLOPS, stress-test InfiniBand clusters, and define the benchmarks that ensure our systems are production-ready.

RESPONSIBILITIES
  • Performance Benchmarking: Run and automate standard LLM quality benchmarks (GSM8K, MMLU) alongside custom performance suites for specific workloads (e.g., long-context window, KV cache reuse).
  • Infrastructure Validation: Create automated acceptance tests for new GPU clusters across x86 and ARM systems, measuring GPU memory bandwidth, networking throughput, and multi-node networking performance.
  • Model Dev Experience: Develop and maintain internal GPU-enabled development environments (similar to GitHub Codespaces). You will ensure the team has seamless, high-performance "dev machines" optimized for model experimentation.
  • Tool Development: Build and contribute to tools such as InferenceMAX and genai-bench to automate model evaluation and optimization.
  • Deep Hardware Profiling: Use PyTorch Profiler and NVIDIA Nsight Systems to collect performance profiles, identify bottlenecks, and debug the NVIDIA compute/networking stack.
  • Monitoring & Observability: Develop real-time dashboards and alerts to monitor system health, model startup times, and runtime performance.
  • Continuous Integration: Automate performance testing via CI/CD pipelines to catch regressions in model setups before they hit production.
  • Optimization Automation: Build tools to find the "Pareto frontier" for a given model and workload: the set of configurations that offer the best achievable trade-offs among latency, cost, and quality.
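
As an illustration of the Pareto frontier mentioned above, here is a minimal sketch in Python. It is not Baseten's tooling: the `Config` class, the configuration names, and every number are hypothetical, chosen only to show how dominated configurations drop out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One candidate deployment setup (hypothetical fields and values)."""
    name: str
    latency_ms: float  # lower is better
    cost_usd: float    # lower is better
    quality: float     # higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is at least as good as b on every axis and strictly better on one."""
    no_worse = (
        a.latency_ms <= b.latency_ms
        and a.cost_usd <= b.cost_usd
        and a.quality >= b.quality
    )
    strictly_better = (
        a.latency_ms < b.latency_ms
        or a.cost_usd < b.cost_usd
        or a.quality > b.quality
    )
    return no_worse and strictly_better


def pareto_frontier(configs: list[Config]) -> list[Config]:
    """Keep every config that no other config dominates (simple O(n^2) scan)."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Hypothetical measurements, for illustration only.
CANDIDATES = [
    Config("fp16-tp1", latency_ms=120.0, cost_usd=1.0, quality=0.80),
    Config("fp8-tp1", latency_ms=80.0, cost_usd=0.8, quality=0.78),
    Config("fp16-tp2", latency_ms=70.0, cost_usd=2.0, quality=0.80),
    Config("int4-tp2", latency_ms=90.0, cost_usd=2.2, quality=0.70),
]

if __name__ == "__main__":
    for cfg in pareto_frontier(CANDIDATES):
        print(cfg.name)
```

The frontier is usually a set rather than a single winner: a configuration stays on it as long as no other configuration beats it on every axis at once.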

WHAT WE'RE LOOKING FOR

This is a fresher-friendly role. We care more about your trajectory, curiosity, and technical depth than your years of experience. We want to talk to you if you have:
  • A Love for Systems & Hardware: You aren't just interested in the AI; you want to understand GPU memory subsystems, InfiniBand, and how data moves across a cluster.
  • An Automation Mindset: You believe that if a task has to be done twice, it should be scripted. You have a passion for stress testing and fuzz testing to find the "breaking point" of a system.
  • Mathematical Curiosity: A desire to understand the underlying math of Transformers and how it translates into FLOPs and memory requirements.
  • Interest in Optimization: You are excited to learn about (or already play with) quantization, speculative decoding, disaggregated serving, and kernel-level optimizations.
  • Technical Toolkit: Familiarity with Python, and an eagerness to master the NVIDIA software stack. C++ familiarity is good to have.
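
To give a flavor of how Transformer math translates into FLOPs and memory requirements, here is a rough back-of-the-envelope sketch in Python. It relies on common approximations (about 2 FLOPs per parameter per generated token for a dense decoder, ignoring attention-score FLOPs) and on the standard KV cache size formula. The function names and the example model shape are hypothetical and do not describe any particular model.

```python
def forward_flops_per_token(n_params: int) -> int:
    """Approximate forward-pass FLOPs per token for a dense model.

    Uses the common rule of thumb of one multiply and one add per weight,
    and ignores the attention-score term that grows with sequence length.
    """
    return 2 * n_params


def weight_bytes(n_params: int, bytes_per_param: int = 2) -> int:
    """Memory needed to hold the weights (2 bytes per parameter for FP16/BF16)."""
    return n_params * bytes_per_param


def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_elem: int = 2,
) -> int:
    """KV cache size: one key and one value vector per layer, KV head, and token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical 8B-parameter model shape, for illustration only.
    n_params = 8_000_000_000
    cache = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=8192)
    print(f"FLOPs per token: {forward_flops_per_token(n_params):.2e}")
    print(f"Weights (FP16):  {weight_bytes(n_params) / 2**30:.1f} GiB")
    print(f"KV cache:        {cache / 2**30:.1f} GiB per 8192-token sequence")
```

Estimates like these are only a starting point; measured numbers from profiling can differ because of kernel efficiency, memory bandwidth limits, and runtime overhead.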

WHY THIS ROLE
  • Direct Impact: Your tools will be the gatekeeper for what defines "good" performance for our customers.
  • Deep Learning (Literally): You will gain world-class expertise in GPU orchestration and LLM inference that few engineers in the industry possess.
  • High Ownership: As part of a small team of freshers led by experts, you will have the autonomy to build tools from scratch and contribute to open-source projects.

BENEFITS
  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
  • Paid parental leave
  • Fertility and family-building stipend through Carrot
  • Company-facilitated 401(k)
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Vacancy posted 19 hours ago
Similar jobs that could be interesting for you
Based on the Software Engineer, Model Performance Systems in New York, NY vacancy
  • $405k

     ...Model Performance Software Engineer, Claude Code San Francisco, CA | New York City, NY About Anthropic Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a... 
    Performance
    Work at office
    Visa sponsorship
    Flexible hours

    anthropic

    New York, NY
    3 days ago
  • Talentzo Delhi is looking for a talented software engineer located in the United States. The ideal candidate will contribute to high-level...  ...role offers the opportunity to build scalable and high-performance systems while collaborating with product and engineering teams. #J... 
    Performance

    Talentzo Delhi

    New York, NY
    19 hours ago
  • $135k - $200k

     ...Forward Deployed Software Engineer - Edge Autonomous Systems Title of Role: Forward Deployed Software Engineer - Edge Autonomous Systems Location...  ...operational settings, ensuring high reliability and performance. Work closely with hardware teams and data engineers... 
    Performance
    Work at office

    Recruiting from Scratch

    New York, NY
    5 days ago
  •  ...Senior Rust Software Engineer - Distributed Systems (AI Infrastructure) About the Role What if your...  ...powering the world's most advanced AI models? We're looking for Senior Rust...  ...Engineers to build and optimize the high-performance data pipelines, annotation tooling,... 
    Performance
    Hourly pay
    Contract work
    Freelance
    Remote work
    Flexible hours

    Alignerr

    New York, NY
    19 hours ago
  • $175k - $250k

     ...Quantitative Developer - Equity Factor Model Risk Technology Millennium...  ..., compute-heavy distributed systems that power both historical...  ...work at the intersection of engineering, data, and quantitative...  ...firm's delivery platforms ~ Perform extensive back-testing of... 
    Performance

    Millennium Management Corp

    New York, NY
    4 days ago
  • $200k - $245k

     ...Senior Software Engineer/Algorithmic Trading Platform Global electronic trading industry leader...  ...testing of trading platforms, systems, and execution algorithms. This person...  ...frameworks covering all functionality, performance and stability Triage critical production... 
    Performance
    Full time
    Immediate start
    Remote work

    Harris Allied

    New York, NY
    4 days ago
  • $130k - $230k

     ...world's most advanced hardware systems, from spacecraft and...  .... Our platform gives hardware engineering teams a single place to ingest data, analyze performance, automate test execution, and...  ...intersection of hardware and software. We serve top-tier commercial... 
    Performance
    Permanent employment

    Nominal

    New York, NY
    2 days ago
  • $150k - $250k

     ...Senior Software Engineer (Agentic AI / Healthcare) Location: New York, NY (Manhattan), Hybrid...  ...architecture, shipping production systems, and balancing speed with quality. The...  ...databases) and how to apply them for performance and reliability ~ Cloud experience (... 
    Performance
    Work at office
    3 days per week

    NxT Level

    New York, NY
    4 days ago
  • $135k - $250k

    About the Role As a Software Engineer at Alchemy, you’ll be focused on building one of the most...  ...and high-throughput distributed systems that power the global backbone powering...  ...complex design, scaling, latency, or performance problems in high-throughput, low‑latency... 
    Performance
    Work at office
    Home office

    Framework Ventures

    New York, NY
    19 hours ago
  • $170k - $210k

     ...About the job Software Engineer - Full Stack (Marketplace Systems) Software Engineer - Full Stack (Marketplace...  ...into a clean backend domain model with well-defined API contracts...  ...filtering, pagination, versioning, performance constraints) Improve system reliability... 
    Performance

    Essence Coaching Group

    New York, NY
    3 days ago
  • $145k - $200k

     ...builds the world's leading software for data-driven...  ...FedRAMP). As a Software Engineer on the Apollo team,...  ...large-scale distributed system to allow the remote...  ...into a portable, high-performance artifact within minutes...  ...Palantir's unique deployment models. You'll also build and... 
    Performance
    Work experience placement
    Work at office
    Remote work
    Work from home
    Relocation package

    Palantir Technologies

    New York, NY
    19 hours ago
  •  ...deployment of AI across health systems. We are a growing team of...  ...creatives, technologists, and engineers working together to empower people...  ...are looking for experienced software engineers to join our team and help improve the performance, stability, and scalability of... 
    Performance
    Hourly pay
    Full time
    Flexible hours

    Abridge Al, Inc

    New York, NY
    4 days ago
  • $140.83k - $166.22k

     ...Advanced Software Engineer - Revenue Systems Job ID: 14252 Business Unit: MTA Headquarters Location...  ..., adaptation, and adoption of new models, methods, and tools. Collaborates across...  .... Manages suppliers to meet key performance indicators. Continuously... 
    Performance
    Contract work
    Temporary work
    For contractors
    Work at office

    MTA, Inc.

    New York, NY
    1 day ago
  • $2,000 per month

    As a Systems Engineer at Octogen, you will take on ambitious problems at the intersection of AI, search, and commerce. You will design and...  ...thoughtful architectural decisions around cost, latency, performance, and scalability Ensure reliability, observability, and performance... 
    Performance
    Immediate start

    Octogen Systems Inc.

    New York, NY
    2 days ago
  • $180k - $320k

     ...Career Renew is recruiting for one of its clients a Software Engineer, Distributed Systems (Core) - this is a fully remote role for US/Canada candidates...  ...deliver personalized customer experiences, optimize performance marketing, and move faster by leveraging data and AI... 
    Performance
    Remote work
    Visa sponsorship

    Career Renew

    New York, NY
    28 days ago
  • Alignerr is seeking a Python Infrastructure Engineer for remote contract work focusing on AI model evaluation. In this role, you'll design high-performance systems and develop back-end services, contributing to projects that influence AI quality at scale. Ideal candidates... 
    Performance
    Remote job
    Contract work
    Flexible hours

    Alignerr

    New York, NY
    19 hours ago
  •  ...to bring cutting-edge models into production. We're...  ...help build the platform engineers turn to to ship AI...  ...the global operating system for distributed, heterogeneous...  ...to architect the software fabric that unifies thousands...  ...validate networking performance on bleeding-edge... 
    Performance
    Flexible hours

    Baseten

    New York, NY
    19 hours ago
  • $250k - $325k

     ...senior low-latency trading engineer, you will apply your...  ...structure and high performance programming techniques...  ...specify and implement software for trading numerous financial...  ...Engineer computer models for different...  ...build/engineer a software system for model simulation,... 
    Performance
    Casual work
    Work at office
    Local area
    Home office
    Flexible hours

    Quant Blueprint LLC

    New York, NY
    2 days ago
  •  ...looking for a talented, senior engineering professional ready to take...  .... As a Vice President, Software Engineer at JPMorganChase, you...  ...implementation of distributed systems at scale. You will drive...  ...tolerance, convergence, and performance at scale. Design and implement... 
    Performance

    JPMorgan Chase & Co.

    New York, NY
    3 days ago
  • $104.7k - $153k

     ...technologies in data and intelligent systems. Explore the opportunities...  ...the intersection of backend engineering and AI, helping to transform...  ...Impact As a passionate software engineer, you bring...  ...Employees on sales plans earn performance-based incentive pay on top of... 
    Performance
    Full time
    Temporary work
    Apprenticeship
    Local area
    Flexible hours

    Cisco

    New York, NY
    1 day ago
  • CellType Inc. is seeking a Founding Research Engineer to develop and optimize systems for their biological AI models. This pivotal role involves training, evaluation...  ...understanding of reinforcement learning and performance debugging in production systems. The position... 
    Performance
    Remote work

    CellType Inc.

    New York, NY
    4 days ago
  • $123.6k - $200.1k

     ...dedicated team members are engineering the foundation of Cisco's core...  ...innovations in operating systems, firmware, networking stacks...  ...on experience with hardware-software integration and low-level networking...  ...compatibility, network performance, and security for Cisco's... 
    Performance
    Full time
    Temporary work
    Apprenticeship
    Local area
    Flexible hours

    Cisco

    New York, NY
    2 days ago
  • $100k - $140k

     ...highly motivated and hands-on Software Engineer to design, develop,...  ...opportunities for automation, systems integration, workflow optimization...  ..., integration, and performance issues. Systems Integration...  ..., performance tuning, data modeling, ETL/ELT processes, and relational... 
    Performance
    Full time

    Truecare Homecare Agency

    Brooklyn, NY
    2 days ago
  • HRB is seeking a Lead Systems Programmer Z/OS in Hoboken, New Jersey. This role involves leading systems programming activities, product...  ...should have proven experience in Z/OS product installation, performance tuning, and strong supervisory skills. This position offers a... 
    Performance

    HRB

    Hoboken, NJ
    3 days ago
  • LEAD SYSTEMS PROGRAMMER Z/OS Hybrid work environment (3× week on site required). Great benefits & annual bonus program. Proven skills...  ..., BAL/ASSEMBLER) I/O configuration expertise Z/OS mainframe performance & tuning Debugging skills Strong knowledge of monitoring... 
    Performance

    HRB

    Hoboken, NJ
    3 days ago
  • $103.71k - $138.28k

     ...independent efforts to all aspects of system integration including design, analysis,...  ...experience in system architecture and engineering disciplines. Specific technical knowledge...  ...applications for deficiencies such as slow performance and use of deprecated dependencies and... 
    Performance
    Full time
    Temporary work
    Remote work

    Lumen

    New York, NY
    3 days ago
  •  ...technology firm is seeking a Full Stack AI Engineer for a remote opportunity. In this role,...  ...AI-driven solutions to optimize performance for providers and health-plan organizations...  ...extensive experience in building scalable systems, a passion for user experiences, and proficiency... 
    Performance
    Remote job

    Reveleer

    New York, NY
    19 hours ago
  • $85k - $95k

     ...global water solutions company in the United States is seeking a hands-on Applications Engineer responsible for the performance, reliability, and optimization of its FlexNet system. You'll coordinate technical activities, analyze system performance, and troubleshoot complex... 
    Performance

    Xylem

    New York, NY
    19 hours ago
  • Position HPC Scientific Applications Systems Analyst/Programmer Responsibilities Devise...  ...in a Scientific or Computer Science/Engineering discipline. Experience: 5+ years of related...  ...-premise and cloud-based HPC systems. Performance analysis and optimization tuning... 
    Performance
    Remote work

    Seneca Resources

    New York, NY
    19 hours ago
  • $80 per hour

     ...testing, evaluating, and improving AI systems. Participation is project-based, not permanent...  ...quality 5+ years of experience as a Software Engineer (primarily Python ) Deep experience...  ...to up to $80/hour* depending on performance and volume Opportunity to contribute... 
    Performance
    Permanent employment
    Temporary work
    Freelance
    Remote work
    Flexible hours

    Mindrift

    New York, NY
    4 days ago
