Software Engineer, Model Performance Systems

Baseten

ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.

THE OPPORTUNITY

We are looking for early-career Software Engineers to join our team. This is a specialized role sitting at the intersection of high-performance computing (HPC) and Large Language Model (LLM) engineering. You will be responsible for building the automated "speedometer and diagnostic" suite for our next-generation AI infrastructure.

In this role, you won't just be using models; you will be tearing them apart to see how they run on the metal. You will build tools that measure GPU FLOPS, stress-test InfiniBand clusters, and define the benchmarks that ensure our systems are production-ready.

RESPONSIBILITIES
  • Performance Benchmarking: Run and automate standard LLM quality benchmarks (GSM8K, MMLU) alongside custom performance suites for specific workloads (e.g., long-context window, KV cache reuse).
  • Infrastructure Validation: Create automated acceptance tests for new GPU clusters across x86 and ARM systems, measuring GPU memory bandwidth, networking throughput, and multi-node networking performance.
  • Model Dev Experience: Develop and maintain internal GPU-enabled development environments (similar to GitHub Codespaces). You will ensure the team has seamless, high-performance "dev machines" optimized for model experimentation.
  • Tool Development: Build and contribute to tools such as InferenceMAX and genai-bench to automate model evaluation and optimization.
  • Deep Hardware Profiling: Use PyTorch Profiler and NVIDIA Nsight Systems to collect performance profiles, identify bottlenecks, and debug the NVIDIA compute/networking stack.
  • Monitoring & Observability: Develop real-time dashboards and alerts to monitor system health, model startup times, and runtime performance.
  • Continuous Integration: Automate performance testing via CI/CD pipelines to catch regressions in model setups before they hit production.
  • Optimization Automation: Build tools to find the "Pareto frontier", identifying the best trade-off configurations (latency vs. cost vs. quality) for a given model and workload.
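
As a rough illustration of the "Pareto frontier" idea in the last item, the sketch below filters a list of candidate configurations down to those that no other configuration beats on latency, cost, and quality at once. The `Config` type, the configuration names, and every number are made up for illustration; they are not Baseten measurements or tooling.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """Hypothetical benchmark result for one serving configuration."""
    name: str
    latency_ms: float  # lower is better
    cost_usd: float    # lower is better
    quality: float     # higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on every axis and strictly better on at least one."""
    no_worse = (
        a.latency_ms <= b.latency_ms
        and a.cost_usd <= b.cost_usd
        and a.quality >= b.quality
    )
    strictly_better = (
        a.latency_ms < b.latency_ms
        or a.cost_usd < b.cost_usd
        or a.quality > b.quality
    )
    return no_worse and strictly_better


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Illustrative numbers only (assumptions, not real benchmark data).
CONFIGS = [
    Config("fp16-tp1", latency_ms=120.0, cost_usd=1.00, quality=0.80),
    Config("fp8-tp1", latency_ms=90.0, cost_usd=0.70, quality=0.78),
    Config("fp8-tp2", latency_ms=60.0, cost_usd=1.40, quality=0.78),
    Config("int4-tp1", latency_ms=95.0, cost_usd=0.75, quality=0.70),
]

if __name__ == "__main__":
    for cfg in pareto_frontier(CONFIGS):
        print(cfg)
```

In this made-up data, `int4-tp1` drops out because `fp8-tp1` is faster, cheaper, and higher quality. The remaining three each win on at least one axis, so choosing among them depends on the workload's priorities.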

WHAT WE'RE LOOKING FOR

This is a fresher-friendly role. We care more about your trajectory, curiosity, and technical depth than your years of experience. We want to talk to you if you have:
  • A Love for Systems & Hardware: You aren't just interested in the AI; you want to understand GPU memory subsystems, InfiniBand, and how data moves across a cluster.
  • An Automation Mindset: You believe that if a task has to be done twice, it should be scripted. You have a passion for stress-testing and fuzz testing to find the "breaking point" of a system.
  • Mathematical Curiosity: A desire to understand the underlying math of Transformers and how it translates into FLOPs and memory requirements.
  • Interest in Optimization: You are excited to learn about (or already play with) quantization, speculative decoding, disaggregated serving, and kernel-level optimizations.
  • Technical Toolkit: Familiarity with Python, and an eagerness to master the NVIDIA software stack. C++ familiarity is good to have.
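
To give a flavor of the "math to FLOPs and memory" translation mentioned under Mathematical Curiosity, here is a back-of-the-envelope sketch using two common rules of thumb: roughly 2 FLOPs per parameter per generated token for a dense Transformer (ignoring the attention-over-context term), and a KV cache that stores one key and one value vector per layer, per KV head, per token. The function names and the example model shape are assumptions chosen for illustration, not a description of any specific model.

```python
def decode_flops_per_token(n_params: float) -> float:
    """Approximate forward-pass FLOPs per token: one multiply and one add per weight."""
    return 2.0 * n_params


def weight_bytes(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the weights (2 bytes per parameter for FP16/BF16)."""
    return n_params * bytes_per_param


def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_elem: int = 2,
) -> int:
    """KV cache size: the leading 2 accounts for storing both keys and values."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical 8B-parameter model shape (an assumption for this example).
    n_params = 8e9
    print(f"FLOPs per token: {decode_flops_per_token(n_params):.2e}")
    print(f"Weights (FP16):  {weight_bytes(n_params) / 2**30:.1f} GiB")
    kv = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=8192)
    print(f"KV cache @ 8192 tokens: {kv / 2**30:.1f} GiB")
```

These are estimates, not measurements. Real numbers depend on the attention variant, quantization, batching, and runtime overheads, which is why such estimates are usually checked against profiler output.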

WHY THIS ROLE
  • Direct Impact: Your tools will set the bar for what counts as "good" performance for our customers.
  • Deep Learning (Literally): You will gain world-class expertise in GPU orchestration and LLM inference that few engineers in the industry possess.
  • High Ownership: As part of a small team of freshers led by experts, you will have the autonomy to build tools from scratch and contribute to open-source projects.

BENEFITS
  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents.
  • Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
  • Paid parental leave.
  • Fertility and family-building stipend through Carrot.
  • Company-facilitated 401(k).
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Vacancy posted 4 days ago