
Software Engineer - Model Performance

Baseten

ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform that engineers rely on to ship AI products.

THE ROLE

Are you passionate about advancing the application of artificial intelligence? We are looking for a Software Engineer focused on ML performance to join our dynamic team. This role is ideal for someone who thrives in a fast-paced startup environment and is eager to make significant contributions to the exciting field of LLM Inference. If you are a backend engineer who thrives on making things faster and is excited about open-source ML models, we look forward to your application.

EXAMPLE INITIATIVES

You'll get to work on these types of projects as part of our Model Performance team:

  • Baseten Embeddings Inference: The fastest embeddings solution available
  • The Baseten Inference Stack
  • Driving model performance optimization

RESPONSIBILITIES

  • Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.
  • Deep dive into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues.
  • Apply and scale optimization techniques across a wide range of ML models, particularly large language models.
  • Collaborate with a diverse team to design and implement innovative solutions.
  • Own projects from idea to production.

REQUIREMENTS

  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field.
  • Experience with one or more general-purpose programming languages, such as Python or C++.
  • Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).
  • Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
  • Demonstrated interest and experience in LLMs.
  • Deep understanding of GPU architecture.
  • Bonus:
    • Proficiency in enhancing the performance of software systems, particularly in the context of large language models (LLMs).
    • Experience with CUDA or similar technologies.
    • Deep understanding of software engineering principles and a proven track record of developing and deploying AI/ML inference solutions.
    • Experience with Docker and Kubernetes.

BENEFITS

  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents.
  • Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
  • Paid parental leave.
  • Fertility and family-building stipend through Carrot.
  • Company-facilitated 401(k).
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Vacancy posted 3 days ago