Software Engineer - Model Performance
Baseten
ABOUT BASETEN
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.

THE ROLE
Are you passionate about advancing the application of artificial intelligence? We are looking for a Software Engineer focused on ML performance to join our dynamic team. This role is ideal for someone who thrives in a fast-paced startup environment and is eager to make significant contributions to the exciting field of LLM Inference. If you are a backend engineer who thrives on making things faster and is excited about open-source ML models, we look forward to your application.

EXAMPLE INITIATIVES
You'll get to work on these types of projects as part of our Model Performance team:
- Baseten Embeddings Inference: The fastest embeddings solution available
- The Baseten Inference Stack
- Driving model performance optimization

RESPONSIBILITIES
- Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.
- Deep dive into underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues.
- Apply and scale optimization techniques across a wide range of ML models, particularly large language models.
- Collaborate with a diverse team to design and implement innovative solutions.
- Own projects from idea to production.

REQUIREMENTS
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field.
- Experience with one or more general-purpose programming languages, such as Python or C++.
- Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).
- Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
- Demonstrated interest and experience in LLMs.
- Deep understanding of GPU architecture.
- Bonus:
- Proficiency in enhancing the performance of software systems, particularly in the context of large language models (LLMs).
- Experience with CUDA or similar technologies.
- Deep understanding of software engineering principles and a proven track record of developing and deploying AI/ML inference solutions.
- Experience with Docker and Kubernetes.

BENEFITS
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employee and dependents
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
- Paid parental leave
- Fertility and family-building stipend through Carrot
- Company-facilitated 401(k)
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you. At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Vacancy posted 3 days ago

