Software Engineer, Model Performance Systems

Baseten

Software Engineer Opportunity

Baseten powers mission-critical inference for the world's most dynamic AI companies. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. Join us and help build the platform engineers turn to to ship AI products.

We are looking for early-career Software Engineers to join our team. This is a specialized role sitting at the intersection of high-performance computing (HPC) and Large Language Model (LLM) engineering. You will be responsible for building the automated "speedometer and diagnostic" suite for our next-generation AI infrastructure.

In this role, you won't just be using models; you will be tearing them apart to see how they run on the metal. You will build tools that measure GPU FLOPS, stress-test InfiniBand clusters, and define the benchmarks that ensure our systems are production-ready.

Responsibilities
  • Performance Benchmarking: Run and automate standard LLM quality benchmarks (GSM8K, MMLU) alongside custom performance suites for specific workloads (e.g., long context windows, KV cache reuse).
  • Infrastructure Validation: Create automated acceptance tests for new GPU clusters across x86 and ARM systems, measuring GPU memory bandwidth, networking throughput, and multi-node networking performance.
  • Model Dev Experience: Develop and maintain internal GPU-enabled development environments (similar to GitHub Codespaces). You will ensure the team has seamless, high-performance "dev machines" optimized for model experimentation.
  • Tool Development: Build and contribute to tools such as InferenceMAX and genai-bench to automate model evaluation and optimization.
  • Deep Hardware Profiling: Use PyTorch Profiler and NVIDIA Nsight Systems to collect performance profiles, identify bottlenecks, and debug the NVIDIA compute/networking stack.
  • Monitoring & Observability: Develop real-time dashboards and alerts to monitor system health, model startup times, and runtime performance.
  • Continuous Integration: Automate performance testing via CI/CD pipelines to catch regressions in model setups before they hit production.
  • Optimization Automation: Build tools to find the "Pareto frontier": the set of configurations offering the best achievable trade-offs (latency vs. cost vs. quality) for a given model and workload.
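
To give a feel for the "Pareto frontier" idea mentioned above, here is a minimal sketch in Python. It is an illustration only, not Baseten's tooling: the `Config` class, the configuration names, and all numbers are made-up assumptions. A configuration stays on the frontier if no other configuration is at least as good on latency, cost, and quality while being strictly better on at least one of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One hypothetical serving configuration and its measured trade-offs."""
    name: str          # illustrative label, not a real deployment name
    latency_ms: float  # lower is better
    cost_usd: float    # lower is better (e.g. per million tokens)
    quality: float     # higher is better (e.g. benchmark accuracy)


def dominates(a: Config, b: Config) -> bool:
    """True if `a` is no worse than `b` on every axis and strictly better on one."""
    no_worse = (
        a.latency_ms <= b.latency_ms
        and a.cost_usd <= b.cost_usd
        and a.quality >= b.quality
    )
    strictly_better = (
        a.latency_ms < b.latency_ms
        or a.cost_usd < b.cost_usd
        or a.quality > b.quality
    )
    return no_worse and strictly_better


def pareto_frontier(configs: list[Config]) -> list[Config]:
    """Keep every configuration that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up measurements, purely for illustration.
candidates = [
    Config("fp16-tp1", latency_ms=120.0, cost_usd=1.00, quality=0.82),
    Config("fp8-tp1", latency_ms=90.0, cost_usd=0.70, quality=0.81),
    Config("int4-tp1", latency_ms=80.0, cost_usd=0.50, quality=0.74),
    Config("fp16-tp2", latency_ms=130.0, cost_usd=1.90, quality=0.82),
]

frontier = pareto_frontier(candidates)
print([c.name for c in frontier])
```

In this toy data, the fourth configuration is slower and more expensive than the first at the same quality, so it drops out. The other three each win on at least one axis, which is why the frontier is a set of trade-offs rather than a single winner.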
What We're Looking For
  • A Love for Systems & Hardware: You aren't just interested in the AI; you want to understand GPU memory subsystems, InfiniBand, and how data moves across a cluster.
  • An Automation Mindset: You believe that if a task has to be done twice, it should be scripted. You have a passion for stress testing and fuzz testing to find the "breaking point" of a system.
  • Mathematical Curiosity: A desire to understand the underlying math of Transformers and how it translates into FLOPs and memory requirements.
  • Interest in Optimization: You are excited to learn about (or already play with) quantization, speculative decoding, disaggregated serving, and kernel-level optimizations.
  • Technical Toolkit: Familiarity with Python, and an eagerness to master the NVIDIA software stack. C++ familiarity is good to have.
Why This Role
  • Direct Impact: Your tools will set the bar for what counts as "good" performance for our customers.
  • Deep Learning (Literally): You will gain world-class expertise in GPU orchestration and LLM inference that few engineers in the industry possess.
  • High Ownership: As part of a small team of early-career engineers led by experts, you will have the autonomy to build tools from scratch and contribute to open-source projects.
Benefits
  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
  • Paid parental leave
  • Fertility and family-building stipend through Carrot
  • Company-facilitated 401(k)
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).

Vacancy posted 3 days ago