Senior ML Performance Engineer - GPU & Inference

Modal Labs

About Us:

Modal provides the infrastructure foundation for AI teams. With instant GPU access, sub-second container startups, and native storage, Modal makes it simple to train models, run batch jobs, and serve low-latency inference. We have thousands of customers who rely on us for production AI workloads, including Lovable, Scale AI, Substack, and Suno.

We're a fast-growing team based out of NYC, SF, and Stockholm. We've hit 9-figure ARR and recently raised a Series B at a $1.1B valuation. Our investors include Lux Capital, Redpoint Ventures, Amplify Partners, and Elad Gil.

Working at Modal means joining one of the fastest-growing AI infrastructure organizations at an early stage, with many opportunities to grow within the company. Our team includes creators of popular open-source projects (e.g. Seaborn, Luigi), academic researchers, international olympiad medalists, and experienced engineering and product leaders with decades of experience.

The Role

We are looking for strong engineers with experience in making ML systems performant at scale. If you are interested in contributing to open-source projects and Modal’s container runtime to push language and diffusion models towards higher throughput and lower latency, we’d love to hear from you!

Requirements

  • 5+ years of experience writing high-quality, high-performance code.
  • Experience working with torch, high-level ML frameworks, and inference engines (vLLM or TensorRT).
  • Familiarity with Nvidia GPU architecture and CUDA.
  • Experience with ML performance engineering (tell us a story about boosting GPU performance: debugging SM occupancy issues, rewriting an algorithm to be compute-bound, eliminating host overhead, etc.).
  • Nice-to-have: familiarity with low-level operating system foundations (Linux kernel, file systems, containers, etc.).

Vacancy posted 5 days ago
