Machine Learning Engineer - Inference Optimization | Experienced Hire

Susquehanna International Group LLP

Overview

We are looking for a Machine Learning Engineer focused on low-latency inference optimization to help build, tune, and productionize high-performance model serving systems. This role sits at the intersection of machine learning, systems engineering, and GPU performance. You will work on inference workloads where latency, throughput, reliability, and hardware efficiency all matter, and where a deep understanding of modern inference runtimes can meaningfully improve production outcomes.

You will work closely with quantitative researchers and engineers to understand model structure, identify inference bottlenecks, and turn research ideas into efficient production systems. The work focuses on transformer-style architectures and structured inference workloads, though it may involve other types of models. You will evaluate and tune frameworks and related serving or compilation systems, while also reasoning about GPU execution, memory layout, batching strategies, precision tradeoffs, and end-to-end latency.

What you'll do

- Design, build, and optimize low-latency inference systems for production machine learning workloads.
- Profile model inference pipelines across model execution, runtime configuration, batching, memory movement, serialization, networking, and I/O.
- Evaluate, integrate, and tune inference runtime systems.
- Improve latency, throughput, and GPU utilization for production inference workloads.
- Build and support benchmarking and profiling tools to compare model variants, hardware targets, runtime configurations, and deployment strategies.
- Debug performance issues involving GPU memory, compute saturation, kernel behavior, CPU/GPU coordination, data movement, and serving-layer overhead.
- Help shape model and system design choices so that research models are efficient to deploy under real latency constraints.
- Where necessary, collaborate with lower-level systems or GPU specialists on custom operators, kernel-level optimization, or hardware-specific performance work.

What we’re looking for

- Experience deploying, optimizing, or operating machine learning inference workloads in production or production-like environments.
- Programming experience in a language such as Python, Java, or C#, and in at least one systems language such as C, C++, Rust, or Go.
- Solid understanding of modern ML frameworks such as PyTorch, including model execution, export, tracing, compilation, and performance profiling.
- Ability to reason about latency, throughput, batching, memory use, GPU utilization, and reliability under real workloads.
- Strong practical judgment around tradeoffs between model quality, latency, throughput, implementation complexity, and maintainability.

Preferred qualifications

- Experience optimizing inference for latency-sensitive or high-throughput applications.
- Experience with model optimization techniques such as quantization, pruning, distillation, operator fusion, graph lowering, custom operators, or model compilation.
- Exposure to CUDA, Triton language, ROCm, PTX, CuTe, CUTLASS, FlashInfer, or similar low-level GPU programming tools.
- Experience running inference workloads on Kubernetes or GPU clusters, including scheduling, autoscaling, observability, and resource management.
- Background in mathematics, physics, computer science, engineering, statistics, quantitative finance, or another technical field.
- Demonstrated ability to improve real-world inference performance beyond a baseline framework implementation.

Vacancy posted 2 days ago

