Machine Learning Infrastructure Engineer
TRM Labs
Build a Safer World.

TRM Labs provides blockchain analytics and AI solutions to help law enforcement, national security agencies, financial institutions, and cryptocurrency businesses detect, investigate, and disrupt crypto-related fraud and financial crime. TRM’s blockchain intelligence and AI platforms trace the source and destination of funds, identify illicit activity, build cases, and construct an operating picture of threats. TRM is trusted by leading agencies and businesses worldwide to enable a safer, more secure world for all.

At TRM, we’re on a mission to build a safer financial system for billions of people around the world. Our next-generation platform, which combines threat intelligence with machine learning, enables institutions and governments to detect cryptocurrency fraud and financial crime at an unprecedented scale.

As a Senior Software Engineer, ML Infrastructure at TRM Labs, you will collaborate with data scientists, engineers, and product managers to design and operate scalable GPU-backed infrastructure that powers TRM’s AI systems. You will work at the intersection of distributed systems, cloud infrastructure, GPU performance engineering, and applied machine learning, building the foundation that enables high-throughput, production-grade ML workloads.

The Impact You’ll Have Here
- Design and operate GPU cluster infrastructure. Build and manage GPU-backed environments in cloud settings, including orchestration, autoscaling, resource isolation, and workload management across multiple concurrent models and users.
- Optimize high-throughput inference. Implement and tune serving systems that maximize token throughput, batching efficiency, GPU occupancy, and cost effectiveness across interactive and batch workloads.
- Enable distributed inference strategies. Support and operationalize model parallelism, tensor parallelism, and other distributed serving patterns for large-scale models.
- Implement model optimization and compilation workflows. Integrate and optimize acceleration stacks such as TensorRT, ONNX Runtime, vLLM, FlashAttention, and related tooling to improve performance and reduce inference cost.
- Schedule heterogeneous workloads. Design systems that manage multiple models, multiple users, and mixed workload types across heterogeneous accelerators (e.g., NVIDIA GPUs, Inferentia), ensuring predictable performance under varying demand.
- Build observability into ML infrastructure. Instrument systems to measure GPU load, memory utilization, batching efficiency, queue depth, and token throughput, and use data to continuously improve performance and reliability.
- Partner across engineering teams. Work closely with infrastructure, ML, and product teams to ensure models transition smoothly from experimentation to production-grade, highly available services.

What We’re Looking For
- Bachelor’s degree (or equivalent) in Computer Science or related field.
- 5+ years of experience building and operating distributed systems or infrastructure in production environments.
- Experience deploying and operating ML/LLM inference workloads on GPU clusters in cloud environments (AWS and/or GCP).
- Deep understanding of high-throughput inference systems, including batching strategies, token throughput optimization, and the trade-offs between latency, throughput, and cost.
- Experience with one or more ML serving frameworks such as Triton Inference Server, vLLM, Ray Serve, ONNX Runtime, or HuggingFace Optimum.
- Experience optimizing GPU load, memory efficiency, and performance bottlenecks in production systems.
- Familiarity with distributed inference strategies including model parallelism and tensor parallelism.
- Experience working with Kubernetes or equivalent orchestration systems in cloud environments.
- Familiarity with heterogeneous accelerators (e.g., Inferentia) is a plus.
- CUDA familiarity and experience debugging GPU-related issues is a plus.
- Adaptable. Goals can change fast. You anticipate and react quickly.
- Autonomous. You own what you work on. You move fast and get things done.
- Excellent communication. You communicate complex ideas effectively to both technical and non-technical audiences, verbally and in writing.
- Collaborative. You work effectively in a cross-functional team and with people at all levels in an organization.

Life at TRM
We are building a safer world. That promise shows up in how we work every day.

TRM runs fast. Really fast. We’re a high-velocity, high-ownership team that expects clarity, follow-through, and impact. People who thrive here are energized by hard problems, experimentation, and direct feedback. If something takes months elsewhere, it often ships here in days.

That pace isn’t for everyone. If you are optimizing primarily for consistent work-life balance, use the interview process to pressure-test fit. We want teammates who thrive here, not just survive here.

AI Fluency at TRM
AI fluency is a baseline expectation at TRM. We believe AI meaningfully changes how top performers operate. We expect every team member to use AI to accelerate and reimagine their craft, not just automate surface tasks.

At TRM, AI fluency means you are among the top 10 percent of operators in your function in how you apply AI to:
- Accelerate repeatable workflows
- Structure and solve problems
- Improve output quality
- Increase speed and leverage

You will be evaluated on applied AI fluency during the interview process.

Leadership Principles
- Impact-Oriented Trailblazer: We put customers first and move with speed, focus, and adaptability. We treat every plan like an experiment: test, ship, measure, and iterate quickly.
- Master Craftsperson: We care deeply about our craft. We balance speed with high standards, own outcomes end-to-end, and invest in getting better every day.
- Inspiring Colleague: We add clarity and energy, not noise. We bring humility, candor, and a one-team mindset, giving and receiving feedback to make the team stronger.

The impact you will have
- Driving critical investigations that can’t wait for typical business hours.
- Shipping products in days when others would schedule quarters.
- Partnering with teams across time zones to deliver insights while the story is still unfolding.
- Building new solutions from first principles when the playbook doesn’t yet exist.
- Protecting victims and customers by tracing illicit activity and disrupting criminal networks.

Join our Mission
At TRM we care deeply about our craft. We are looking for individuals who want their work to matter, who experiment with speed and rigor, and who take pride in building a safer world for billions of people. If you’re excited by TRM’s mission but don’t check every box, we encourage you to apply: we hire for slope, judgment, and the will to learn fast.

TRM is a Series C company with $220M in total funding, backed by Blockchain Capital, Goldman Sachs, Bessemer, Y Combinator, Thoma Bravo, and others. Headquartered in San Francisco, TRM operates as a distributed-first company with hubs in Los Angeles, San Francisco, New York, Washington D.C., London, and Singapore.

Privacy Policy And Additional Information
By submitting your application, you are agreeing to allow TRM to process your personal information in accordance with the TRM Privacy Policy. Our typical hiring cycles for specialized roles span 24 to 36 months. Accordingly, we retain your personal information for up to 36 months to evaluate your application and to consider you for current and future employment opportunities, unless you request earlier deletion or a different retention period is required or permitted by law.

To notify TRM Labs that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this form.

Recruitment agencies
TRM Labs does not accept unsolicited agency resumes. Please do not forward resumes to TRM employees. TRM Labs is not responsible for any fees related to unsolicited resumes and will not pay fees to any third-party agency or company without a signed agreement.

Learn More: Company Values | Interviewing | FAQs

