Senior or Staff ML Systems Engineer, LLMs

TRM Labs

Build a Safer World

TRM Labs provides AI-powered intelligence solutions that help public and private sector agencies investigate and disrupt crime. TRM's platforms enable investigators to trace illicit activity, build cases, and construct operating pictures of threat networks. Leading agencies and businesses worldwide rely on TRM to make the world safer and more secure.

The AI Engineering Team is chartered with enabling next-generation AI applications, with a special focus on Large Language Models (LLMs) and agentic systems. Our mission is to build robust pipelines, high-performance infrastructure, and operational tooling that allow AI systems to be deployed with speed, safety, and scale. We manage petabyte-scale pipelines, serve models with millisecond-level latency, and provide the observability and governance needed to make AI production-ready. We're also deeply involved in evaluating and integrating cutting-edge tools in the LLM and agent space, including open-source stacks, vector databases, evaluation frameworks, and orchestration tools that unlock TRM's ability to innovate faster than the market.

About the Role

As a Senior or Staff ML Systems Engineer – LLM, you'll be at the core of building and scaling the technical infrastructure for AI/ML systems. You will:

- Build reusable CI/CD workflows for model training, evaluation, and deployment, integrating Langfuse, GitHub Actions, and experiment tracking.
- Automate model versioning, approval workflows, and compliance checks across environments.
- Build out a modular and scalable AI infrastructure stack, including vector databases, feature stores, model registries, and observability tooling.
- Partner with engineering and data science to embed AI models and agents into real-time applications and workflows.
- Continuously evaluate and integrate state-of-the-art AI tools (e.g. LangChain, LlamaIndex, vLLM, MLflow, BentoML).
- Drive AI reliability and governance, enabling experimentation while ensuring compliance, security, and uptime.
- Build and enhance AI/ML model performance.
- Ensure data accuracy, consistency, and reliability, leading to better model training and inference.
- Deploy infrastructure to support offline and online evaluation of LLMs and agents, including regression testing, cost monitoring, and human-in-the-loop workflows.
- Enable researchers to iterate quickly by providing sandboxes, dashboards, and reproducible environments.

What We're Looking For

- Write high-quality, maintainable software, primarily in Python, but we value engineering ability over language familiarity.
- Have a strong background in scalable infrastructure, including:
  - Containerization and orchestration (e.g. Docker, Kubernetes)
  - Infrastructure-as-code and deployment (e.g. Terraform, CI/CD pipelines)
  - Monitoring and logging frameworks (e.g. Datadog, Prometheus, OpenTelemetry)
- Understand and implement ML Ops best practices, including:
  - Model versioning and rollback strategies
  - Automated evaluation and drift detection
  - Scalable model and agent serving infrastructure (e.g. vLLM, Triton, BentoML)
- Deploy and maintain LLM and agentic workflows in production, including:
  - Monitoring cost, latency, and performance
  - Capturing traces for analysis and debugging
  - Optimizing prompt/response flows with real-time data access
- Demonstrate strong ownership and pragmatism, balancing infrastructure elegance with iterative delivery and measurable impact.

Learn about TRM Speed in this position

Rapid Issue Resolution. TRM Engineers identify and resolve critical onsite issues in minutes to hours, not weeks. We create virtual war rooms, implement fixes, and share lessons with both customer stakeholders and internal teams within 48 hours.

Navigating Bureaucracy. We anticipate and address procedural hurdles, build trust with key stakeholders, and find alternative pathways to approvals. This keeps projects moving even in complex environments.
Efficient Knowledge Transfer. Engineers document and share updates in real time, ensuring the entire team, onsite and remote, has full visibility into plans, blockers, and resolutions. Knowledge-sharing sessions and clear documentation reduce friction and accelerate delivery.

About TRM's Engineering Levels

Engineer: Responsible for helping to define project milestones and executing small decisions independently with the appropriate tradeoffs between simplicity, readability, and performance. Provides mentorship to junior engineers, and enhances operational excellence through tech debt reduction and knowledge sharing.

Senior Engineer: Successfully designs and documents system improvements and features for an OKR/project from the ground up. Consistently delivers efficient and reusable systems, optimizes team throughput with appropriate tradeoffs, mentors team members, and enhances cross-team collaboration through documentation and knowledge sharing.

Staff Engineer: Drives scoping and execution of one or more OKRs/projects that impact multiple teams. Partners with stakeholders to set the team vision and technical roadmaps for one or more products. Is a role model and mentor to the entire engineering organization. Ensures system health and quality with operational reviews, testing strategies, and monitoring rigor.

Life at TRM

We are building a safer world. That promise shows up in how we work every day.

TRM moves quickly. We are a high-velocity, high-ownership team that expects clarity, follow-through, and impact. People who thrive here are energized by hard problems, experimentation, and continuous feedback. If something takes months elsewhere, it will ship here in days.

Our work sits at the intersection of AI, national security, and fighting crime. The problems are complex, the stakes are real, and the environment evolves quickly. The pace and intensity of the work reflect the importance of the mission.
As a result, the way we operate requires a high level of ownership, adaptability, collaboration, and creative problem-solving.

At TRM, you should expect:

- Priorities and targets to change quickly as we experiment and iterate
- Work that often requires operating with a high degree of ambiguity
- A high level of personal ownership and accountability
- Close collaboration across teams and functions
- Frequent, high-touch communication
- Creative problem solving and out-of-the-box thinking
- A pace that rewards urgency, adaptability, and outcomes

This environment is energizing for people who enjoy building, solving hard problems, and making progress in situations that are not always fully defined. It also requires comfort navigating ambiguity, adjusting course as new information emerges

Vacancy posted 3 days ago
