ML Ops Engineer — Agentic AI Lab (Founding Team)

Fabrion

Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary + meaningful equity (founding tier)

Backed by 8VC, we’re building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

Our AI Lab is pioneering the future of intelligent infrastructure through open-source LLMs, agent-native pipelines, retrieval-augmented generation (RAG), and knowledge-graph-grounded models.

We’re hiring an ML Ops Engineer to be the glue between ML research and production systems, responsible for automating the model training, deployment, versioning, and observability pipelines that power our agents and AI data fabric.

You’ll work across compute orchestration, GPU infrastructure, fine-tuned model lifecycle management, model governance, and security.

Responsibilities

- Build and maintain secure, scalable, and automated pipelines for:
  - LLM fine-tuning, SFT, LoRA, RLHF, DPO training
  - RAG embedding pipelines with dynamic updates
  - Model conversion, quantization, and inference rollout
- Manage hybrid compute infrastructure (cloud, on-prem, GPU clusters) for training and inference workloads using Kubernetes, Ray, and Terraform
- Containerize models and agents using Docker, with reproducible builds and CI/CD via GitHub Actions or ArgoCD
- Implement and enforce model governance: versioning, metadata, lineage, reproducibility, and evaluation capture
- Create and manage evaluation and benchmarking frameworks (e.g. OpenLLM-Evals, RAGAS, LangSmith)
- Integrate with security and access control layers (OPA, ABAC, Keycloak) to enforce model policies per tenant
- Instrument observability for model latency, token usage, performance metrics, error tracing, and drift detection
- Support deployment of agentic apps with LangGraph, LangChain, and custom inference backends (e.g. vLLM, TGI, Triton)

Desired Experience

Model Infrastructure:
- 4+ years in MLOps, ML platform engineering, or infra-focused ML roles
- Deep familiarity with model lifecycle management tools: MLflow, Weights & Biases, DVC, HuggingFace Hub
- Experience with large model deployments (open-source LLMs preferred): LLaMA, Mistral, Falcon, Mixtral
- Comfortable with tuning libraries (HuggingFace Trainer, DeepSpeed, FSDP, QLoRA)
- Familiarity with inference serving: vLLM, TGI, Ray Serve, Triton Inference Server

Automation + Infra:
- Proficient with Terraform, Helm, K8s, and container orchestration
- Experience with CI/CD for ML (e.g. GitHub Actions + model checkpoints)
- Managed hybrid workloads across GPU cloud (Lambda, Modal, HuggingFace Inference, SageMaker)
- Familiar with cost optimization (spot instance scaling, batch prioritization, model sharding)

Agent + Data Pipeline Support:
- Familiarity with LangChain, LangGraph, LlamaIndex, or similar RAG/agent orchestration tools
- Built embedding pipelines for multi-source documents (PDF, JSON, CSV, HTML)
- Integrated with vector databases (Weaviate, Qdrant, FAISS, Chroma)

Security & Governance:
- Implemented model-level RBAC, usage tracking, audit trails
- Integrated with API rate limits, tenant billing, and SLA observability
- Experience with policy-as-code systems (OPA, Rego) and access layers

Preferred Stack

- LLM Ops: HuggingFace, DeepSpeed, MLflow, Weights & Biases, DVC
- Infra: Kubernetes (GKE/EKS), Ray, Terraform, Helm, GitHub Actions, ArgoCD
- Serving: vLLM, TGI, Triton, Ray Serve
- Pipelines: Prefect, Airflow, Dagster
- Monitoring: Prometheus, Grafana, OpenTelemetry, LangSmith
- Security: OPA (Rego), Keycloak, Vault
- Languages: Python (primary), Bash, optionally Rust or Go for tooling

Mindset & Culture Fit

- Builder's mindset with startup autonomy: you automate what slows you down
- Obsessive about reproducibility, observability, and traceability
- Comfortable with a hybrid team of AI researchers, DevOps, and backend engineers
- Interested in aligning ML systems to product delivery, not just papers
- Bonus: experience with SOC2, HIPAA, or GovCloud-grade model operations

What We’re Looking For

Experience:
- 5+ years as a full stack or backend engineer
- Experience owning and delivering production systems end-to-end
- Prior experience with modern frontend frameworks (React, Next.js)
- Familiarity with building APIs, databases, cloud infrastructure, or deployment workflows at scale
- Comfortable working in early-stage startups or autonomous roles; prior experience as a founder, founding engineer, or at a 0-1 pre-seed startup is a big plus

Mindset:
- Comfortable with ambiguity, eager to prototype and iterate quickly
- Strong sense of ownership: prefers to build systems rather than wait for tickets
- Enjoys thinking about architecture, performance, and tradeoffs at every level
- Clear communicator and pragmatic team player
- Values equity and impact over prestige or hierarchy
- Prior startup or founding team experience

Why This Role Matters

Your work will enable models and agents to be trained, evaluated, deployed, and governed at scale, across many tenants, models, and tasks. This is the backbone of a secure, reliable, and scalable AI-native enterprise system.

If you dream about using AI to solve some really hard real-world problems, we would love to hear from you.

Vacancy posted 1 day ago

