ML Ops Engineer — Agentic AI Lab (Founding Team)

Fabrion

Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary + meaningful equity (founding tier)

Backed by 8VC, we’re building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

Our AI Lab is pioneering the future of intelligent infrastructure through open-source LLMs, agent-native pipelines, retrieval-augmented generation (RAG), and knowledge-graph-grounded models. We’re hiring an ML Ops Engineer to be the glue between ML research and production systems, responsible for automating the model training, deployment, versioning, and observability pipelines that power our agents and AI data fabric.

You’ll work across compute orchestration, GPU infrastructure, fine-tuned model lifecycle management, model governance, and security e…

Responsibilities

  • Build and maintain secure, scalable, and automated pipelines for:
    • LLM fine-tuning, SFT, LoRA, RLHF, DPO training
    • RAG embedding pipelines with dynamic updates
    • Model conversion, quantization, and inference rollout
  • Manage hybrid compute infrastructure (cloud, on-prem, GPU clusters) for training and inference workloads using Kubernetes, Ray, and Terraform
  • Containerize models and agents using Docker, with reproducible builds and CI/CD via GitHub Actions or ArgoCD
  • Implement and enforce model governance: versioning, metadata, lineage, reproducibility, and evaluation capture
  • Create and manage evaluation and benchmarking frameworks (e.g. OpenLLM-Evals, RAGAS, LangSmith)
  • Integrate with security and access control layers (OPA, ABAC, Keycloak) to enforce model policies per tenant
  • Instrument observability for model latency, token usage, performance metrics, error tracing, and drift detection
  • Support deployment of agentic apps with LangGraph, LangChain, and custom inference backends (e.g. vLLM, TGI, Triton)

Desired Experience

Model Infrastructure:
  • 4+ years in MLOps, ML platform engineering, or infra-focused ML roles
  • Deep familiarity with model lifecycle management tools: MLflow, Weights & Biases, DVC, HuggingFace Hub
  • Experience with large model deployments (open-source LLMs preferred): LLaMA, Mistral, Falcon, Mixtral
  • Comfortable with tuning libraries (HuggingFace Trainer, DeepSpeed, FSDP, QLoRA)
  • Familiarity with inference serving: vLLM, TGI, Ray Serve, Triton Inference Server

Automation + Infra:
  • Proficient with Terraform, Helm, K8s, and container orchestration
  • Experience with CI/CD for ML (e.g. GitHub Actions + model checkpoints)
  • Managed hybrid workloads across GPU cloud (Lambda, Modal, HuggingFace Inference, Sagemaker)
  • Familiar with cost optimization (spot instance scaling, batch prioritization, model sharding)

Agent + Data Pipeline Support:
  • Familiarity with LangChain, LangGraph, LlamaIndex or similar RAG/agent orchestration tools
  • Built embedding pipelines for multi-source documents (PDF, JSON, CSV, HTML)
  • Integrated with vector databases (Weaviate, Qdrant, FAISS, Chroma)

Security & Governance:
  • Implemented model-level RBAC, usage tracking, audit trails
  • Integrated with API rate limits, tenant billing, and SLA observability
  • Experience with policy-as-code systems (OPA, Rego) and access layers

Preferred Stack

  • LLM Ops: HuggingFace, DeepSpeed, MLflow, Weights & Biases, DVC
  • Infra: Kubernetes (GKE/EKS), Ray, Terraform, Helm, GitHub Actions, ArgoCD
  • Serving: vLLM, TGI, Triton, Ray Serve
  • Pipelines: Prefect, Airflow, Dagster
  • Monitoring: Prometheus, Grafana, OpenTelemetry, LangSmith
  • Security: OPA (Rego), Keycloak, Vault
  • Languages: Python (primary), Bash, optionally Rust or Go for tooling

Mindset & Culture Fit

  • Builder's mindset with startup autonomy: you automate what slows you down
  • Obsessive about reproducibility, observability, and traceability
  • Comfortable with a hybrid team of AI researchers, DevOps, and backend engineers
  • Interested in aligning ML systems to product delivery, not just papers
  • Bonus: experience with SOC2, HIPAA, or GovCloud-grade model operations

What We’re Looking For

Experience:
  • 5+ years as a full stack or backend engineer
  • Experience owning and delivering production systems end-to-end
  • Prior experience with modern frontend frameworks (React, Next.js)
  • Familiarity with building APIs, databases, cloud infrastructure, or deployment workflows at scale
  • Comfortable working in early-stage startups or autonomous roles; prior experience as a founder, founding engineer, or a 0-1 pre-seed startup is a big plus

Mindset:
  • Comfortable with ambiguity, eager to prototype and iterate quickly
  • Strong sense of ownership: prefers to build systems rather than wait for tickets
  • Enjoys thinking about architecture, performance, and tradeoffs at every level
  • Clear communicator and pragmatic team player
  • Values equity and impact over prestige or hierarchy
  • Prior startup or founding team experience

Why This Role Matters

Your work will enable models and agents to be trained, evaluated, deployed, and governed at scale, across many tenants, models, and tasks. This is the backbone of a secure, reliable, and scalable AI-native enterprise system.

If you dream about using AI to solve some really hard real world problems, we would love to hear from you.

Vacancy posted 1 day ago