Founding Engineer - ML Performance

URun

The problem we saw

Most AI infrastructure is built for batch: send a query, wait, get a response, reset. Powerful, but transactional. AI is becoming interactive (sessions that hold state, models that stay alive between turns, generation that responds as it runs), and the infrastructure to deliver that at scale doesn't really exist yet. The bottleneck isn't the models anymore. It's the infrastructure underneath them.

What we're building to fix it

uRun is the inference cloud for interactive AI: the compute layer that makes real-time, stateful inference possible at scale. We came out of stealth in April 2026, are backed by top-tier investors, and are founded by Keegan McCallum, who scaled inference infrastructure for some of the most demanding generative AI workloads in production. We're an infrastructure company. We build the layer that model labs, builders, and research teams ship on top of.

Where you come in

Performance is uRun's core differentiator. We're not chasing incremental gains; we're building infrastructure that runs 10‑100x faster than the status quo. As our ML Performance Engineer, you will be the person who makes that true. This is a founding technical hire. You will write custom CUDA kernels, push GPU utilization to its limits, and own inference latency end‑to‑end across the stack. You will work directly with the founding team on the hardest performance problems in production AI infrastructure, and your fingerprints will be on everything we ship.

What you'll actually be doing day‑to‑day

  • Write custom CUDA kernels that unlock performance headroom unavailable through off‑the‑shelf frameworks
  • Optimize model inference end‑to‑end, targeting sub‑50ms latency across our inference platform
  • Drive 10x performance improvements across the stack: memory bandwidth, kernel fusion, operator scheduling, and beyond
  • Implement zero‑copy distributed memory optimizations across multi‑GPU and multi‑node environments
  • Own GPU utilization and memory management, squeezing every available FLOP out of the hardware we run
  • Profile, benchmark, and instrument the full inference pipeline to find and eliminate bottlenecks systematically
  • Set the performance engineering bar for the team: define what fast looks like and build the tooling to measure it

What skills you need for the journey

  • Deep, hands‑on CUDA expertise: you have written custom kernels in production, not just called into cuBLAS
  • Strong background in model inference and post‑training optimization at scale
  • Fluency in GPU memory hierarchy, warp scheduling, kernel fusion, and hardware‑aware algorithm design
  • Experience profiling and benchmarking complex inference pipelines: you know where the time goes and how to get it back
  • Able to operate at the frontier with minimal guidance: you identify the problem, design the approach, and ship the fix

Things that will give you an edge

  • Public work in GPU optimization or inference efficiency: open source contributions, a published paper, or a side project that shows your depth (vLLM, Flash‑Attention, TensorRT‑LLM, PyTorch, or equivalent)
  • Experience with hardware‑aware optimization frameworks: CuTe, Triton, TileLang, or similar
  • Familiarity with distributed memory and communication primitives: NCCL, InfiniBand, NVLink, RoCE
  • Contributions to or deep familiarity with PyTorch Distributed, Ray core, or similar systems
  • Experience optimizing for video generation or other high‑throughput, latency‑sensitive generative workloads
  • Prior work at an inference‑focused company or research lab pushing the boundary of what GPU hardware can do

What you'll get in return

Competitive salary and meaningful equity in an early‑stage AI infrastructure company. The band above is our target; for an exceptional candidate we'll go higher. Equity is real: you're early, and the grant reflects that.

  • Health, dental, and vision: full coverage
  • 401(k): company‑supported retirement savings
  • FSA/HSA: flexible spending accounts for healthcare costs
  • Paid time off: we trust you to manage your time
  • Top‑tier tooling: access to the best AI tools available, including Claude, Codex, Kimi, and whatever else helps you move faster
  • MacBook Pro and AirPods: the hardware you need, on us

How we work (and what that feels like day‑to‑day)

We build the stage, not the show. We're an infrastructure company, a developer‑tools company, and a production partner for model labs, and focus is a deliberate choice we've made and hold to. Day‑to‑day, that means a small team, a high bar, and real ownership. You won't wait for permission or inherit a backlog of someone else's decisions. In a founding security role, the function is what you make it. It also means ambiguity: priorities shift, not everything is documented, and you'll often be the person who decides what "secure enough, for now" means. That suits some people and not others, and we'd rather you know that before you apply.

Vacancy posted 3 days ago