Founding ML Infra Engineer: Scale Real-Time Inference

U-Run

URun in San Francisco is searching for an ML Infrastructure and Platform Engineer. In this role, you will lead the architecture and scaling of our GPU compute platform from the ground up, ensuring high availability and low-latency inference. This is a founding technical hire with end-to-end ownership across the infrastructure stack, significant responsibility, and room to grow. URun offers a competitive salary and equity, along with full health coverage and flexibility.

Vacancy posted 1 day ago

Similar jobs that could be interesting for you
Based on the Founding ML Infra Engineer: Scale Real-Time Inference in San Francisco, CA vacancy

Edge ML Infra Engineer for Real-Time Perception
...A cutting-edge technology company in San Francisco is seeking an ML Infrastructure Engineer to build and scale machine learning systems for real-time perception and inference. This role involves designing scalable training pipelines for computer vision models, optimizing...
Suggested
Specter Services LLC
San Francisco, CA
1 day ago
Founding ML Engineer — Build Real‑Time AI Data Gateways
Crustdata is looking for a Founding ML Engineer in San Francisco to own the research and engineering... ...turn unstructured, multilingual web-scale data into actionable insights. The... ...Join us to help redefine how AI agents access real-time data.
Suggested
Crustdata
San Francisco, CA
2 days ago
ML Infra Engineer — Scale Real‑World AI (SF On-site)
$250k - $350k
...them actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world problem, understanding what... ...sites using wearable devices, large-scale video, and AI. This isn’t clean... ...of hours of data Training and inference systems for multimodal / LLM-based...
Suggested
Trades Workforce Solutions
San Francisco, CA
20 hours ago
Founding ML Research Engineer Real-Time Voice AI
$225k - $400k
...A pioneering AI research firm is seeking a Founding Machine Learning Research Engineer in San Francisco to develop innovative AI systems for real-time voice agents. This high-impact role requires a strong ML research background and proficiency in PyTorch. Responsibilities...
Suggested
Retell AI
San Francisco, CA
1 day ago
ML Infra Engineer: Scale GPU Training & Inference
...Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,...
Suggested
Reducto
San Francisco, CA
1 day ago
ML Ops Engineer Agentic AI Lab (Founding Team)
...About the Role ML Ops Engineer — Agentic AI Lab (Founding Team) — Location... ...Area — Type: Full-Time — Compensation:... ...quantization, and inference rollout Manage... ...engineering, or infra-focused ML roles... ...(spot instance scaling, batch prioritization... ...really hard real world problems –...
Full time
Fabrion
San Francisco, CA
5 days ago
Senior ML Infra Engineer - Real-Time Data Systems
...Arena Intelligence, Inc. in San Francisco, CA, is seeking a Senior Software Engineer (Infrastructure) to lead the design of scalable data and API systems. The role involves architecting real-time data pipelines, ensuring performance and reliability, and mentoring...
Arena Intelligence, Inc.
San Francisco, CA
1 day ago
Founding ML Inference Engineer — Ultra-Low Latency AI
...media technology company in San Francisco is seeking a Founding Engineer specializing in ML Inference. This highly technical role requires expertise in the... .... The ideal candidate will drive innovations in real-time model performance, design in-house inference runtimes...
Relocation package
Reactor
San Francisco, CA
3 days ago
Founding ML Engineer
...click to AI agents doing real-time targeted crawling from... ...boundaries of what our ML systems can do. We’re hiring a Founding ML Engineer to own the research and... ..., or data generation at scale Strong Python and PyTorch... ...Given raw people data, infer the org chart — who reports...
Shift work
Crustdata
San Francisco, CA
2 days ago
Senior ML Engineer Real-Time Personalization & Equity
$244k - $320k
...notifications, our AI-powered personalization engine delivers bespoke experiences that... ..., revenue, and loyalty through real-time behavioral insights. Recognized as... ...play a critical role in building, scaling, and operating production-grade ML systems that drive real-time...
Full time
MetaProp.vc
San Francisco, CA
1 day ago
Founding ML Engineer Real-Time In-Browser AI
...A leading tech startup in San Francisco seeks founding Machine Learning Engineers (MLEs) to enhance core action models for their proactive automation... ...improving model accuracy and speed. This role demands strong ML skills and experience with LLMs, emphasizing...
Composite.ai
San Francisco, CA
1 day ago
Senior ML Engineer - Real-Time AdTech & Bidding
$180k - $220k
...seeking a Senior Machine Learning Engineer for their Applied Data Science... ...collaboration with data scientists on real-time optimization solutions. Required... ...in AdTech, and expertise in AI/ML technologies like Java, Python, and large-scale frameworks. The position offers a...
Nexxen
San Francisco, CA
2 days ago
Founding ML Infrastructure Engineer
...to deliver that at scale doesn't really exist... ...fix it uRun is the inference cloud for... ...compute layer that makes real‑time, stateful inference... ...investors, and are founded by Keegan McCallum,... ...infrastructure. As our ML Infrastructure and Platform Engineer, you will own the...
Flexible hours
Shift work
U-Run
San Francisco, CA
1 day ago
Real-Time Inference & Model Serving Engineer (Equity)
$220k - $320k
...ML Model Serving Engineer Want to build the layer that actually makes AI usable in real time? You’ll join a team focused on inference, where performance is the product. This is about delivering low... ...respond instantly, reliably, and at scale. That means solving hard...
3 days per week
Trades Workforce Solutions
San Francisco, CA
1 day ago
Founding Engineer Real-Time AI & Systems (SF Onsite)
...David Joseph & Company is seeking a Founding Engineer for Empathic in San Francisco. This role offers 1%–6% equity and involves optimizing inference costs and owning backend infrastructure, merging ML with real-time systems. The ideal candidate has 3+ years of experience...
David Joseph & Company
San Francisco, CA
2 days ago
Founding Distributed Systems Engineer — Real-Time Data
Voiceflow is seeking a Software Engineer (Distributed Systems) in San Francisco. As a founding engineer, you will focus on building a real-time database replication solution leveraging Kafka and CDC while interacting directly with customers. The ideal candidate has strong...
Voiceflow
San Francisco, CA
1 day ago
Senior Data Infra Engineer Real-Time Petabyte Pipelines
...Judgment Labs, based in San Francisco, is seeking a Senior Data Infrastructure Engineer to design and scale real-time data pipelines critical for agent behavior analysis. The ideal candidate should have over 6 years of experience managing high-throughput, petabyte-scale...
Judgment Labs
San Francisco, CA
1 day ago
Founding Machine Learning Engineer
...We're looking for founding Machine Learning Engineers (MLEs) to own and... ...of LLM inference, browser understanding... ...instant response times with zero migration... ...architecture creates unique ML challenges. This... ...that run in real time,... ...model quality at scale Experiment with retrieval...
Sleeping nights
Composite.ai
San Francisco, CA
1 day ago
ML/AI Founding Engineer
...the role This is a founding AI/ML role. You'll own... ...accuracy against real‑world CFI assessments... ..., audio ML, time‑series analysis, or... ...training, evaluation, inference, and monitoring in... ...production Strong engineering fundamentals — you... ...Force A role that scales into technical leadership...
Navi AI
San Francisco, CA
1 day ago
Machine Learning Engineer, Inference & Serving (Speech LLM) - San Francisco
$180k - $270k
...ultra-low-latency inference engines for large language... ..., throughput, and Time-To-First-Token (or... ...To-First-Audio) in real-time streaming... ...between the core ML training team and... ...ASR accuracy. Large-Scale Distributed Systems... ...Kubernetes. What We Offer Founding Team Initiative:...
Full time
Work at office
Worldwide
Plaud
San Francisco, CA
2 days ago
ML Infra Engineer: Scale Training & Inference (Hybrid)
...A leading technology company is looking for an ML Infrastructure Engineer in San Francisco. The successful candidate will build and maintain ML training pipelines and ensure low-latency model serving. Candidates should have over 4 years of experience in ML engineering...
Work at office
Lattice
San Francisco, CA
1 day ago
Remote ML Engineer - Real-Time AI
A leading AI solutions company seeks a Machine Learning Engineer to develop and optimize machine learning models in a remote-first... ...involve collaboration across teams and managing scalable ML models for real-time decision-making. Ideal candidates have 3+ years of...
Remote work
Geminus
San Francisco, CA
2 days ago
Real-Time LLM Inference & Speech Serving Engineer
$180k - $270k
...infrastructure roles in San Francisco, focusing on building high-performance inference engines for speech AI. Ideal candidates will have substantial experience in GPU architecture and real-time systems. This position offers a competitive salary range of $180K - $270K,...
Plaud
San Francisco, CA
2 days ago
Real-Time GPU Inference Optimization Engineer
$300k
...technology firm in San Francisco seeks a GPU Optimisation Engineer to maximize GPU performance in real-time AI systems. The ideal candidate will possess strong... ...of GPU execution, and a knack for optimizing inference latency for large generative models. With a competitive...
Visa sponsorship
Relocation package
Trades Workforce Solutions
San Francisco, CA
2 days ago
Founding ML Performance Engineer - Sub-50ms Inference
uRun is seeking an ML Performance Engineer to build high-performance infrastructure for interactive... ...CUDA kernels and optimize model inference for speed and efficiency. This foundational... ...involves working closely with the founding team on critical performance challenges...
U-Run
San Francisco, CA
2 days ago
Senior GPU ML Infra Engineer — Mid-Training & Inference
...company based in San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on...
Reflection AI
San Francisco, CA
2 days ago
Founding MLOps Engineer Scale LLMs & Secure AI Infra
$250k
...Alldus International Consulting Ltd is looking for a talented ML/AI Research Engineer to join their San Francisco team. You will be responsible... ...infrastructure that powers training, deployment, and governance of large-scale AI systems. The ideal candidate has a strong background in...
Alldus International Consulting Ltd
San Francisco, CA
1 day ago
Founding Engineer: Real-Time Voice AI & Systems
$100k - $200k
...technology company in San Francisco is seeking a Founding Engineer to develop innovative voice-first technologies. In this full-time, on-site role, you will shape the technical foundation by designing, developing, and scaling systems that enhance Human-Computer...
Full time
Voice Cursor
San Francisco, CA
4 days ago
Staff ML Infrastructure Engineer: Scale Training & Inference
$300k - $430k
...as a team. About the Team The ML Infrastructure team builds the... ...the routing layer that manages inference across multiple providers. We... ...hiring a Staff ML Infrastructure Engineer to own the platforms powering... ...-tuning and post-training at scale Implement and integrate state-...
Work at office
Decagon
San Francisco, CA
1 day ago
Staff ML Inference Systems Engineer - Scalable GPU Infra (SF)
...Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves... ...pipelines and enhancing performance under real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems,...
Acceler8 Talent
San Francisco, CA
1 day ago
