Founding ML Infra Engineer: Scale Real-Time Inference
U-Run
URun in San Francisco is searching for an ML Infrastructure and Platform Engineer. In this role, you will lead the architecture and scaling of our GPU compute platform from the ground up, ensuring high availability and low-latency inference. This is a founding technical hire with end-to-end ownership across the infrastructure stack, significant responsibility, and room to grow. URun offers a competitive salary and equity, along with full health coverage and flexibility.
- ...A cutting-edge technology company in San Francisco is seeking an ML Infrastructure Engineer to build and scale machine learning systems for real-time perception and inference. This role involves designing scalable training pipelines for computer vision models, optimizing...
- Crustdata is looking for a Founding ML Engineer in San Francisco to own the research and engineering... ...turn unstructured, multilingual web-scale data into actionable insights. The... ...Join us to help redefine how AI agents access real-time data.
$250k - $350k
...them actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world problem, understanding what... ...sites using wearable devices, large-scale video, and AI. This isn’t clean... ...of hours of data Training and inference systems for multimodal / LLM-based...
$225k - $400k
...A pioneering AI research firm is seeking a Founding Machine Learning Research Engineer in San Francisco to develop innovative AI systems for real-time voice agents. This high-impact role requires a strong ML research background and proficiency in PyTorch. Responsibilities...
- ...Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,...
- ...About the Role ML Ops Engineer — Agentic AI Lab (Founding Team) — Location... ...Area — Type: Full-Time — Compensation:... ...quantization, and inference rollout Manage... ...engineering, or infra-focused ML roles... ...(spot instance scaling, batch prioritization... ...really hard real world problems –... Full time
- ...Arena Intelligence, Inc. in San Francisco, CA, is seeking a Senior Software Engineer (Infrastructure) to lead the design of scalable data and API systems. The role involves architecting real-time data pipelines, ensuring performance and reliability, and mentoring...
- ...media technology company in San Francisco is seeking a Founding Engineer specializing in ML Inference. This highly technical role requires expertise in the... .... The ideal candidate will drive innovations in real-time model performance, design in-house inference runtimes... Relocation package
- ...click to AI agents doing real-time targeted crawling from... ...boundaries of what our ML systems can do. We’re hiring a Founding ML Engineer to own the research and... ..., or data generation at scale Strong Python and PyTorch... ...Given raw people data, infer the org chart — who reports... Shift work
$244k - $320k
...notifications, our AI-powered personalization engine delivers bespoke experiences that... ..., revenue, and loyalty through real-time behavioral insights. Recognized as... ...play a critical role in building, scaling, and operating production-grade ML systems that drive real-time... Full time
- ...A leading tech startup in San Francisco seeks founding Machine Learning Engineers (MLEs) to enhance core action models for their proactive automation... ...improving model accuracy and speed. This role demands strong ML skills and experience with LLMs, emphasizing...
$180k - $220k
...seeking a Senior Machine Learning Engineer for their Applied Data Science... ...collaboration with data scientists on real-time optimization solutions. Required... ...in AdTech, and expertise in AI/ML technologies like Java, Python, and large-scale frameworks. The position offers a...
- ...to deliver that at scale doesn't really exist... ...fix it uRun is the inference cloud for... ...compute layer that makes real‑time, stateful inference... ...investors, and are founded by Keegan McCallum,... ...infrastructure. As our ML Infrastructure and Platform Engineer, you will own the... Flexible hours, Shift work
$220k - $320k
...ML Model Serving Engineer Want to build the layer that actually makes AI usable in real time? You’ll join a team focused on inference, where performance is the product. This is about delivering low... ...respond instantly, reliably, and at scale. That means solving hard... 3 days per week
- ...David Joseph & Company is seeking a Founding Engineer for Empathic in San Francisco. This role offers 1%–6% equity and involves optimizing inference costs and owning backend infrastructure, merging ML with real-time systems. The ideal candidate has 3+ years of experience...
- Voiceflow is seeking a Software Engineer (Distributed Systems) in San Francisco. As a founding engineer, you will focus on building a real-time database replication solution leveraging Kafka and CDC while interacting directly with customers. The ideal candidate has strong...
- ...Judgment Labs, based in San Francisco, is seeking a Senior Data Infrastructure Engineer to design and scale real-time data pipelines critical for agent behavior analysis. The ideal candidate should have over 6 years of experience managing high-throughput, petabyte-scale...
- ...We're looking for founding Machine Learning Engineers (MLEs) to own and... ...of LLM inference, browser understanding... ...instant response times with zero migration... ...architecture creates unique ML challenges. This... ...that run in real time,... ...model quality at scale Experiment with retrieval... Sleeping nights
- ...the role This is a founding AI/ML role. You'll own... ...accuracy against real‑world CFI assessments... ..., audio ML, time‑series analysis, or... ...training, evaluation, inference, and monitoring in... ...production Strong engineering fundamentals — you... ...Force A role that scales into technical leadership...
$180k - $270k
...ultra-low-latency inference engines for large language... ..., throughput, and Time-To-First-Token (or... ...To-First-Audio) in real-time streaming... ...between the core ML training team and... ...ASR accuracy. Large-Scale Distributed Systems... ...Kubernetes. What We Offer Founding Team Initiative:... Full time, Work at office, Worldwide
- ...A leading technology company is looking for an ML Infrastructure Engineer in San Francisco. The successful candidate will build and maintain ML training pipelines and ensure low-latency model serving. Candidates should have over 4 years of experience in ML engineering... Work at office
- A leading AI solutions company seeks a Machine Learning Engineer to develop and optimize machine learning models in a remote-first... ...involve collaboration across teams and managing scalable ML models for real-time decision-making. Ideal candidates have 3+ years of... Remote work
$180k - $270k
...infrastructure roles in San Francisco, focusing on building high-performance inference engines for speech AI. Ideal candidates will have substantial experience in GPU architecture and real-time systems. This position offers a competitive salary range of $180K - $270K,...
$300k
...technology firm in San Francisco seeks a GPU Optimisation Engineer to maximize GPU performance in real-time AI systems. The ideal candidate will possess strong... ...of GPU execution, and a knack for optimizing inference latency for large generative models. With a competitive... Visa sponsorship, Relocation package
- uRun is seeking an ML Performance Engineer to build high-performance infrastructure for interactive... ...CUDA kernels and optimize model inference for speed and efficiency. This foundational... ...involves working closely with the founding team on critical performance challenges...
- ...company based in San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on...
$250k
...Alldus International Consulting Ltd is looking for a talented ML/AI Research Engineer to join their San Francisco team. You will be responsible... ...infrastructure that powers training, deployment, and governance of large-scale AI systems. The ideal candidate has a strong background in...
$100k - $200k
...technology company in San Francisco is seeking a Founding Engineer to develop innovative voice-first technologies. In this full-time, on-site role, you will shape the technical foundation by designing, developing, and scaling systems that enhance Human-Computer... Full time
$300k - $430k
...as a team. About the Team The ML Infrastructure team builds the... ...the routing layer that manages inference across multiple providers. We... ...hiring a Staff ML Infrastructure Engineer to own the platforms powering... ...-tuning and post-training at scale Implement and integrate state-... Work at office
- ...Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves... ...pipelines and enhancing performance under real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems,...

