Software Engineer, ML Infrastructure

Realm Labs LLC

Role Overview

We are hiring a Founding ML Infrastructure Engineer to own the end-to-end deployment, optimization, and operation of our suite of models in production.

This is a core founding role focused on building and operating production-grade LLM systems. You will apply deep knowledge of model internals to deploy, optimize, and run modern LLMs at scale, owning performance end-to-end across latency, throughput, and reliability.

You will design and operate the full ML serving stack, from model artifacts to GPU execution, and work closely with Product and ML teams to ensure our models can support high QPS, strict SLAs, and production correctness.

This role is ideal for someone who deeply understands how LLMs work internally but chooses to specialize in making them fast, stable, and production-ready.

About Realm Labs

Realm Labs is an AI trust and security startup. We help enterprises detect, debug, and prevent AI misbehavior in production. We are backed by top VCs and serve some of the most iconic global enterprises.

Key Responsibilities

Own the end-to-end LLM inference stack, including:
- Model loading and execution
- GPU utilization and memory efficiency
- Runtime performance tuning
- Production deployment and scaling
Design and operate high-performance LLM serving systems using technologies such as:
- vLLM, TensorRT / TensorRT-LLM, Triton Inference Server, SGLang
Optimize inference across:
- Latency
- Throughput (QPS)
- GPU memory footprint
- Cost efficiency
Work hands-on with PyTorch and TensorFlow models, including:
- Model graph understanding
- Attention mechanisms, KV cache behavior, batching strategies
- Precision tradeoffs (FP16, BF16, INT8, etc.)
Build and maintain production-grade GPU services:
- Multi-model serving
- Autoscaling strategies
- Fault isolation and graceful degradation
Collaborate with application and platform teams to:
- Define serving APIs
- Ensure correctness and safety of outputs
- Debug production issues end-to-end
Build a reproducible model training and versioning system for customer deployments
Establish best practices for:
- Model versioning
- Rollouts and rollbacks
- Performance benchmarking
- Production validation

Expected Qualifications

5+ years of professional experience in ML infrastructure, systems engineering, or production ML roles.
Strong software engineering fundamentals; ability to write robust, maintainable production code.
Deep hands-on experience with LLM inference infrastructure, including:
- PyTorch (required)
- TensorFlow (working knowledge)
Proven experience with GPU inference optimization, including:
- TensorRT / TensorRT-LLM
- vLLM
- Triton Inference Server
- SGLang or similar serving runtimes
Strong understanding of LLM internals, such as:
- Transformer architectures
- Attention and KV caching
- Batching, streaming, and token-level generation
Experience running ML systems in production with high traffic and SLAs.
Comfortable working in Linux-based, cloud production environments.

Preferred Qualifications

Experience deploying LLMs on Kubernetes and GPU clusters.
Familiarity with CUDA, NCCL, or low-level GPU performance concepts.
Experience with:
- Model sharding and parallelism strategies
- Multi-GPU inference
- Streaming inference systems
Knowledge of observability for ML systems (metrics, latency breakdowns, GPU monitoring).
Experience working at startups or owning systems with minimal abstraction layers.

Additional Information

This is a founding, high-ownership role with direct impact on core product capabilities.
You will be expected to build, run, and own systems end-to-end.
The role may include limited on-call responsibilities aligned with production ownership.

Compensation & Benefits

Market-aligned compensation and benefits.
Founding engineer equity (equity is a significant component of this role and will be discussed).
Medical, dental, vision, and life insurance, 401(k), in-office lunch, etc.

Visa sponsorship: We do sponsor visas! However, we can't successfully sponsor visas for every role and candidate. If we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

Vacancy posted 2 days ago
