Software Engineer, ML Infrastructure
Realm Labs LLC
Role Overview
We are hiring a Founding ML Infrastructure Engineer to own the end-to-end deployment, optimization, and operation of our suite of models in production.
This is a core founding role focused on building and operating production-grade LLM systems. You will apply deep knowledge of model internals to deploy, optimize, and run modern LLMs at scale, owning performance end-to-end across latency, throughput, and reliability.
You will design and operate the full ML serving stack from model artifacts to GPU execution, and work closely with Product and ML teams to ensure our models can support high QPS, strict SLAs, and production correctness.
This role is ideal for someone who deeply understands how LLMs work internally but chooses to specialize in making them fast, stable, and production-ready.
About Realm Labs
Realm Labs is an AI trust and security startup. We help enterprises detect, debug, and prevent AI’s misbehaviors in production. We are backed by top VCs and serve some of the most iconic global enterprises.
Key Responsibilities
- Own the end-to-end LLM inference stack, including:
  - Model loading and execution
  - GPU utilization and memory efficiency
  - Runtime performance tuning
  - Production deployment and scaling
- Design and operate high-performance LLM serving systems using technologies such as:
  - vLLM, TensorRT / TensorRT-LLM, Triton Inference Server, SGLang
- Optimize inference across:
  - Latency
  - Throughput (QPS)
  - GPU memory footprint
  - Cost efficiency
- Work hands-on with PyTorch and TensorFlow models, including:
  - Model graph understanding
  - Attention mechanisms, KV cache behavior, batching strategies
  - Precision tradeoffs (FP16, BF16, INT8, etc.)
- Build and maintain production-grade GPU services:
  - Multi-model serving
  - Autoscaling strategies
  - Fault isolation and graceful degradation
- Collaborate with application and platform teams to:
  - Define serving APIs
  - Ensure correctness and safety of outputs
  - Debug production issues end-to-end
- Build a reproducible model training and versioning system for customer deployments
- Establish best practices for:
  - Model versioning
  - Rollouts and rollbacks
  - Performance benchmarking
  - Production validation
Expected Qualifications
- 5+ years of professional experience in ML infrastructure, systems engineering, or production ML roles.
- Strong software engineering fundamentals; ability to write robust, maintainable production code.
- Deep hands-on experience with LLM inference infrastructure, including:
  - PyTorch (required)
  - TensorFlow (working knowledge)
- Proven experience with GPU inference optimization, including:
  - TensorRT / TensorRT-LLM
  - vLLM
  - Triton Inference Server
  - SGLang or similar serving runtimes
- Strong understanding of LLM internals, such as:
  - Transformer architectures
  - Attention and KV caching
  - Batching, streaming, and token-level generation
- Experience running ML systems in production with high traffic and SLAs.
- Comfortable working in Linux-based, cloud production environments.
Preferred Qualifications
- Experience deploying LLMs on Kubernetes and GPU clusters.
- Familiarity with CUDA, NCCL, or low-level GPU performance concepts.
- Experience with:
  - Model sharding and parallelism strategies
  - Multi-GPU inference
  - Streaming inference systems
- Knowledge of observability for ML systems (metrics, latency breakdowns, GPU monitoring).
- Experience working at startups or owning systems with minimal abstraction layers.
Additional Information
- This is a founding, high-ownership role with direct impact on core product capabilities.
- You will be expected to build, run, and own systems end-to-end.
- The role may include limited on-call responsibilities aligned with production ownership.
Compensation & Benefits
- Market-aligned compensation and benefits.
- Founding engineer equity (equity is a significant component of this role and will be discussed).
- Medical, dental, vision, and life insurance, 401(k), in-office lunch, etc.