Senior Inference Platform Engineer Low-Latency, Multi-Tenant

MongoDB HQ

A leading data platform company in Palo Alto seeks a Senior Engineer to develop a cutting-edge inference platform supporting semantic search and AI-native experiences. The ideal candidate will have over five years of experience in backend systems and proficiency in languages like Go, Rust, or Python. You'll work alongside ML researchers to enhance infrastructure for real-time inference, ensuring high performance and reliability. This hybrid role offers a unique opportunity to influence the next generation of developer solutions.

Vacancy posted 6 hours ago
Similar jobs that could be interesting for you
Based on the Senior Inference Platform Engineer Low-Latency, Multi-Tenant vacancy in Palo Alto, CA
  •  ...We’re looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for...  ...for real-time, low-latency, and high-scale inference — fully...  ...Design and build components of a multi-tenant inference platform integrated... 
    Senior
    Local area
    Worldwide

    MongoDB

    Palo Alto, CA
    6 hours ago
  • $100k - $120k

     ...Cerebras is seeking a Senior Engineer to join their Cloud Platform team in Mountain View, California. In this role, you will own the architecture of our multi-tenant SaaS platform, working at the intersection of backend engineering and cloud infrastructure. The ideal candidate... 
    Senior

    Cerebras

    Mountain View, CA
    1 day ago
  • $128.7k - $261.3k

     ...The Model Deployment & Inference Solutions team in GM...  ...build the ML deployment platform that makes model...  ...they meet the real‑time latency and memory budgets required...  ...performed manually by engineers. Build the developer...  ...toolchains. Experience with low‑latency or real‑time... 
    Senior
    Local area
    Remote work
    Flexible hours
    Shift work

    General Motors

    Mountain View, CA
    7 hours ago
  • $175k - $220k

    ThoughtSpot, Inc. is seeking a Senior Engineer to join their Cloud Platform team in Mountain View, California. This role involves owning the architecture and evolution of their multi-tenant SaaS platform, driving architectural decisions, and mentoring engineers. The ideal... 
    Senior

    ThoughtSpot, Inc.

    Mountain View, CA
    4 days ago
  •  ...Israelvcforum is looking for a Senior ML Infrastructure Engineer in Mountain View, California. This position aims to build and scale robust platforms for ML inference workflows supporting GM’s AI efforts. You will collaborate with ML engineers and researchers to implement... 
    Senior
    Remote work

    Israelvcforum

    Mountain View, CA
    1 day ago
  • $500 per month

     ...Senior Platform Engineer Palo-Alto (In-office); San Francisco, California (In-Office) Banking...  ...implement core platforms that power multi-agent orchestration, real-time decisioning...  ...to ensure reliability and low latency. Develop abstractions that allow AI... 
    Senior
    Work at office

    Interface AI

    Palo Alto, CA
    3 days ago
  •  ...Senior Cloud Platform Engineer Own the production inferencing service reliability and scale Location: Palo Alto, California...  .... Your primary focus will be ensuring our inference endpoints have exceptional uptime, low-latency response times, and efficient resource utilization... 
    Senior
    Immediate start
    Flexible hours
    Shift work

    jobs.frontdoordefense.com - Jobboard

    Palo Alto, CA
    7 hours ago
  • $236k - $339.25k

     ...accelerate your impact. We look for low-ego individuals who thrive in...  ...Snowflake Machine Learning Platform team's mission is to enable...  ...and proactively with senior architects, PMs, and team leadership...  ...in serving LLMs using inference engines like vLLM, TensorRT-LLM, TEI,... 
    Senior
    Flexible hours

    Snowflake Computing

    Menlo Park, CA
    4 days ago
  • $200k - $287.5k

     ...Senior Software Engineer On Billing Platform At Snowflake, we are powering the...  ...impact. We look for low-ego individuals who...  ...auditability, and low-latency processing. This...  ...engineer on large, multi-team initiatives....  ...including token-based inference, agent workflows,... 
    Senior
    Flexible hours

    Streamlit

    Menlo Park, CA
    4 days ago
  • $100k - $120k

    About the Role We are looking for a Senior Engineer to join our Cloud Platform team and take ownership of the architecture and evolution of our multi-tenant SaaS platform. You will work at...  ...path components for high‑throughput, low‑latency workloads Build and enforce... 
    Senior

    Cerebras

    Mountain View, CA
    1 day ago
  • $155.42k - $395.9k

     ...Description About the Team: The ML Inference Platform is part of the AV ML Infrastructure...  .... About the Role: We are seeking a Senior ML Infrastructure engineer to help build and scale robust platforms...  .... Ability to thrive in a dynamic, multi-tasking environment with ever-... 
    Senior
    Local area
    Remote work
    Relocation
    Relocation package
    Flexible hours

    Israelvcforum

    Mountain View, CA
    7 hours ago
  • $248.71k - $292.6k

     ...delivers fast, efficient AI inference. Our LPU-based system powers...  ...Build fast. Sr. Staff Software Engineer - High Performance GPU Inference...  ...and implement scalable, low-latency runtime systems that coordinate...  ...NVLink/Fabric topologies, and multi‑accelerator systems (... 
    Senior

    I did my part and supported the Regular Toilet

    Palo Alto, CA
    4 days ago
  • $137k - $270k

     ...leading developer data platform, MongoDB Atlas, is...  ...globally distributed, multi-cloud database and is...  ...re looking for a Lead Engineer, Inference Platform to join our...  ...time, high-scale, and low-latency inference — all...  ...scalability in a multi-tenant, cloud-native environment... 
    Local area
    Worldwide
    Flexible hours

    MongoDB

    Palo Alto, CA
    more than 2 months ago
  • $256k - $414k

    NVIDIA Gruppe is looking for a Senior Manager to lead high-performance networking operations for cloud gaming. This role will oversee complex data center interconnects, ensuring ultra-low latency and high reliability. The ideal candidate will have extensive networking... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • Groq is seeking a Sr. Staff Software Engineer to enhance their High Performance GPU Inference Systems. Located in California, this position involves designing scalable, low-latency systems and optimizing GPU performance. The ideal candidate will have expertise in distributed... 
    Senior

    I did my part and supported the Regular Toilet

    Palo Alto, CA
    4 days ago
  • $196k - $310.5k

     ...organization is looking for a Senior Cybersecurity Engineer – Identity Platform & Access Management to...  .... Partner with multi‑functional collaborators...  ...with high availability and low latency. Deep understanding of delegated...  ...in complex, multi‑tenant environments. Hands‑on application... 
    Senior
    Worldwide

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  •  ...General Motors is seeking a Senior ML Infrastructure Engineer to build and scale a robust platform for machine learning inference workflows. You will design backend software components, collaborate with ML engineers, and lead initiatives across GM's ML ecosystem. With... 
    Senior
    Remote work

    General Motors

    Sunnyvale, CA
    7 hours ago
  • A technology company in Mountain View is seeking a candidate with hands-on experience in building multi-tenant SaaS platforms. The role requires knowledge of Kubernetes, cloud services (AWS/GCP/Azure), and the ability to integrate AI tools for enhancing productivity. Responsibilities... 
    Senior

    ThoughtSpot

    Mountain View, CA
    1 day ago
  • $152k - $241.5k

     ...superchip. We are looking for expert engineers to come and help design rack...  ...scaling AI supercomputing platforms. What you will be doing Drive...  ..., optimize firmware for low latency APIs. Strong knowledge of analyzing...  .... Experience with ML and multi‑variable optimization... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    7 hours ago
  • $120.1k - $225.7k

     ...Entails End-to-End Inference Optimization: Lead the...  ...and minimize latency. Heterogeneous Computing...  ...Computer Science, Electronic Engineering, AI, or related fields;...  ...techniques, including multi-level KV Cache...  ...deep understanding of low-level programming models... 
    Senior
    Relocation package

    Tencent

    Palo Alto, CA
    2 days ago
  • $148.6k - $306.3k

     ...What you’ll do As a Senior Platform Engineer, contribute across design, coding, testing, operability, and quality assurance in a development...  ...OAuth 2.0, mTLS, proxy or tunnel-based architectures, and multi-tenant service design. Experience with observability practices including... 
    Senior

    SAP SE

    Palo Alto, CA
    6 hours ago
  • $184k - $287.5k

     ...features related to CUDA’s memory model and multi‑node scalability geared towards next‑...  ...in Computer Science, Electrical Engineering or related field (or equivalent...  ...experience with parallel computing, PyTorch, low‑latency AI inference Understanding of system‑level... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    7 hours ago
  •  ...Senior Software Development Engineer, Annapurna Labs, Elastic Collectives job at Annapurna Labs (U.S.) Inc., Cupertino, CA. DESCRIPTION Annapurna...  ...on AWS. We are seeking an experienced engineer with low-level latency networking or interconnect expertise to optimize... 
    Senior
    Internship
    Work from home
    Flexible hours

    Itlearn360

    Cupertino, CA
    1 day ago
  •  ...Senior Backend Engineer — AI Agents Platform HOAi is a fast-growing startup revolutionizing...  ...executing complex, multi-step processes with...  ..., and multi-tenant isolation....  ...optimize throughput/latency, queue backlogs, caching...  ..., storage, and inference/tool-call costs; build... 
    Senior
    Work at office
    Remote work
    Flexible hours

    Vantaca

    Redwood City, CA
    3 days ago
  •  ...Senior Backend Engineer — AI Voice Agent Platform HOAi is a fast-growing startup revolutionizing...  ...executing complex, multi-step processes with human...  ...that supports real-time, low-latency conversations at scale....  ...idempotency, rate limits, and per-tenant budgets/quotas.... 
    Senior
    Work at office
    Remote work
    Flexible hours

    Vantaca

    Redwood City, CA
    3 days ago
  •  ...Time · Department: Backend Engineer · Work type: On-Site About...  ...the world's first AI platform to bring AI into the real...  ...critical services, optimize for latency and throughput, and...  ...support high-throughput, low-latency AI model inference and data services. Partner... 
    Senior
    Full time

    Neara

    Palo Alto, CA
    6 hours ago
  •  ...’s best data and AI infrastructure platform so our customers can use deep data...  ...improve their business. Founded by engineers — and customer obsessed — we leap at...  ...performance‑sensitive systems (e.g., latency‑critical services, multi‑tenant platforms, large‑scale indexing... 
    Senior

    I did my part and supported the Regular Toilet

    Mountain View, CA
    7 hours ago
  • $152k - $241.5k

     ...NVIDIA is the platform upon which every new AI‑powered application...  ...built. We are seeking a Senior Software Engineer – AI Inference to advance open‑source...  ...enables high‑throughput, low‑latency inference at scale. This...  ...work. Improve multi‑GPU inference performance... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $180k - $240k

     ...We are seeking a Senior Cloud Infrastructure Engineer to architect and manage...  ...of our AI platform, ensuring that multi-GPU clusters, distributed...  ...using Triton Inference Server, Ray Serve,...  ...Optimization: Optimize low-level...  ...RoCE v2) to minimize latency for 3D Gaussian Splatting... 
    Senior
    Odd job
    Work at office

    Gatik AI

    Mountain View, CA
    1 day ago
  • $119.8k - $234.7k

     ...Our converged AI fabric delivers inference capabilities for all LLMs...  ..., Llama, and more. As a Senior Software Engineer, you will shape the...  ...efficiently, and with ultra-low latency, enabling a rich set of AI-...  ...large-scale AI services and platform capabilities that power... 
    Senior
    Ongoing contract
    Local area

    Microsoft Corporation

    Mountain View, CA
    1 day ago
