Senior Model Serving Engineer — LLM Inference, Remote

$100k - $150k

Bright Vision Technologies

San Ramon, CA
  • Remote job

Bright Vision Technologies is hiring a Model Serving Engineer to design, build, and operate high-performance inference platforms. This remote role requires 6+ years in distributed systems and strong skills in Python and systems languages like Go, Rust, or C++. The engineer will focus on optimizing ML model serving performance and collaborating with teams to support new model releases. A competitive salary of $100K – $150K is offered, with additional benefits.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Senior Model Serving Engineer — LLM Inference, Remote in San Ramon, CA vacancy
  •  ...Bright Vision Technologies is seeking a Model Serving Engineer to design and operate high-performance inference platforms for large ML models. This remote role requires expertise in distributed systems, Python, and Kubernetes. The ideal candidate will have over 6 years... 
    Remote work
    Senior
    Full time

    Bright Vision Technologies

    Plano, TX
    3 days ago
  • $100k - $150k

    Bright Vision Technologies is seeking a Model Serving Engineer to design high-performance inference platforms for machine learning models. This remote position focuses on optimizing client service delivery and performance while collaborating with diverse teams. The ideal... 
    Remote job
    Senior

    Bright Vision Technologies

    Woodbridge, NJ
    4 days ago
  • $100k - $150k

     ...Bright Vision Technologies is looking for a Model Serving Engineer to design, build, and operate high-performance inference platforms. This role requires expertise in distributed...  ...and Kubernetes. The position is fully remote and offers a competitive salary of $100,000... 
    Remote work
    Senior

    Bright Vision Technologies

    Duluth, GA
    1 hour ago
  •  ...Social Discovery Group seeks a Senior NLP/LLM Engineer to conduct experiments with AI models and optimize them for production. This remote full-time role requires strong knowledge in machine learning, NLP, and proficiency in Python. Key responsibilities include developing... 
    Remote work
    Senior
    Full time

    Social Discovery Group

    Poland, NY
    1 hour ago
  • $100k - $150k

    Bright Vision Technologies is seeking a Model Serving Engineer to design and operate high-performance inference platforms for large machine learning models. The role...  ...GPU utilization. This full-time position is 100% remote for candidates in the Continental United States... 
    Remote job
    Senior
    Full time

    Bright Vision Technologies

    Katy, TX
    2 days ago
  • $167.2k - $209k

     ...applications. We are seeking a Senior Engineer 2 to join our AI Inference Data Plane team. In...  ...and scale their models with industry-leading...  ...distributed inference serving frameworks such as llm‑d, NVIDIA Dynamo, or Ray...  ...- $209,000 This is a remote role Why You’ll Like Working... 
    Remote work
    Senior
    Local area
    Worldwide
    Flexible hours

    DigitalOcean

    San Francisco, CA
    3 days ago
  • Bright Vision Technologies is seeking a Model Serving Engineer to design and operate highly reliable inference platforms for large machine learning models. This remote full-time position requires strong expertise in distributed systems and performance engineering, offering... 
    Remote job
    Full time

    Bright Vision Technologies

    Bellevue, WA
    1 day ago
  •  ...Senior AI and Large Language Model (LLM) Engineer NIH-Bethesda Location Bethesda, Maryland (On-site / Not-remote) Overview We are seeking an experienced AI/LLM Engineer to lead...  ...knowledge discovery tools. This role serves as a subject matter expert (SME)... 
    Senior
    Immediate start

    Black Canyon Consulting LLC

    Bethesda, MD
    6 days ago
  •  ...Job We are looking for an experienced AI Model Engineer with deep expertise in kernel...  ...acceleration. The engineer will extend the inference framework to support inference and fine...  ...functional teams to integrate optimized serving and inference frameworks into production... 
    Remote job
    Senior

    Framework Ventures

    New York, NY
    2 days ago
  •  ...your career. THE ROLE As a senior member of the LLM inference framework team, you will be...  ...for large language models on AMD GPUs. You will work...  ...first‑class platform for LLM serving. This role sits at the intersection of inference engines, distributed systems, and GPU... 
    Senior

    Advanced Micro Devices

    Santa Clara, CA
    1 day ago
  •  ...most powerful Large Tabular Model (LTM) – purpose-built for the...  .... About the Role Our Serving team is responsible for turning...  ...of research and production engineering. We work closely with...  ...meaningfully from traditional LLM inference, including irregular computational... 
    Remote work
    Work at office
    Relocation package

    Fundamental

    United States
    1 day ago
  • $100k - $150k

     ...Model Serving Engineer Bright Vision Technologies is a forward-thinking software...  ...Engineer Location: 100% Remote (Continental United States)...  ..., highly reliable inference platforms for serving large...  ...and KV cache strategies for LLM serving workloads. Integrate... 
    Remote work
    Full time
    H1b
    Local area
    Immediate start
    Visa sponsorship

    Bright Vision Technologies

    United States
    3 days ago
  • $145.2k - $252.48k

     ...Responsibilities We are seeking an expert Senior Model-Based Systems Engineer (MBSE) to lead the modeling and...  ...requirements baselines. Serve as a subject matter expert in MBSE methodologies...  ...role, geographic location (For Remote Opportunities), education and... 
    Remote work
    Senior
    Hourly pay
    Contract work
    Temporary work
    For contractors
    Work experience placement

    Arcfield

    Washington DC
    3 days ago
  • $220k - $320k

     ...ML Model Serving Engineer Want to build the layer that actually makes AI usable in real time? You’ll join a team focused on inference, where performance is the product. This is about delivering low...  ...-performance serving systems for LLM, speech, and vision models Scaling... 
    3 days per week

    Trades Workforce Solutions

    San Francisco, CA
    4 days ago
  • $135k

     ...Senior Security Engineer AI Model And Application The Senior Security...  ...Security Engineer will serve as the subject...  ...expert (SME) in AI and LLM security across the...  ...training pipelines to inference endpoints and user-facing...  ...works on-site or remote based on the... 
    Remote work
    Senior
    Temporary work
    Work at office
    Monday to Friday
    Flexible hours

    ImmunityBio

    United States
    1 day ago
  •  ...Product Manager to own AI inference and model serving for k0rdent AI, our...  ..., and performance engineering. You will define how...  ...product management, or a senior technical role owning...  ..., SGLang, TensorRT-LLM, Dynamo, Triton) and...  ...job opportunities. #remote We are a Leader... 
    Remote work

    Mirantis

    Austin, TX
    1 day ago
  • $242k - $290k

     ...of a multi-modality foundation model to drive the next generation...  ...Model Optimization & Deployment Engineer, you will focus on bringing...  ..., and build highly concurrent inference code to ensure real-time, deterministic...  ...technologies (e.g., TensorRT-LLM). $242,000 - $290,000 a... 
    Remote work
    Senior
    Temporary work
    Relocation package

    Zoox

    Nacogdoches, TX
    1 day ago
  •  ...ML Infrastructure Engineer, Model Inference As an ML Infrastructure Engineer, Model Inference at Abridge...  ..., optimize, and maintain ML model serving infrastructure, ensuring high-...  ...such as NVIDIA Triton Server, vLLM, TRT-LLM and so on. Expertise with ML toolchains... 
    Remote work
    Hourly pay
    Full time
    Flexible hours

    Abridge

    United States
    3 days ago
  • $98k - $140k

     ...You'll work with product and engineering teams to build systems to define...  ...of that you'll shape Notion's model strategy and work directly...  ...exploring the "jagged frontier" of LLM capabilities and how AI...  ...working with data — You can self-serve insights from large datasets,... 
    Remote work
    Live in
    Local area

    Notion, LLC

    United States
    1 day ago
  •  ...NVIDIA Gruppe is looking for a skilled engineer to join their TensorRT Edge-LLM team in Santa Clara, California. The role involves developing a state-of-the-art inference framework for large language models and optimizing it for real-time performance on embedded platforms... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...A leading AI platform company in San Francisco is seeking a Software Engineer focused on machine learning performance. This role involves implementing advanced techniques for ML model inference and debugging performance issues with frameworks like PyTorch and TensorRT... 

    Baseten

    San Francisco, CA
    3 days ago
  • $314.8k - $359.3k

    Senior Distinguished Engineer, AI Compute (Remote Eligible) At Capital One, we are creating...  ...reimagine how we serve our customers and...  ...diverse workloads from LLM pre‑training and...  ...running large foundation models Work cross‑...  ...or high‑throughput inference Hands‑on experience... 
    Remote job
    Senior
    Full time
    Part time
    Local area

    Capital One

    Cambridge, MA
    1 day ago
  • $184k - $356.5k

    A leading AI computing company in California is seeking a Senior Deep Learning Software Engineer focused on performance optimization of LLM models. You will analyze and enhance LLM inference performance, working in cross-collaborative teams to implement cutting-edge algorithms... 
    Senior
    Full time

    NVIDIA Corporation

    Santa Clara, CA
    5 days ago
  • NTT Data Americas, Inc. is looking for an On-Premise LLM Inference & GPU Systems Engineer to enhance our large-scale LLM infrastructure in Charlotte,...  ...operating OpenShift AI, and overseeing the Hugging Face model lifecycle. Join us to build innovative solutions in an inclusive... 
    Senior

    NTT Data Americas, Inc.

    Charlotte, NC
    4 days ago
  •  ...LLMs, our proprietary models, and a sophisticated Agentic...  ...Moveworks' Reasoning Engine and natural language...  ...for building and serving LLM's at Moveworks. This role...  ...training and inference pipeline for large language...  ...Work personas (flexible, remote, or required in office... 
    Remote work
    Senior
    Work at office
    Flexible hours

    ServiceNow

    Mountain View, CA
    5 days ago
  •  ...leading training and inference speeds and empowers machine...  ...include top model labs, global enterprises...  ...We are hiring a Senior Performance Engineer to join our Product team...  ...performance and will serve as our resident expert...  ...vLLM, SGLang, TensorRT-LLM), GPU kernel-level... 
    Senior
    Contract work
    Shift work

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    4 days ago
  • $184k - $287.5k

    Senior DL Algorithms Engineer - Inference Performance...  ...Implement language and multimodal model inference as part of NVIDIA...  ...production code to TRT-LLM, NVIDIA’s open-source inference serving library. Profile and analyze... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    5 days ago
  •  ...Distributed LLM Inference Engineer At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We're commercializing Ray, a popular open-source project that's creating an ecosystem of libraries... 
    Remote work
    Work at office

    Anyscale

    United States
    3 days ago
  •  ...job poster from EDGE Engineering and Science, LLC...  ...Science We are seeking a Senior Air Dispersion Modeler to lead and manage...  ...helps the companies we serve, it ensures the...  ...Science, LLC by 2x Inferred from the description...  ...Actuarial Consultant - REMOTE United States $105,0... 
    Remote work
    Senior
    Full time
    For contractors
    Local area

    EDGE Engineering and Science, LLC

    New York, NY
    2 days ago
  •  ..., cloud, and platform engineering teams. Operationalize...  ...MLOps pipelines for model packaging, testing, deployment...  ...large language model (LLM) APIs and...  ...Experience with model serving, inference optimization, or AI platform...  ...experience REMOTE WORK NOTICE: This position... 
    Remote work
    Senior
    Work at office

    ARA

    Raleigh, NC
    4 days ago
