Senior Model Serving Engineer — LLM Inference, Remote
$100k - $150k
Bright Vision Technologies
- Remote job
Bright Vision Technologies is hiring a Model Serving Engineer to design, build, and operate high-performance inference platforms. This remote role requires 6+ years in distributed systems, and strong skills in Python and systems languages like Go, Rust, or C++. The engineer will focus on optimizing ML model serving performance and collaborating with teams to support new model releases. A competitive salary of $100K – $150K is offered, with additional benefits.
- ...Bright Vision Technologies is seeking a Model Serving Engineer to design and operate high-performance inference platforms for large ML models. This remote role requires expertise in distributed systems, Python, and Kubernetes. The ideal candidate will have over 6 years...
Remote work, Senior, Full time
$100k - $150k
Bright Vision Technologies is seeking a Model Serving Engineer to design high-performance inference platforms for machine learning models. This remote position focuses on optimizing client service delivery and performance while collaborating with diverse teams. The ideal...
Remote job, Senior
$100k - $150k
...Bright Vision Technologies is looking for a Model Serving Engineer to design, build, and operate high-performance inference platforms. This role requires expertise in distributed... ...and Kubernetes. The position is fully remote and offers a competitive salary of $100,000...
Remote work, Senior
- ...Social Discovery Group seeks a Senior NLP/LLM Engineer to conduct experiments with AI models and optimize them for production. This remote full-time role requires strong knowledge in machine learning, NLP, and proficiency in Python. Key responsibilities include developing...
Remote work, Senior, Full time
$100k - $150k
Bright Vision Technologies is seeking a Model Serving Engineer to design and operate high-performance inference platforms for large machine learning models. The role... ...GPU utilization. This full-time position is 100% remote for candidates in the Continental United States...
Remote job, Senior, Full time
$167.2k - $209k
...applications. We are seeking a Senior Engineer 2 to join our AI Inference Data Plane team. In... ...and scale their models with industry-leading... ...distributed inference serving frameworks such as llm‑d, NVIDIA Dynamo, or Ray... ...- $209,000 This is a remote role Why You’ll Like Working...
Remote work, Senior, Local area, Worldwide, Flexible hours
- Bright Vision Technologies is seeking a Model Serving Engineer to design and operate highly reliable inference platforms for large machine learning models. This remote full-time position requires strong expertise in distributed systems and performance engineering, offering...
Remote job, Full time
- ...Senior AI and Large Language Model (LLM) Engineer NIH-Bethesda Location Bethesda, Maryland (On-site / Not-remote) Overview We are seeking an experienced AI/LLM Engineer to lead... ...knowledge discovery tools. This role serves as a subject matter expert (SME)...
Senior, Immediate start
- ...Job We are looking for an experienced AI Model Engineer with deep expertise in kernel... ...acceleration. The engineer will extend the inference framework to support inference and fine... ...functional teams to integrate optimized serving and inference frameworks into production...
Remote job, Senior
- ...your career. THE ROLE As a senior member of the LLM inference framework team, you will be... ...for large language models on AMD GPUs. You will work... ...first‑class platform for LLM serving. This role sits at the intersection of inference engines, distributed systems, and GPU...
Senior
- ...most powerful Large Tabular Model (LTM) – purpose-built for the... ...About the Role Our Serving team is responsible for turning... ...of research and production engineering. We work closely with... ...meaningfully from traditional LLM inference, including irregular computational...
Remote work, Work at office, Relocation package
$100k - $150k
...Model Serving Engineer Bright Vision Technologies is a forward-thinking software... ...Engineer Location: 100% Remote (Continental United States)... ..., highly reliable inference platforms for serving large... ...and KV cache strategies for LLM serving workloads. Integrate...
Remote work, Full time, H1b, Local area, Immediate start, Visa sponsorship
$145.2k - $252.48k
...Responsibilities We are seeking an expert Senior Model-Based Systems Engineer (MBSE) to lead the modeling and... ...requirements baselines. Serve as a subject matter expert in MBSE methodologies... ...role, geographic location (For Remote Opportunities), education and...
Remote work, Senior, Hourly pay, Contract work, Temporary work, For contractors, Work experience placement
$220k - $320k
...ML Model Serving Engineer Want to build the layer that actually makes AI usable in real time? You’ll join a team focused on inference, where performance is the product. This is about delivering low... ...-performance serving systems for LLM, speech, and vision models Scaling...
3 days per week
$135k
...Senior Security Engineer AI Model And Application The Senior Security... ...Security Engineer will serve as the subject... ...expert (SME) in AI and LLM security across the... ...training pipelines to inference endpoints and user-facing... ...works on-site or remote based on the...
Remote work, Senior, Temporary work, Work at office, Monday to Friday, Flexible hours
- ...Product Manager to own AI inference and model serving for k0rdent AI, our... ..., and performance engineering. You will define how... ...product management, or a senior technical role owning... ..., SGLang, TensorRT-LLM, Dynamo, Triton) and... ...job opportunities. #remote We are a Leader...
Remote work
$242k - $290k
...of a multi-modality foundation model to drive the next generation... ...Model Optimization & Deployment Engineer, you will focus on bringing... ..., and build highly concurrent inference code to ensure real-time, deterministic... ...technologies (e.g., TensorRT-LLM). $242,000 - $290,000 a...
Remote work, Senior, Temporary work, Relocation package
- ...ML Infrastructure Engineer, Model Inference As an ML Infrastructure Engineer, Model Inference at Abridge... ..., optimize, and maintain ML model serving infrastructure, ensuring high-... ...such as NVIDIA Triton Server, VLLM, TRT-LLM and so on. Expertise with ML toolchains...
Remote work, Hourly pay, Full time, Flexible hours
$98k - $140k
...You'll work with product and engineering teams to build systems to define... ...of that you'll shape Notion's model strategy and work directly... ...exploring the "jagged frontier" of LLM capabilities and how AI... ...working with data — You can self-serve insights from large datasets,...
Remote work, Live in, Local area
- ...NVIDIA Gruppe is looking for a skilled engineer to join their TensorRT Edge-LLM team in Santa Clara, California. The role involves developing a state-of-the-art inference framework for large language models and optimizing it for real-time performance on embedded platforms...
Senior
- ...A leading AI platform company in San Francisco is seeking a Software Engineer focused on machine learning performance. This role involves implementing advanced techniques for ML model inference and debugging performance issues with frameworks like PyTorch and TensorRT...
$314.8k - $359.3k
Senior Distinguished Engineer, AI Compute (Remote Eligible) At Capital One, we are creating... ...reimagine how we serve our customers and... ...diverse workloads from LLM pre‑training and... ...running large foundation models Work cross‑... ...or high‑throughput inference Hands‑on experience...
Remote job, Senior, Full time, Part time, Local area
$184k - $356.5k
A leading AI computing company in California is seeking a Senior Deep Learning Software Engineer focused on performance optimization of LLM models. You will analyze and enhance LLM inference performance, working in cross-collaborative teams to implement cutting-edge algorithms...
Senior, Full time
- NTT Data Americas, Inc. is looking for an On-Premise LLM Inference & GPU Systems Engineer to enhance our large-scale LLM infrastructure in Charlotte,... ...operating OpenShift AI, and overseeing the Hugging Face model lifecycle. Join us to build innovative solutions in an inclusive...
Senior
- ...LLMs, our proprietary models, and a sophisticated Agentic... ...Moveworks' Reasoning Engine and natural language... ...for building and serving LLM's at Moveworks. This role... ...training and inference pipeline for large language... ...Work personas (flexible, remote, or required in office...
Remote work, Senior, Work at office, Flexible hours
- ...leading training and inference speeds and empowers machine... ...include top model labs, global enterprises... ...We are hiring a Senior Performance Engineer to join our Product team... ...performance and will serve as our resident expert... ...vLLM, SGLang, TensorRT-LLM), GPU kernel-level...
Senior, Contract work, Shift work
$184k - $287.5k
Senior DL Algorithms Engineer - Inference Performance... ...Implement language and multimodal model inference as part of NVIDIA... ...production code to TRT-LLM, NVIDIA’s open-source inference serving library. Profile and analyze...
Senior
- ...Distributed LLM Inference Engineer At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We're commercializing Ray, a popular open-source project that's creating an ecosystem of libraries...
Remote work, Work at office
- ...job poster from EDGE Engineering and Science, LLC... ...Science We are seeking a Senior Air Dispersion Modeler to lead and manage... ...helps the companies we serve, it ensures the... ...Science, LLC by 2x Inferred from the description... ...Actuarial Consultant - REMOTE United States $105,0...
Remote work, Senior, Full time, For contractors, Local area
- ..., cloud, and platform engineering teams. Operationalize... ...MLOps pipelines for model packaging, testing, deployment... ...large language model (LLM) APIs and... ...Experience with model serving, inference optimization, or AI platform... ...experience REMOTE WORK NOTICE: This position...
Remote work, Senior, Work at office

