Senior Model Serving Engineer — LLM Inference, Remote

$100k - $150k

Bright Vision Technologies

Remote job

Bright Vision Technologies is hiring a Model Serving Engineer to design, build, and operate high-performance inference platforms. This remote role requires 6+ years of experience in distributed systems and strong skills in Python and a systems language such as Go, Rust, or C++. The engineer will focus on optimizing ML model serving performance and collaborating with teams to support new model releases. The role offers a competitive salary of $100K – $150K plus additional benefits.

Vacancy posted 1 day ago

Similar jobs that could be interesting for you. Based on the Senior Model Serving Engineer — LLM Inference, Remote in San Ramon, CA vacancy

Senior Model Serving Engineer - Remote AI Inference
...Bright Vision Technologies is seeking a Model Serving Engineer to design and operate high-performance inference platforms for large ML models. This remote role requires expertise in distributed systems, Python, and Kubernetes. The ideal candidate will have over 6 years...
Remote work
Senior
Full time
Bright Vision Technologies
Plano, TX
3 days ago
Senior Remote Model Serving Engineer for AI Inference
$100k - $150k
Bright Vision Technologies is seeking a Model Serving Engineer to design high-performance inference platforms for machine learning models. This remote position focuses on optimizing client service delivery and performance while collaborating with diverse teams. The ideal...
Remote job
Senior
Bright Vision Technologies
Woodbridge, NJ
4 days ago
Senior Remote Model Serving Engineer
$100k - $150k
...Bright Vision Technologies is looking for a Model Serving Engineer to design, build, and operate high-performance inference platforms. This role requires expertise in distributed... ...and Kubernetes. The position is fully remote and offers a competitive salary of $100,000...
Remote work
Senior
Bright Vision Technologies
Duluth, GA
1 hour ago
Remote Senior NLP/LLM Engineer AI Model Innovator
...Social Discovery Group seeks a Senior NLP/LLM Engineer to conduct experiments with AI models and optimize them for production. This remote full-time role requires strong knowledge in machine learning, NLP, and proficiency in Python. Key responsibilities include developing...
Remote work
Senior
Full time
Social Discovery Group
Poland, NY
1 hour ago
Senior Model Serving Engineer (Remote)
$100k - $150k
Bright Vision Technologies is seeking a Model Serving Engineer to design and operate high-performance inference platforms for large machine learning models. The role... ...GPU utilization. This full-time position is 100% remote for candidates in the Continental United States...
Remote job
Senior
Full time
Bright Vision Technologies
Katy, TX
2 days ago
Senior Engineer 2: AI Inference Engine Systems
$167.2k - $209k
...applications. We are seeking a Senior Engineer 2 to join our AI Inference Data Plane team. In... ...and scale their models with industry-leading... ...distributed inference serving frameworks such as llm‑d, NVIDIA Dynamo, or Ray... ...- $209,000 This is a remote role Why You’ll Like Working...
Remote work
Senior
Local area
Worldwide
Flexible hours
DigitalOcean
San Francisco, CA
3 days ago
Remote Model Serving Engineer - High-Performance ML Inference
Bright Vision Technologies is seeking a Model Serving Engineer to design and operate highly reliable inference platforms for large machine learning models. This remote full-time position requires strong expertise in distributed systems and performance engineering, offering...
Remote job
Full time
Bright Vision Technologies
Bellevue, WA
1 day ago
Senior AI and Large Language Model (LLM) Engineer
...Senior AI and Large Language Model (LLM) Engineer NIH-Bethesda Location Bethesda, Maryland (On-site / Not-remote) Overview We are seeking an experienced AI/LLM Engineer to lead... ...knowledge discovery tools. This role serves as a subject matter expert (SME)...
Senior
Immediate start
Black Canyon Consulting LLC
Bethesda, MD
6 days ago
Senior AI Research Engineer Model Inference Remote
...Job We are looking for an experienced AI Model Engineer with deep expertise in kernel... ...acceleration. The engineer will extend the inference framework to support inference and fine... ...functional teams to integrate optimized serving and inference frameworks into production...
Remote job
Senior
Framework Ventures
New York, NY
2 days ago
Senior Software Development Engineer - LLM Kernel & Inference Systems
...your career. THE ROLE As a senior member of the LLM inference framework team, you will be... ...for large language models on AMD GPUs. You will work... ...first‑class platform for LLM serving. This role sits at the intersection of inference engines, distributed systems, and GPU...
Senior
Advanced Micro Devices
Santa Clara, CA
1 day ago
Model Serving Engineer
...most powerful Large Tabular Model (LTM) – purpose-built for the... .... About the Role Our Serving team is responsible for turning... ...of research and production engineering. We work closely with... ...meaningfully from traditional LLM inference, including irregular computational...
Remote work
Work at office
Relocation package
Fundamental
United States
1 day ago
Model Serving Engineer
$100k - $150k
...Model Serving Engineer Bright Vision Technologies is a forward-thinking software... ...Engineer Location: 100% Remote (Continental United States)... ..., highly reliable inference platforms for serving large... ...and KV cache strategies for LLM serving workloads. Integrate...
Remote work
Full time
H1b
Local area
Immediate start
Visa sponsorship
Bright Vision Technologies
United States
3 days ago
Senior Model-Based Systems Engineer (Undersea Systems)
$145.2k - $252.48k
...Responsibilities We are seeking an expert Senior Model-Based Systems Engineer (MBSE) to lead the modeling and... ...requirements baselines. Serve as a subject matter expert in MBSE methodologies... ...role, geographic location (For Remote Opportunities), education and...
Remote work
Senior
Hourly pay
Contract work
Temporary work
For contractors
Work experience placement
Arcfield
Washington DC
3 days ago
Real-Time Inference & Model Serving Engineer (Equity)
$220k - $320k
...ML Model Serving Engineer Want to build the layer that actually makes AI usable in real time? You’ll join a team focused on inference, where performance is the product. This is about delivering low... ...-performance serving systems for LLM, speech, and vision models Scaling...
3 days per week
Trades Workforce Solutions
San Francisco, CA
4 days ago
Senior Security Engineer, AI Model and Application
$135k
...Senior Security Engineer AI Model And Application The Senior Security... ...Security Engineer will serve as the subject... ...expert (SME) in AI and LLM security across the... ...training pipelines to inference endpoints and user-facing... ...works on-site or remote based on the...
Remote work
Senior
Temporary work
Work at office
Monday to Friday
Flexible hours
ImmunityBio
United States
1 day ago
Product Manager AI Inference Model Serving
...Product Manager to own AI inference and model serving for k0rdent AI, our... ..., and performance engineering. You will define how... ...product management, or a senior technical role owning... ..., SGLang, TensorRT-LLM, Dynamo, Triton) and... ...job opportunities. #remote We are a Leader...
Remote work
Mirantis
Austin, TX
1 day ago
Senior AI Inference Engineer - Model Optimization & Deployment
$242k - $290k
...of a multi-modality foundation model to drive the next generation... ...Model Optimization & Deployment Engineer, you will focus on bringing... ..., and build highly concurrent inference code to ensure real-time, deterministic... ...technologies (e.g., TensorRT-LLM). $242,000 - $290,000 a...
Remote work
Senior
Temporary work
Relocation package
Zoox
Nacogdoches, TX
1 day ago
Machine Learning Infrastructure Engineer- Model Inference
...ML Infrastructure Engineer, Model Inference As an ML Infrastructure Engineer, Model Inference at Abridge... ..., optimize, and maintain ML model serving infrastructure, ensuring high-... ...such as NVIDIA Triton Server, VLLM, TRT-LLM and so on. Expertise with ML toolchains...
Remote work
Hourly pay
Full time
Flexible hours
Abridge
United States
3 days ago
Model Behavior Engineer
$98k - $140k
...You'll work with product and engineering teams to build systems to define... ...of that you'll shape Notion's model strategy and work directly... ...exploring the "jagged frontier" of LLM capabilities and how AI... ...working with data — You can self-serve insights from large datasets,...
Remote work
Live in
Local area
Notion, LLC
United States
1 day ago
Senior Edge-LLM Real-Time Inference Engineer
...NVIDIA Gruppe is looking for a skilled engineer to join their TensorRT Edge-LLM team in Santa Clara, California. The role involves developing a state-of-the-art inference framework for large language models and optimizing it for real-time performance on embedded platforms...
Senior
NVIDIA Gruppe
Santa Clara, CA
4 days ago
LLM Inference & Model-Performance Engineer
...A leading AI platform company in San Francisco is seeking a Software Engineer focused on machine learning performance. This role involves implementing advanced techniques for ML model inference and debugging performance issues with frameworks like PyTorch and TensorRT...
Baseten
San Francisco, CA
3 days ago
Senior Distinguished Engineer, AI Compute (Remote Eligible)
$314.8k - $359.3k
Senior Distinguished Engineer, AI Compute (Remote Eligible) At Capital One, we are creating... ...reimagine how we serve our customers and... ...diverse workloads from LLM pre‑training and... ...running large foundation models Work cross‑... ...or high‑throughput inference Hands‑on experience...
Remote job
Senior
Full time
Part time
Local area
Capital One
Cambridge, MA
1 day ago
Senior LLM Performance Engineer - GPU Inference
$184k - $356.5k
A leading AI computing company in California is seeking a Senior Deep Learning Software Engineer focused on performance optimization of LLM models. You will analyze and enhance LLM inference performance, working in cross-collaborative teams to implement cutting-edge algorithms...
Senior
Full time
NVIDIA Corporation
Santa Clara, CA
5 days ago
Senior On-Prem LLM Inference & GPU Systems Engineer
NTT Data Americas, Inc. is looking for an On-Premise LLM Inference & GPU Systems Engineer to enhance our large-scale LLM infrastructure in Charlotte,... ...operating OpenShift AI, and overseeing the Hugging Face model lifecycle. Join us to build innovative solutions in an inclusive...
Senior
NTT Data Americas, Inc.
Charlotte, NC
4 days ago
Senior Machine Learning Engineer, Agentic Systems - Moveworks
...LLMs, our proprietary models, and a sophisticated Agentic... ...Moveworks' Reasoning Engine and natural language... ...for building and serving LLMs at Moveworks. This role... ...training and inference pipeline for large language... ...Work personas (flexible, remote, or required in office...
Remote work
Senior
Work at office
Flexible hours
ServiceNow
Mountain View, CA
5 days ago
Senior Performance Engineer, Inference
...leading training and inference speeds and empowers machine... ...include top model labs, global enterprises... ...We are hiring a Senior Performance Engineer to join our Product team... ...performance and will serve as our resident expert... ...vLLM, SGLang, TensorRT-LLM), GPU kernel-level...
Senior
Contract work
Shift work
CEREBRAS SYSTEMS INC.
Sunnyvale, CA
4 days ago
Senior DL Algorithms Engineer - Inference Performance
$184k - $287.5k
...Senior DL Algorithms Engineer - Inference Performance... ...Implement language and multimodal model inference as part of NVIDIA... ...production code to TRT-LLM, NVIDIA’s open-source inference serving library. Profile and analyze...
Senior
NVIDIA Corporation
Santa Clara, CA
5 days ago
Distributed LLM Inference Engineer
...Distributed LLM Inference Engineer At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We're commercializing Ray, a popular open-source project that's creating an ecosystem of libraries...
Remote work
Work at office
Anyscale
United States
3 days ago
Senior Air Dispersion Modeler
...job poster from EDGE Engineering and Science, LLC... ...Science We are seeking a Senior Air Dispersion Modeler to lead and manage... ...helps the companies we serve, it ensures the... ...Science, LLC by 2x Inferred from the description... ...Actuarial Consultant - REMOTE United States $105,0...
Remote work
Senior
Full time
For contractors
Local area
EDGE Engineering and Science, LLC
New York, NY
2 days ago
Senior AI Systems Engineer
..., cloud, and platform engineering teams. Operationalize... ...MLOps pipelines for model packaging, testing, deployment... ...large language model (LLM) APIs and... ...Experience with model serving, inference optimization, or AI platform... ...experience REMOTE WORK NOTICE: This position...
Remote work
Senior
Work at office
ARA
Raleigh, NC
4 days ago
