Staff ML Inference Engineer — Model Efficiency (Remote)
Jaide Health
- Remote job
Jaide Health is seeking an engineer for its Model Efficiency team in San Francisco. The role focuses on building reliable ML systems while improving core performance metrics across model execution. You'll work with advanced performance techniques such as GPU/CUDA optimizations and collaborate closely with modeling and systems teams. Ideal candidates will have over 5 years of experience in high-performance coding, strong skills in C++ or Python, and insight into the LLM inference ecosystem. The company values diversity and an inclusive work culture.
- ...Francisco is seeking a Member of Technical Staff specialized in Model Efficiency. In this role, you will enhance LLM inference systems by tackling performance issues and collaborating... ...environment. This position offers a remote-friendly work model, a competitive salary,... Remote work
- ...cutting-edge foundation AI models and end-to-end products that... ...is a team of researchers, engineers, designers, and more, who are... ...systems and optimize audio inference serving efficiency using innovative techniques... ...and London. We embrace a remote-friendly environment, and as... Remote work, Work at office
- ...ML Infrastructure Engineer, Model Inference As an ML Infrastructure Engineer, Model Inference at Abridge, you'll play a pivotal role in building and... ...will be instrumental in enhancing the scalability, efficiency, and performance of our AI-driven solutions. You will... Remote work, Hourly pay, Full time, Flexible hours
- Cohere is seeking an engineering professional in New York to develop and optimize audio machine... ...with cross-functional teams to improve audio model metrics, addressing latency and throughput while ensuring real-time audio inference integration. The ideal candidate will... Remote job
- ...100x better job search engine: fast, comprehensive, honest... ...looking for a founding ML engineer who can help... ...powerful AI and ML models into fast, reliable production... ...models, optimizing inference latency and throughput,... ...sure our models run efficiently in production. This is... Suggested, Relocation package
- ...Model Efficiency Team Engineer Cohere is the leading security-first enterprise AI company... ...focused on building reliable ML systems and pushing the boundaries of LLM inference efficiency. We develop... ...Where We Work: Cohere is remote-friendly. We have offices in... Remote work, Work at office
- Member of Technical Staff, Model Efficiency Who are we? Our mission... ...team of researchers, engineers, designers, and more,... ...on building reliable ML systems and pushing the boundaries of LLM inference efficiency. We... ..., Seoul, and London. Remote‑friendly environment,... Remote work, Full time, Work at office, Flexible hours
$170k - $216k
...Machine Learning Engineer, Model Optimization Waymo is an... ...) develop methods for efficiently and continuously learning... ...training and model inference through model architecture... ...~ Experience with ML frameworks like PyTorch... ...role can be performed remote, the specific salary... Remote work, Full time
$128.7k - $261.3k
...repeatable, high-velocity model deployments through... ...deployment and infra engineers to ship numerically robust... ..., Data Science / ML, or a closely related... .../ model compression / efficient inference or relevant experience... ...This role is based remotely, but if the selected candidate... Remote work, Local area, Work from home, Relocation package, Flexible hours
- ...in San Francisco is seeking a Staff Research Engineer to enhance the efficiency of large language models. In this role, you will... ...with model architecture and inference optimization. Join a diverse... ...innovation within a collaborative and remote-friendly work culture,... Remote work
$50k - $60k
Apex Systems is hiring a Principal Machine Learning Engineer for Model Efficiency & Optimization in Austin, Texas. This senior individual contributor role involves overseeing model optimization strategy and ensuring high-performing, production-ready models for document...
- ...deploying frontier models for developers and... ...of researchers, engineers, designers, and more... ...can do — but inference is still the bottleneck. The Model Efficiency team is responsible... ...London. We embrace a remote-friendly environment... ...locations. As a Staff Research Engineer,... Remote work, Full time, Work at office, Flexible hours
$242k - $290k
...multi-modality foundation model to drive the next... ...Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-... .... You will optimize the ML models, write custom CUDA... ...build highly concurrent inference code to ensure real-time... Remote work, Temporary work, Relocation package
$155.42k - $205.9k
...About the Team: The ML Inference Platform is part of... ...agnostic, reliable, and cost-efficient platform that powers... ...) machine learning models for experimental, online... ...ML Infrastructure engineer to help build and scale... ...relocation benefits. Remote/Hybrid: This role is... Remote work, Local area, Work from home, Relocation, Relocation package, Flexible hours
- A healthcare technology firm in San Francisco is seeking an ML Infrastructure Engineer, Model Inference to build and optimize AI-driven solutions. You will design scalable Kubernetes clusters, enhance ML model serving infrastructure, and collaborate with cross-functional...
- ...automotive company seeks a Senior ML Infrastructure Engineer in Austin, Texas, to... ...backend software for ML inference workflows. The engineer will... ...ML engineers to ensure efficient model serving and lead technical... ...compensation and benefits, with a remote work option... Remote work
$180k - $275k
...builds foundation models of human behavior... ...from the ground up. Engineers here own major... ...Databricks Fully remote or hybrid from several... ...focused on Inference and Serving at Yobi... ...This is an applied ML systems role-equal... ...caching, batching, and efficient feature retrieval.... Remote work
$150k - $300k
...systems as part of a hybrid team. This role focuses on developing efficient architecture for serving LLMs and optimizing performance using... ...infrastructure tools. Ideal candidates will have significant experience with ML systems, ensuring robust performance and scalability. The... Remote job
$180k - $210k
...Overview: The Principal AI/ML Engineer will support the development... ...learning, and large language models. We offer generous... ...variety of applications within remote sensing such as tasking collections... ...engineering techniques / Inference time techniques (e.g. chain of... Remote work, Temporary work, Work at office, Local area, Visa sponsorship, Relocation package, Flexible hours
- ...technical Product Manager to own AI inference and model serving for k0rdent AI, our... ...systems, and performance engineering. You will define how... ...senior technical role owning AI/ML and inference product(s) ~... ...future job opportunities. #remote We are a Leader for Container... Remote work
$180k - $210k
...Position Overview The Principal AI/ML Engineer will support the development... ...learning, and large language models. We offer generous... ...variety of applications within remote sensing such as tasking collections... ...engineering techniques / Inference time techniques (e.g. chain of... Remote work, Full time, Temporary work, Work at office, Local area, Visa sponsorship, Relocation package, Flexible hours
- ...seeking a Senior Machine Learning Engineer to spearhead core machine learning models and manage data pipelines. The ideal... ...strong technical skills in ML methods, including deep learning,... ...concepts for various stakeholders. A remote work option is available... Remote job
- ...our Machine Learning and Inference Platform that powers... ...hardware, software, and models. We're looking for a strong... ...deep experience in ML serving, high-performance... ...excited to mentor engineers, innovate at scale, and... ...Fridays are flexible for remote work except for employees... Remote work, Work at office, Local area, Monday to Thursday, Flexible hours
$175k - $280k
...New York is seeking an expert in optimizing machine learning models to turbocharge their serving layer, integrating LLM, speech, and... ...significant experience in systems programming and performance engineering, aiming to improve high-throughput, low-latency serving. Join...
$128.7k - $261.3k
...Model Deployment & Inference Solutions Team The Model Deployment & Inference Solutions... ...is two-fold: build the ML deployment platform that... ...currently performed manually by engineers. Build the developer... ...AV-1 This role is based remotely, but if the selected candidate... Remote work, Local area, Work from home, Flexible hours, Shift work
- Principal Machine Learning Engineer - Model Efficiency & Optimization. Job#: 3036752. Job Description: Principal Machine Learning Engineer - Model Efficiency & Optimization. Location: Austin, Texas (Onsite). Role Overview: We are seeking a Principal Machine Learning... Full time
- ...leading technology company is seeking a skilled ML Engineer responsible for developing and maintaining data pipelines for model training and evaluation. Candidates should... ...competitive compensation, the opportunity to work remotely from anywhere in the world, and access to... Remote job
- Israelvcforum is looking for a Senior ML Infrastructure Engineer in Mountain View,... ...robust platforms for ML inference workflows supporting GM’s AI... ...and researchers to implement model serving strategies and... ...skills. The role offers a remote work setup with required visits... Remote job
$50 per hour
...branch of Sony AI, is a remotely distributed... ...Multimodal Foundation Model for Vision... ...intern is to develop efficient and effective methodologies... ...-class scientists and engineers to tackle the most challenging... ...on model compression, inference speedup, deployment on... Remote work, Hourly pay, Internship, Local area, Worldwide, Flexible hours

