
ML Framework Engineer (MetalLM) for GPU Inference

Apple Oakbrook

Apple Inc. in Cupertino, California, is seeking an experienced ML Framework Engineer to join their Server ML Frameworks team. This role focuses on enabling Apple Intelligence through high-performance ML applications and involves working on custom-built server hardware for distributed inference. The ideal candidate will have a strong programming background and expertise in GPU compute, with responsibilities including optimizing ML frameworks and collaborating on GPU architecture design. Competitive benefits and pay are offered.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the ML Framework Engineer (MetalLM) for GPU Inference in Cupertino, CA vacancy
  • $147.4k - $272.1k

     ...ML Framework (MetalLM) Engineer, Graphics, Game and ML Cupertino, California, United States Apple’s Server ML Frameworks team in GPU, Graphics and Machine Learning works on enabling Apple Intelligence...  ...high-performance, distributed inference of GenAI applications (such as... 
    Suggested
    Relocation package

    Apple

    Cupertino, CA
    10 hours ago
  •  ...About The Role The Inference ML Engineering team at Cerebras Systems is dedicated to enabling our fast...  ...models. Familiarity with LLM serving frameworks, such as vLLM, SGLang, and TensorRT‑LLM...  ...beyond the constraints of the GPU. Publish and open source their cutting... 
    Suggested

    Dormont Manufacturing Company

    Sunnyvale, CA
    1 day ago
  •  ...Inference Optimization MLE At Rhoda AI, we're building...  ...closely with research engineers to translate model innovations...  ...optimization, ML systems, or a closely related...  ...with inference serving frameworks (e.g., Triton, TensorRT...  ...Not Required) GPU kernel or compiler-... 
    Suggested

    Rhoda ai

    Palo Alto, CA
    3 days ago
  •  ...100x better job search engine: fast, comprehensive, honest...  ...looking for a founding ML engineer who can help...  ...models, optimizing inference latency and throughput,...  ...of model performance, GPU utilization, inference...  ...worked with inference frameworks or serving stacks such... 
    Suggested
    Full time
    Relocation package

    HiringCafe

    Cupertino, CA
    3 days ago
  •  ...that a reality. We're looking for an ML Infrastructure Engineer to help build and operate the inference systems that power our automation...  ...cloud providers (e.g., AWS, GCP) and GPU orchestration Familiarity with common ML frameworks (e.g., PyTorch, TensorFlow) and model... 
    Suggested

    Rhoda AI

    Palo Alto, CA
    3 days ago
  • $128.7k - $261.3k

     ...The Model Deployment & Inference Solutions team in GM AV...  ...learning models from training frameworks (e.g. PyTorch) onto...  ...is two-fold: build the ML deployment platform...  ...equivalent) as part of your engineering workflow. Experience...  ...with the NVIDIA GPU stack at the integration... 
    Flexible hours

    General Motors

    Sunnyvale, CA
    1 day ago
  •  ...hiring a Machine Learning Systems Engineer in Cupertino, California. You...  ...optimize model training and inference on Apple's custom Silicon....  ...has strong experience in ML models, with proficiency in Python...  ...and knowledge of various ML frameworks. The role offers competitive... 

    Apple

    Cupertino, CA
    11 hours ago
  • $195k - $298k

     ...assistance. About the Team The ML Inference Platform is part of the AI...  ...We’re committed to maximizing GPU utilization across platforms...  ...seeking a Staff ML Infrastructure engineer to help build and scale...  ...state-of-the‑art model serving frameworks, hardware accelerators, and distributed... 
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    11 hours ago
  • $272k - $431.25k

     ...We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at NVIDIA to join our Hardware...  ...developments in AI/ML technologies, frameworks, and successful strategies, and...  ...data processing, model training, and inference pipelines. Proficiency in programming... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $128.7k - $261.3k

     ...export, kernel development, and performance engineering so that every cycle on our accelerators...  ...The AI Kernels team builds high‑performance GPU kernels and custom libraries that sit at the heart of our on‑vehicle ML inference for ADAS and autonomous driving. We own making... 
    Full time
    Local area
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    2 hours ago
  •  ...Software Engineer Applied Intuition, Inc. is powering...  ...in optimizing ML models and deploying them...  ...work across the entire ML framework stack (e.g. PyTorch, JAX...  ...efficiency and latency of model inference for compute boards...  ...with ML accelerators, GPU, CPU, SoC architecture... 
    For contractors
    For subcontractor
    Casual work
    Work at office
    Remote work
    Day shift

    Applied Intuition

    Sunnyvale, CA
    4 days ago
  • $246.5k

     ...our Machine Learning and Inference Platform that powers the...  ...with deep experience in ML serving, high-performance...  ...computing, and industry standard frameworks - someone excited to mentor engineers, innovate at scale, and...  ...-software co-design, GPU acceleration, and HPC techniques... 
    Work at office
    Local area
    Remote work
    Monday to Thursday
    Flexible hours

    Roku

    San Jose, CA
    4 days ago
  • $128.7k - $261.3k

     ...development, and performance engineering so that every cycle on...  ...into fast, reliable inference across GPUs powering GM...  ...compiler, systems, and GPU engineers who enjoy...  ...reliable, and effortless for ML engineers across the AV...  ...~ Experience with ML frameworks (e.g., PyTorch,... 
    Local area
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    3 days ago
  •  ...Senior / Staff Software Engineer with strong distributed...  ...Apple’s unified data+ML platform powered by open...  ...large-scale training and inference workloads. You will...  ...environments—from bare-metal GPU clusters to cloud-...  ...~ Experience with ML frameworks such as PyTorch or TensorFlow... 

    Apple

    Cupertino, CA
    2 days ago
  • $152k - $287.5k

     ...Machine Learning Applications and Compiler Engineer in Santa Clara, California. This role...  ...involves developing algorithms for their LPX inference and compiler stack, optimizing the...  ...skills, and familiarity with deep learning frameworks. The position offers a competitive... 

    NVIDIA Gruppe

    Santa Clara, CA
    10 hours ago
  • $124k - $195.5k

     ...Machine Learning Applications and Compiler Engineer for New College Grad 2026 in Santa...  ...will focus on developing algorithms for inference and compiler stack optimizations, working...  ...development, and experience with ML frameworks like TensorFlow and PyTorch. A MS or PhD... 

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $278.1k - $347.6k

     ...Principal Machine Learning Engineer, Mobile AI Inference Optimization Location...  ...decisions across the full mobile ML stack, and mentor a team of...  ...kernel tuning on NPU, GPU, and CPU. Architecture &...  ...to open-source ML inference frameworks or mobile ML research publications... 
    Work at office
    Worldwide
    Relocation package

    Unity Technologies

    Mountain View, CA
    3 days ago
  •  ...A leading automotive company is seeking a Staff ML Infrastructure Engineer to build robust compute platforms for machine learning workflows in...  ...strong coding skills in Go, Python or C++, and expertise in ML inference. The position offers a hybrid work model and competitive... 

    General Motors

    Sunnyvale, CA
    10 hours ago
  •  ...Rhoda ai in Palo Alto is seeking an Inference Infrastructure Engineer to help power their model deployment stack for humanoid robots. This role involves...  ...deployment pipelines and resource optimization across GPU clusters, you will play a crucial role in scaling the technology... 

    Rhoda ai

    Palo Alto, CA
    1 day ago
  •  ...industry‑leading training and inference speeds and empowers...  ...run large‑scale ML applications, without the...  ...over 10 times faster than GPU‑based hyperscale cloud...  ...versatile and experienced engineer to join our SOTA...  ...Experience with deep learning frameworks (e.g., PyTorch,... 
    Internship

    Cerebras

    Sunnyvale, CA
    2 days ago
  • Lemurian Labs is seeking a Senior ML Performance Engineer in Santa Clara, California, to architect and...  ..., a deep understanding of ML inference workloads, and strong programming skills...  ...opportunity to work with cutting-edge GPU hardware and next-gen large language models... 

    Lemurian Labs

    Santa Clara, CA
    1 day ago
  •  ...industry‑leading training and inference speeds and empowers...  ...run large‑scale ML applications, without the...  ...over 10 times faster than GPU‑based hyperscale cloud...  ...looking for a Software Engineer to join the ML Integration...  ...tools, testing frameworks, or internal developer... 
    Work at office
    Remote work

    Dormont Manufacturing Company

    Sunnyvale, CA
    11 hours ago
  •  ...About the job ML Engineer Our Client Is a rapidly growing Tier 1 VC backed...  ...preprocessing, model training, deployment, inference, and monitoring in production...  ...plus). ~ Hands-on experience with ML frameworks such as PyTorch or TensorFlow. ~ Familiarity... 
    Full time

    Catalyst Labs, LLC

    Sunnyvale, CA
    1 day ago
  • $129k - $198.4k

     ...General Motors is seeking an AI/ML Engineer for the Metrics Frameworks team in Sunnyvale, California. The successful candidate will focus on developing analytics frameworks and tools to accelerate autonomous vehicle development and testing. Candidates should have a BS... 

    General Motors

    Sunnyvale, CA
    10 hours ago
  •  ...industry-leading training and inference speeds and empowers machine learning...  ...effortlessly run large-scale ML applications, without the...  ...world, over 10 times faster than GPU-based hyperscale cloud...  ...cause investigation. Build frameworks for failure classification, regression... 

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    1 day ago
  • $129k - $198.4k

     ...Job Description Role: As an AI/ML Engineer on the Metrics Frameworks team, part of the Simulation, Evaluation, and Data organization, you will be an individual contributor focused on developing and optimizing infrastructure to accelerate autonomous vehicle development... 
    Local area
    Work from home

    General Motors

    Sunnyvale, CA
    2 days ago
  • $184k - $287.5k

     ...workflows that chain model inference, retrieval, and...  ...correct? Build golden-set frameworks and calibration loops for...  ...are candidates for ML replacement and build the...  ...Computer Science, Computer Engineering, or a related technical...  ...JAX. Comfortable with GPU‑based training... 

    NVIDIA Gruppe

    Santa Clara, CA
    10 hours ago
  • $147.4k - $272.1k

     ...quality user‑centric search and data platform, and the primary inference platform that enables next generation user experiences for...  ...are in search of an accomplished and driven Machine Learning Engineer who has a robust understanding of Large Language Models, Generative... 
    Relocation

    Apple

    Cupertino, CA
    11 hours ago
  • $152k - $287.5k

     ...NVIDIA Gruppe, based in Santa Clara, is seeking a Senior Software Engineer to accelerate the development of machine learning innovations. In this role, you'll design and implement solutions for GPU clusters, enabling researchers to optimize their work. Strong expertise... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $144.7k - $261.3k

     ...environments, cloud infrastructure, and ML/AI GPU platforms for AV research and development...  ...GM is looking for a Senior Performance Engineer to join the AV Capacity and Performance...  ...reliability of large-scale ML training and inference environments. Your skills &... 
    Local area
    Remote work
    Work from home
    Flexible hours
    3 days per week

    General Motors

    Sunnyvale, CA
    1 day ago
