ML Systems Engineer: Production-Scale LLM Inference

ScOp Venture Capital

ScOp Venture Capital is looking for an ML Systems Engineer to optimize the LLM inference systems at the core of its AI platform. The role focuses on improving performance and efficiency through low-level systems optimization, directly affecting the processes of industry leaders in semiconductor design. A successful candidate will have a strong background in ML systems and GPU optimization, plus programming skills in Python and C++. The position offers competitive compensation and professional growth within a leading AI-focused environment.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the ML Systems Engineer: Production-Scale LLM Inference in Santa Clara, CA vacancy
  •  ...generative AI to assist engineers in RTL design,...  ...is deployed in production to companies that...  ...We are seeking an ML Systems Engineer to optimize...  ...large language model inference powering our...  ...push the limits of LLM throughput and latency...  ...with large‑scale ML systems, GPU computing... 
    Suggested

    ScOp Venture Capital

    Santa Clara, CA
    2 days ago
  •  ...technology company is hiring a Machine Learning Systems Engineer in Cupertino, California. You will...  ...modeling teams to optimize model training and inference on Apple's custom Silicon. The ideal candidate has strong experience in ML models, with proficiency in Python and... 
    Suggested

    Apple

    Cupertino, CA
    4 days ago
  • $152k - $287.5k

    NVIDIA Gruppe is seeking a Senior Machine Learning Applications and Compiler Engineer in Santa Clara, California. This role involves developing algorithms for their LPX inference and compiler stack, optimizing the performance of neural network workloads on NVIDIA platforms... 
    Suggested

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $147.4k - $272.1k

     ...Description As a Machine Learning Systems Engineer, you will work closely...  ...model training and inference. You will be working across the ML stack at Apple, finding...  ...teams. You will write production-level code to train and...  ...projects Expertise in ML and LLM optimization such as... 
    Suggested
    Relocation

    Apple

    Cupertino, CA
    3 days ago
  • $207k - $300k

    Google Inc. is seeking a Software Engineer in Sunnyvale, CA, to develop cutting-edge technologies for serving Large Language Models. This...  ...candidate will have extensive experience in software development, ML infrastructure, and performance profiling. The US base salary... 
    Suggested
    Full time

    Google Inc.

    Sunnyvale, CA
    3 days ago
  •  ...infrastructure company in California is seeking a Member of Technical Staff — Inference to design and optimize large-scale AI inference systems. The role demands 5+ years in systems engineering and expertise in large-scale inference systems. Successful candidates will... 
    Flexible hours

    RadixArk

    Palo Alto, CA
    14 hours ago
  • Cerebras Systems builds the world's largest...  .... Our novel wafer‑scale architecture provides...  ...training and inference speeds and empowers...  ...effortlessly run large‑scale ML applications,...  ...and experienced engineer to join our SOTA...  ...teams build better products and companies. We... 
    Internship

    Cerebras

    Sunnyvale, CA
    14 hours ago
  • $224k - $356.5k

     ...building agentic systems that can reason about...  ...across the ML lifecycle, including...  ...developer and researcher productivity. Create self‑...  ...and evolve large‑scale Python and PyTorch...  ...iteration. Raise engineering excellence through...  ...Strong agency in LLM‑based systems, such... 

    NVIDIA Corporation

    Santa Clara, CA
    4 hours ago
  •  ...hardware and robot systems to the...  ...manufacturing scale‑up to make generalist...  ...looking for an Inference Optimization MLE...  ...efficiently in production. You'll be...  ...with research engineers to translate model...  ...optimization, ML systems, or a closely...  ..., or other LLM serving optimizations... 

    Rhoda AI

    Mountain View, CA
    4 days ago
  • Rhoda AI in Mountain View is seeking a Staff / Principal ML Training Systems Engineer to lead the performance of large-scale multimodal training systems. This role involves improving training efficiency and collaborating closely with research teams to accelerate model iteration... 

    Rhoda AI

    Mountain View, CA
    14 hours ago
  •  ...California seeks a Member of Technical Staff — Training to design and optimize large-scale distributed training systems for frontier AI models. Candidates should have 5+ years of experience in ML systems and be proficient in Python along with another systems language, such... 

    RadixArk

    Palo Alto, CA
    3 days ago
  •  ...automotive company is seeking a Staff ML Infrastructure Engineer to build robust compute platforms for...  ...decision-making, and driving large-scale initiatives across GM's ML ecosystem....  ..., Python or C++, and expertise in ML inference. The position offers a hybrid work model... 

    General Motors

    Sunnyvale, CA
    3 days ago
  •  ...advanced AI agents and agentic systems. Architect and implement...  ...of AI agents at scale. Diagnose and troubleshoot...  ...AI and apply them to our products. Leverage enterprise data...  ...strategies (QLORA, DPO) and inference optimization (vLLM, TensorRT‑LLM). Research experience in... 
    Work experience placement

    Nutanix

    Santa Clara, CA
    1 day ago
  •  ...better job search engine: fast, comprehensive...  ...looking for a founding ML engineer who can...  ...fast, reliable production systems. You will own the bridge...  ...models, optimizing inference latency and throughput, scaling serving systems,...  ...optimization, or modern LLM/embedding/ranking... 
    Relocation package

    HiringCafe

    Cupertino, CA
    4 days ago
  • $142.7k - $270.95k

     ...researcher - Machine Learning Systems & Efficiency Engineer to join our R&D team...  ...delivering practical, production-ready improvements in inference performance, latency,...  ...: deliver high-quality ML systems at...  ...experience implementing and scaling large-scale inference or... 
    Full time
    Temporary work
    Local area
    Worldwide

    Adobe

    San Jose, CA
    2 days ago
  •  ...Job Title: ML Engineer What You Will Own End‑to‑End ML Lifecycle across real products: data ingestion, feature design...  ...Production‑grade ML systems built with PyTorch or...  ...modes. Applied GenAI and LLM work that creates...  ...deploying, maintaining and scaling ML models in... 

    MetAntz

    Palo Alto, CA
    4 days ago
  • $138k - $198k

    Google Inc. seeks a Systems Development Engineer in Sunnyvale, CA, responsible for managing services and systems at scale. Candidates need a Bachelor's degree in Computer Science or similar, along with experience in systems automation and technical infrastructure. Successful... 

    Google Inc.

    Sunnyvale, CA
    14 hours ago
  •  ...recent advances in AI into systems that are reliable, useful...  ...frontier AI models serve production traffic at enterprise scale, we would like to meet...  ...looking for a Core Systems Engineer with a deep mastery of...  ...data and compute complex inference logic with ultra-low latency... 

    Sequen

    Santa Clara, CA
    14 hours ago
  •  ...looking for a Founding Machine Learning Infrastructure Engineer in Palo Alto to help optimize infrastructure for AI systems. In this role, you will focus on enhancing model...  ...ideal candidate will have strong experience in ML and distributed systems, and troubleshooting... 

    Model AI

    Palo Alto, CA
    3 days ago
  •  ...a Member of Technical Staff in Machine Learning to build core ML components. You will work on real production systems from day one, developing strong systems judgment by shipping and debugging large-scale ML models. The ideal candidate will have strong foundations in... 

    A1

    Palo Alto, CA
    3 days ago
  • HiringCafe is seeking a Founding ML Engineer in Cupertino to transform AI and ML models into reliable production systems. You'll be responsible for deploying models, optimizing their performance, and ensuring they run efficiently in production. Success in this role requires... 

    HiringCafe

    Cupertino, CA
    2 days ago
  • Harmonic in Palo Alto is seeking a pragmatic Software Engineer to lead the productionization of research pipelines within our advanced AI projects. You will engage in building robust ML pipelines as part of a cutting-edge team, ensuring efficient coding practices and scalable... 

    Harmonic

    Palo Alto, CA
    2 days ago
  • $150k - $230k

     ...on Machine Learning Engineer to drive the post‑...  ...work directly with product and business teams...  ...timelines. Run large‑scale training on mid‑to‑...  ...Hands‑on LLM post‑training experience...  ...data engineering for ML. You can independently...  ...or FSDP, and inference engines like vLLM.... 
    Full time

    GoTo Meeting

    Mountain View, CA
    1 day ago
  •  ...This role focuses on advancing state-of-the-art LLM and ML techniques. The successful candidate will have...  ...end ownership of features within the Siri Search system, contributing to impactful innovations across Apple products. Ideal candidates should possess a Ph.D. or... 

    NLP PEOPLE

    Santa Clara, CA
    3 days ago
  • Lifeattinder is seeking a Machine Learning Engineer II to build and ship systems that enhance product experiences and drive business impact. You will translate product...  ...Alto, California. Ideal candidates have a strong ML foundation, industry experience, and the ability to... 
    Work at office

    Lifeattinder

    Palo Alto, CA
    2 days ago
  • AI Chopping Block, Inc. based in Palo Alto, California, is seeking a Software Engineer to lead the development of production-ready ML pipelines. This role emphasizes the transformation of research ideas into scalable software solutions, utilizing robust engineering practices... 

    AI Chopping Block, Inc.

    Palo Alto, CA
    3 days ago
  •  ...remove the limits of scale, hardware, and...  ...that makes every system — from a laptop to...  ...like one seamless engine. Developers can write...  ...for a Senior ML Performance Engineer...  ...platform for evaluating LLM inference workloads across...  ...influences product quality and customer... 

    Lemurian Labs

    Santa Clara, CA
    14 days ago
  • $181.1k - $272.1k

     ...ideal candidate will develop and fine-tune domain-specific LLMs for various NLP tasks and ensure the translation of product requirements into engineering tasks. This role offers a competitive salary between $181,100 and $272,100, comprehensive benefits, and opportunities... 

    Apple Inc.

    Cupertino, CA
    4 days ago
  • General Motors is seeking a Senior ML Infrastructure Engineer to build and scale a robust platform for machine learning inference workflows. You will design backend software components...  ...years of experience in machine learning systems, proficiency in Python and C++, and a... 
    Remote job

    General Motors

    Sunnyvale, CA
    4 days ago
  • $195k - $298k

     ...assistance. About the Team The ML Inference Platform is part of the...  ...building AI-driven products for GM and its customers....  ...Staff ML Infrastructure engineer to help build and scale robust Compute platforms...  ...in designing distributed systems for ML, strong problem-solving... 
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    4 days ago
