Staff Machine Learning Training Framework Engineer, GenAI

$173k - $307k

Frame.io

The Opportunity

Adobe Applied Science & Machine Learning (ASML) is seeking a Staff Machine Learning Training Framework Engineer to play a critical role in building and scaling the core training systems behind Adobe’s generative AI foundation models.

In this role, you will serve as a senior technical owner for key components of our training framework, translating research needs into reliable, scalable, and high‑performance training infrastructure. Rather than focusing on a single model, your work will enable multiple multimodal and video foundation models by strengthening the shared systems used to train them.

You will operate at the intersection of applied research and large‑scale systems execution, ensuring that training workflows are robust, reproducible, and performant across large GPU clusters. This role is ideal for a senior engineer who thrives on deep technical ownership, complex execution, and close collaboration with research teams.

Job Responsibilities

  • Training Framework Ownership: Own the design and implementation of major components of the training framework, including abstractions for model configuration, optimizer and scheduler integration, checkpointing, and experiment management.
  • Large‑Scale Training Execution: Implement and support distributed training strategies such as PyTorch FSDP, Tensor Parallelism, and Pipeline Parallelism, ensuring correctness, stability, and scalability across multi‑node GPU environments.
  • Reliability & Fault Tolerance: Improve the resilience of long‑running training jobs by strengthening restartability, state management, and failure handling mechanisms.
  • Performance‑Aware Framework Design: Identify framework‑level inefficiencies and reduce overhead related to memory usage, communication, or execution orchestration in large training runs.
  • Research Enablement: Partner directly with applied researchers to support new model architectures and training requirements, ensuring the framework adapts quickly to evolving research needs.
  • Training Pipeline Integration: Collaborate with infrastructure and platform teams to integrate the training framework with scheduling, storage, monitoring, and logging systems used in production‑scale environments.

What You’ll Need to Succeed

  • Education: Master’s or PhD degree in Computer Science, Electrical Engineering, or a related field, or equivalent practical experience.
  • Strong Systems Engineering Skills: Proficiency in Python and C++, with experience contributing to large, shared codebases that support multiple users or teams.
  • Proven ML Training Experience: Hands‑on experience training models using PyTorch (or JAX), including multi‑GPU and multi‑node distributed training setups.
  • Distributed Systems Understanding: Solid understanding of synchronization, state management, fault tolerance, and performance tradeoffs in distributed systems.
  • Senior‑Level Execution: Demonstrated ability to independently own complex technical problems, drive solutions to completion, and deliver high‑quality systems relied upon by others.

Preferred Experience

  • Experience supporting large‑scale foundation model training or long‑running multi‑node training jobs.
  • Familiarity with ML training infrastructure such as DeepSpeed, Accelerate, or internal training platforms.
  • Experience working closely with applied research teams on rapidly evolving model requirements.
  • Exposure to profiling, debugging, and optimizing training performance at scale.

About Adobe

Adobe empowers everyone to create through innovative platforms and tools that unleash creativity, productivity, and personalized customer experiences. Adobe’s industry-leading offerings, including Adobe Acrobat Studio, Adobe Express, Adobe Firefly, Creative Cloud, Adobe Experience Platform, Adobe Experience Manager, and GenStudio, enable people and businesses to turn ideas into impact, powered by AI and driven by human ingenuity.

Our 30,000+ employees worldwide are creating the future and raising the bar as we drive the next decade of growth. We’re on a mission to hire the very best and believe in creating a company culture where all employees are empowered to make an impact. At Adobe, we believe that great ideas can come from anywhere in the organization. The next big idea could be yours. 

Let’s Adobe together

At Adobe, we believe in creating a company culture where all employees are empowered to make an impact. Learn more about Adobe life, including our values and culture, focus on people, purpose and community, Adobe for All, comprehensive benefits programs, the stories we tell, the customers we serve, and how you can help us advance our mission of empowering everyone to create.

Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other protected characteristic.

Adobe aims to make our Careers website and recruiting process accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email [email address withheld by swooped.co] or call [phone number withheld by swooped.co].

AI Use Guidelines for Interviews:
Our interviews are designed to reflect your own skills and thinking. The use of AI or recording tools during live interviews is not permitted unless explicitly invited by the interviewer or approved in advance as part of a reasonable accommodation. If these tools are used inappropriately or in a way that misrepresents your work, your application may not move forward in the process.

At Adobe, we empower employees to innovate with AI, and we look for candidates eager to do the same. As part of the hiring experience, we provide clear guidance on where AI is encouraged during the process and where it’s restricted during live interviews. See how we think about AI in the hiring experience.

Expected Pay Range:

Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on those defined markets. The U.S. pay range for this position is $172,500 - $306,625 annually. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. Your recruiter can share more about the specific salary range for the job location during the hiring process.

In California, the pay range for this position is $211,800 - $306,625.

At Adobe, starting salaries for sales roles are expressed as total target compensation (TTC = base + commission), and short-term incentives take the form of sales commission plans. Starting salaries for non-sales roles are expressed as base salary, and short-term incentives take the form of the Annual Incentive Plan (AIP).

In addition, certain roles may be eligible for long-term incentives in the form of a new hire equity award.

State-Specific Notices:

California:

Fair Chance Ordinances

Adobe will consider qualified applicants with arrest or conviction records for employment in accordance with state and local laws and “fair chance” ordinances.

Colorado:

Application Window Notice

If this role is open to hiring in Colorado (as listed on the job posting), the application window will remain open until at least the date and time stated above in Pacific Time, in compliance with Colorado pay transparency regulations. If this role does not have Colorado listed as a hiring location, no specific application window applies, and the posting may close at any time based on hiring needs.

Massachusetts:

Massachusetts Legal Notice

It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

Vacancy posted more than 2 months ago