Research Scientist, Privacy-Preserving Large-Scale Model Training & Architecture Optimization

$156k - $316.8k

Ellis Technologies, Inc.

Location: San Jose
Employment Type: Regular
Job Code: DW1L

Responsibilities
  • Design and optimize large-scale training architectures for diffusion-based and unified generative models (e.g., DiT, Rectified Flow, hybrid AR + diffusion systems).
  • Lead GPU-centric performance optimization, including memory layout, communication overlap, kernel fusion, and throughput scaling across thousands of accelerators.
  • Develop and evolve distributed training strategies (DP / TP / PP / ZeRO / FSDP-style sharding) tailored to long-running, multi-stage foundation model training.
  • Build fault-tolerant, self-healing training systems that can sustain long-running jobs under frequent hardware, network, and software failures.
  • Design mechanisms for fast failure detection, recovery, and minimal training interruption, including checkpointing strategies, restart policies, and controlled rollouts.
  • Improve training ETTR / MFU / utilization efficiency under real-world production constraints.
  • Optimize Diffusion Transformer training pipelines, including noise schedules, timestep strategies, and memory-efficient attention mechanisms.
  • Support unified generation-and-understanding models, enabling shared context, long-sequence multimodal reasoning, and scalable training without architectural bottlenecks.
  • Collaborate with research teams on architecture-level tradeoffs between quality, compute efficiency, and training stability.

Qualifications

Minimum Qualifications:
  • Strong background in large-scale deep learning systems and distributed training.
  • Hands-on experience with GPU optimization, including memory management, communication/computation overlap, and performance profiling.
  • Experience training diffusion models, DiT-style architectures, or large foundation models at scale.
  • Proficiency in PyTorch and modern distributed training stacks.
  • Solid understanding of parallelism strategies (DP / TP / PP / ZeRO / FSDP or equivalents).
  • Ability to reason about training stability, numerical issues, and long-running job robustness.

Preferred Qualifications:
  • Experience with privacy-preserving ML, sensitive data training, or regulated environments.
  • Familiarity with fault-tolerant training systems, checkpointing strategies, or production GPU orchestration.
  • Experience with unified multimodal models (generation + understanding) or hybrid AR/diffusion systems.
  • Low-level performance work (CUDA kernels, custom ops, fused attention, or communication libraries).
  • Background in production ML infrastructure supporting thousands of GPUs.

Job Information
The base salary range for this position in the selected city is $156,000 - $316,800 annually.

Benefits
Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, wellbeing benefits, and more. Employees also receive 10 paid holidays per year, 10 paid sick days per year, and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure). The Company reserves the right to modify or change these benefit programs at any time, with or without notice.

Employment Eligibility
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.

Our company believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment: Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues; Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; Exercising sound judgment.

Vacancy posted 13 hours ago