Staff Software Engineer, ML Training and Inference Infrastructure

$228k - $285k
Full-time

Rivian

About Rivian

Rivian is on a mission to keep the world adventurous forever. This goes for the emissions-free Electric Adventure Vehicles we build, and the curious, courageous souls we seek to attract. 

As a company, we constantly challenge what’s possible, never simply accepting what has always been done. We reframe old problems, seek new solutions and operate comfortably in areas that are unknown. Our backgrounds are diverse, but our team shares a love of the outdoors and a desire to protect it for future generations. 


Role Summary

As a Staff Software Engineer, ML Training and Inference Infrastructure, you will be a member of the Perception team at Rivian, which develops advanced machine learning algorithms that directly impact safety-critical self-driving features of our category-defining vehicles.

We are looking for candidates with deep knowledge of, and strong enthusiasm for, establishing state-of-the-art ML infrastructure for training and inference of large autonomous driving models, and for optimizing training and inference performance.


Responsibilities

  • Optimize the performance of deep learning training workloads on NVIDIA GPU systems at large scale
  • Optimize the latency of model inference and model pre- and post-processing on onboard systems
  • Design, train, and deploy large deep learning models that can leverage the vast amount of labeled and unlabeled data

Qualifications

  • PhD in CS/CE/EE, or equivalent industry experience
  • Deep knowledge of PyTorch
  • Knowledge of model training frameworks (e.g., PyTorch Lightning, Ray)
  • In-depth knowledge of transformer architecture and ways to accelerate the training and inference of transformer models
  • Experience performing large-scale distributed training of models
  • A track record of profiling models and doing detective work to improve model training and inference speed

Preferred Skill Requirements:

  • Experience with CUDA or the Triton language for writing custom ops
  • Knowledge of NVIDIA TensorRT
  • Knowledge of NCCL
  • Experience with edge computing systems
  • A track record of efficiently solving complex problems collaboratively on larger teams

Pay Disclosure

Salary Range for California Based Applicants: $228,000.00 - $285,000.00 (actual compensation will be determined based on experience, location, and other factors permitted by law). 

Benefits Summary: Rivian provides robust medical/Rx, dental and vision insurance packages for full-time employees, their spouse or domestic partner, and children up to age 26. Coverage is effective on the first day of employment.

Equal Opportunity

Rivian is an equal opportunity employer and complies with all applicable federal, state, and local fair employment practices laws. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, sex, sexual orientation, gender, gender expression, gender identity, genetic information or characteristics, physical or mental disability, marital/domestic partner status, age, military/veteran status, medical condition, or any other characteristic protected by law.

Rivian is committed to ensuring that our hiring process is accessible for persons with disabilities. If you have a disability or limitation, such as those covered by the Americans with Disabilities Act, that requires accommodations to assist you in the search and application process, please email us (the email address is available on ev.careers).

Candidate Data Privacy

Rivian may collect, use and disclose your personal information or personal data (within the meaning of the applicable data protection laws) when you apply for employment and/or participate in our recruitment processes (“Candidate Personal Data”). This data includes contact, demographic, communications, educational, professional, employment, social media/website, network/device, recruiting system usage/interaction, security and preference information. Rivian may use your Candidate Personal Data for the purposes of (i) tracking interactions with our recruiting system; (ii) carrying out, analyzing and improving our application and recruitment process, including assessing you and your application and conducting employment, background and reference checks; (iii) establishing an employment relationship or entering into an employment contract with you; (iv) complying with our legal, regulatory and corporate governance obligations; (v) recordkeeping; (vi) ensuring network and information security and preventing fraud; and (vii) as otherwise required or permitted by applicable law.

Rivian may share your Candidate Personal Data with (i) internal personnel who have a need to know such information in order to perform their duties, including individuals on our People Team, Finance, Legal, and the team(s) with the position(s) for which you are applying; (ii) Rivian affiliates; and (iii) Rivian’s service providers, including providers of background checks, staffing services, and cloud services.

Rivian may transfer or store internationally your Candidate Personal Data, including to or in the United States, Canada, the United Kingdom, and the European Union and in the cloud, and this data may be subject to the laws and accessible to the courts, law enforcement and national security authorities of such jurisdictions. 

Please note that we are currently not accepting applications from third-party application services.

Vacancy posted 23 days ago