Staff Software Engineer, ML Training and Inference Infrastructure
$228k - $285k
Rivian
Rivian is on a mission to keep the world adventurous forever. This goes for the emissions-free Electric Adventure Vehicles we build, and the curious, courageous souls we seek to attract.
As a company, we constantly challenge what’s possible, never simply accepting what has always been done. We reframe old problems, seek new solutions and operate comfortably in areas that are unknown. Our backgrounds are diverse, but our team shares a love of the outdoors and a desire to protect it for future generations.
Role Summary
As a Staff Software Engineer, ML Training and Inference Infrastructure, you will be a member of the Perception team at Rivian, which develops advanced machine learning algorithms that directly impact safety-critical self-driving features of our category-defining vehicles.
We are looking for candidates with deep knowledge of, and strong enthusiasm for, establishing state-of-the-art ML infrastructure for training and inference of large autonomous driving models, and for optimizing training and inference performance.
Responsibilities
- Optimize the performance of deep learning training workloads on NVIDIA GPU systems at large scale
- Optimize the latency of model inference and model pre- and post-processing on onboard systems
- Design, train, and deploy large deep learning models that can leverage the vast amount of labeled and unlabeled data
Qualifications
- PhD in CS/CE/EE, or equivalent industry experience
- Deep knowledge of PyTorch
- Knowledge of model training frameworks (e.g., PyTorch Lightning, Ray)
- In-depth knowledge of transformer architecture and ways to accelerate the training and inference of transformer models
- Experience performing large-scale distributed training of models
- A track record of profiling models and doing detective work to improve model training and inference speed
Preferred Skill Requirements
- Experience with CUDA or Triton language for writing custom ops
- Knowledge of NVIDIA TensorRT
- Knowledge of NCCL
- Experience with edge computing systems
- A track record of efficiently solving complex problems collaboratively on larger teams
Pay Disclosure
Salary Range for California Based Applicants: $228,000.00 - $285,000.00 (actual compensation will be determined based on experience, location, and other factors permitted by law).
Benefits Summary: Rivian provides robust medical/Rx, dental, and vision insurance packages for full-time employees, their spouses or domestic partners, and children up to age 26. Coverage is effective on the first day of employment.
Equal Opportunity
Rivian is an equal opportunity employer and complies with all applicable federal, state, and local fair employment practices laws. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, sex, sexual orientation, gender, gender expression, gender identity, genetic information or characteristics, physical or mental disability, marital/domestic partner status, age, military/veteran status, medical condition, or any other characteristic protected by law.
Rivian is committed to ensuring that our hiring process is accessible for persons with disabilities. If you have a disability or limitation, such as those covered by the Americans with Disabilities Act, that requires accommodations to assist you in the search and application process, please email us at the address listed on ev.careers.
Candidate Data Privacy
Rivian may collect, use and disclose your personal information or personal data (within the meaning of the applicable data protection laws) when you apply for employment and/or participate in our recruitment processes (“Candidate Personal Data”). This data includes contact, demographic, communications, educational, professional, employment, social media/website, network/device, recruiting system usage/interaction, security and preference information. Rivian may use your Candidate Personal Data for the purposes of (i) tracking interactions with our recruiting system; (ii) carrying out, analyzing and improving our application and recruitment process, including assessing you and your application and conducting employment, background and reference checks; (iii) establishing an employment relationship or entering into an employment contract with you; (iv) complying with our legal, regulatory and corporate governance obligations; (v) recordkeeping; (vi) ensuring network and information security and preventing fraud; and (vii) as otherwise required or permitted by applicable law.
Rivian may share your Candidate Personal Data with (i) internal personnel who have a need to know such information in order to perform their duties, including individuals on our People Team, Finance, Legal, and the team(s) with the position(s) for which you are applying; (ii) Rivian affiliates; and (iii) Rivian’s service providers, including providers of background checks, staffing services, and cloud services.
Rivian may transfer or store internationally your Candidate Personal Data, including to or in the United States, Canada, the United Kingdom, and the European Union and in the cloud, and this data may be subject to the laws and accessible to the courts, law enforcement and national security authorities of such jurisdictions.
Please note that we are currently not accepting applications from third-party application services.

