Research Engineer, Data Infrastructure
Mistral AI
Job Description
About Mistral
At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.
We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise as well as personal needs. Our offerings include Le Chat, La Plateforme, Mistral Code and Mistral Compute - a suite that brings frontier intelligence to end-users.
We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed across France, the USA, the UK, Germany and Singapore. We are creative, low-ego and team-spirited.
Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on
Role Summary
This role focuses on building and operating the next generation of data infrastructure at Mistral AI. You will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems built for high performance.
You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research. You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs.
What you will do
• Build & Scale: Help us reach our goal of operating massive distributed compute and storage systems.
• Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions.
• Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth.
• Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes and SLURM based environments.
• Metadata & Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity.
• Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient.
About you
• Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering.
• Have experience or a strong interest in supporting foundational compute and storage platforms.
• Are proficient in Python and enjoy solving the "brittle data lake" problem with modern, columnar storage standards.
• Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments.
• Take pride in building and operating scalable, reliable, and secure systems from the ground up.
• Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment.
What we offer
- 💰 Competitive salary and equity.
- 🚑 Healthcare: Medical/Dental/Vision covered for you and your family.
- 👴🏻 Pension: 401K (6% matching)
- 🏝️ PTO: 18 days
- 🚗 Transportation: Reimburse office parking charges, or $120/month for public transport
- 🏀 Sport: $120/month reimbursement for gym membership
- 🥕 Meal stipend: $400 monthly allowance for meals (solution might evolve as we grow bigger)
- 🌎 Visa sponsorship
- 🤝 Coaching: we offer BetterUp coaching on a voluntary basis
By applying, you agree to our Applicant Privacy Policy.



