Research Engineer, Data Infrastructure

Mistral AI

Job Description

About Mistral 

 

At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.

 

We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise as well as personal needs. Our offerings include Le Chat, La Plateforme, Mistral Code and Mistral Compute - a suite that brings frontier intelligence to end-users.

 

We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed between France, USA, UK, Germany and Singapore. We are creative, low-ego and team-spirited.

 

Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on

Role Summary 

 

This role focuses on building and operating the next generation of data infrastructure at Mistral AI. You will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems built for high performance.
You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research. You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs.

 

 

What you will do

 

Build & Scale: Help us reach our goal of operating massive distributed compute and storage systems.

Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions.
Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth.
Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes and SLURM based environments.
Metadata & Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity.
Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient.

 

About you

 

• Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering.
• Have experience or a strong interest in supporting foundational compute and storage platforms.
• Are proficient in Python and enjoy solving the "brittle data lake" problem with modern, columnar storage standards.
• Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments.
• Take pride in building and operating scalable, reliable, and secure systems from the ground up.
• Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment.

 

 

What we offer
  • 💰 Competitive salary and equity.
  • 🚑 Healthcare: Medical/Dental/Vision covered for you and your family.
  • 👴🏻 Pension: 401K (6% matching)
  • 🏝️ PTO: 18 days
  • 🚗 Transportation: Reimburse office parking charges, or $120/month for public transport
  • 🏀 Sport: $120/month reimbursement for gym membership
  • 🥕 Meal stipend: $400 monthly allowance for meals (solution might evolve as we grow bigger)
  • 🌎 Visa sponsorship
  • 🤝 Coaching: we offer BetterUp coaching on a voluntary basis

 

By applying, you agree to our Applicant Privacy Policy.

Vacancy posted 8 days ago