Data Infrastructure Engineer GPU-Scale Datasets & APIs

$250k - $380k

Slope

Location: San Francisco
Employment Type: Full time
Department: Scaling
Compensation: $250K – $380K • Offers Equity

The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws.

In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

More details about our benefits are available to candidates during the hiring process.

This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.

About the Team

The Workload team is responsible for designing and running OpenAI’s LLM training and inference infrastructure that powers frontier models at massive scale. Our systems unify how researchers train and serve models, abstracting away the complexity of performance, parallelism, and execution across vast GPU/accelerator fleets. By providing this foundation, the Workload team ensures that researchers can focus on advancing model capabilities while we handle the scale, efficiency, and reliability required to bring those models to life.

About the Role

We are looking for an engineer to design and implement the dataset infrastructure that powers OpenAI’s next-generation training stack. You will be responsible for building standardized dataset interfaces, scaling pipelines across thousands of GPUs, and proactively testing for performance bottlenecks. You will collaborate closely with multimodal researchers and other infrastructure groups to ensure datasets are unified, efficient, and easy to consume.

In this role, you will:
  • Design and maintain standardized dataset APIs, including for multimodal (MM) data that cannot fit in memory.
  • Build proactive testing and scale validation pipelines for dataset loading at GPU scale.
  • Collaborate with teammates to integrate datasets seamlessly into training and inference pipelines, ensuring smooth adoption and a great user experience.
  • Document and maintain dataset interfaces so they are discoverable, consistent, and easy for other teams to adopt.
  • Establish safeguards and validation systems to ensure datasets remain reproducible and unchanged once standardized.
  • Debug and resolve performance bottlenecks in distributed dataset loading (e.g., straggler systems slowing global training).
  • Provide visualization and inspection tools to surface errors, bugs, or bottlenecks in datasets.

You might thrive in this role if you:
  • Have strong engineering fundamentals with experience in distributed systems, data pipelines, or infrastructure.
  • Have experience building APIs, modular code, and scalable abstractions, while recognizing that abstractions ultimately serve their users and that UX is an important part of abstraction design.
  • Are comfortable debugging bottlenecks across large fleets of machines.
  • Take pride in building infrastructure that “just works,” and find joy in being the guardian of reliability and scale.
  • Are collaborative, humble, and excited to own a foundational (if not glamorous) part of the ML stack.

Bonus points if you:
  • Have background knowledge in data math, probability, or distributed data theory.
  • Have worked with GPU-scale distributed systems or dataset scaling for real-time data.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act.

For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

Compensation Range: $250K - $380K

Vacancy posted 4 hours ago