Software Engineer, Data Acquisition

$293k - $385k

OpenAI

Overview:

The Data Acquisition team within the Foundations organization at OpenAI is responsible for all aspects of data collection to support our model training operations. Our team manages web crawling and GPTBot services and works closely with Data Processing, Architecture, and Scaling teams. We are looking for a skilled Software Engineer to join our Data Acquisition team.

Responsibilities:

  • Own and lead engineering projects in the area of data acquisition including web crawling, data ingestion, and search.

  • Collaborate with other sub-teams, such as Data Processing, Architecture, and Scaling, to ensure smooth data flow and system operability.

  • Work closely with the legal team to handle any compliance or data privacy-related matters.

  • Develop and deploy highly scalable distributed systems capable of handling petabytes of data.

  • Architect and implement algorithms for data indexing and search capabilities.

  • Build and maintain backend services for data storage, including work with key-value databases and synchronization.

  • Deploy solutions in a Kubernetes Infrastructure-as-Code environment and perform routine system checks.

  • Conduct and analyze experiments on data to provide insights into system performance.


Qualifications:

  • BS/MS/PhD in Computer Science or a related field.

  • 4+ years of industry experience in software development.

  • Experience with large web crawlers is a plus.

  • Strong expertise in large stateful distributed systems and data processing.

  • Proficiency in Kubernetes and Infrastructure-as-Code concepts.

  • Willingness and enthusiasm to try new approaches and technologies.

  • Ability to handle multiple tasks and adapt to changing priorities.

  • Strong communication skills, both written and verbal.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI's Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

Compensation Range: $293K - $385K

Vacancy posted 2 days ago
