Software Engineer, Data Infrastructure & Acquisition

Jobgether

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Software Engineer, Data Infrastructure & Acquisition, based in South Africa.

This role sits at the intersection of software engineering, data infrastructure, and applied AI, focusing on building and scaling the systems that power large-scale dataset acquisition for next-generation machine learning models. You will work in a fully distributed environment alongside engineers, researchers, and product leaders to design robust ingestion pipelines capable of handling massive, high-quality audio and text datasets. The work directly impacts how data is collected, processed, and transformed into training-ready assets that fuel AI innovation. You'll contribute to improving the cost, scale, and efficiency of data systems while helping define the roadmap for dataset development. The environment is fast-moving, highly collaborative, and deeply technical, with strong ownership and autonomy. This is a chance to shape foundational infrastructure used by millions of users globally.

Accountabilities

You will be responsible for building, maintaining, and scaling large-scale data ingestion and acquisition systems that support AI model training and product development. You will design and extend cloud-based infrastructure, optimize data pipelines, and ensure efficient processing of high-volume datasets across distributed systems. You will collaborate closely with AI scientists and engineering teams to improve data quality, reduce cost, and increase throughput for training workflows. You will also identify and integrate new external data sources, including audio and web-based datasets, into production pipelines. Additionally, you will help define dataset strategy and contribute to architectural decisions that support long-term scalability and reliability of infrastructure systems.

  • Build and maintain scalable data ingestion and processing pipelines
  • Extend cloud infrastructure (GCP) using Infrastructure-as-Code tools
  • Identify and integrate new data sources into acquisition systems
  • Collaborate with research and AI teams to improve dataset quality and efficiency
  • Optimize systems for cost, throughput, and reliability at scale
  • Contribute to architecture and roadmap decisions for data infrastructure

Requirements

The ideal candidate brings strong software engineering experience with a focus on distributed systems, data infrastructure, or backend engineering in production environments. You should have hands-on experience with Python and Linux-based development workflows, along with strong familiarity with cloud platforms such as GCP and infrastructure-as-code tools like Terraform. Experience with Docker, large-scale data pipelines, or web crawling systems is highly valuable. You are comfortable working in fast-paced, ambiguous environments and can manage multiple priorities effectively. Strong communication skills and the ability to collaborate across technical and research-driven teams are essential. A background in Computer Science or a related technical field is expected, along with a proven ability to build reliable and scalable systems.

  • 5+ years of software engineering experience
  • Strong proficiency in Python and Linux environments (bash scripting)
  • Experience with GCP and Infrastructure-as-Code (Terraform preferred)
  • Hands-on experience with Docker and cloud-native development
  • Exposure to large-scale data pipelines or web crawling systems (preferred)
  • Strong problem-solving and system design skills
  • Excellent communication and cross-functional collaboration abilities
  • Degree in Computer Science or related technical field (BS/MS/PhD)

Benefits

  • Competitive base salary with bonus and equity opportunities
  • Fully remote, distributed-first work environment
  • High-impact role working on AI systems used at global scale
  • Opportunity to shape foundational data infrastructure for ML models
  • Collaborative, engineering-driven culture with strong autonomy
  • Access to cutting-edge AI and data engineering technologies
  • Fast-paced environment with ownership over meaningful technical problems
  • Work on a product that improves accessibility and learning experiences worldwide

Vacancy posted 2 days ago