Software Engineer, Data Infrastructure & Acquisition
Jobgether
This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Software Engineer, Data Infrastructure & Acquisition, based in South Africa.
This role sits at the intersection of software engineering, data infrastructure, and applied AI, focusing on building and scaling the systems that power large-scale dataset acquisition for next-generation machine learning models. You will work in a fully distributed environment alongside engineers, researchers, and product leaders to design robust ingestion pipelines capable of handling massive, high-quality audio and text datasets. The work directly impacts how data is collected, processed, and transformed into training-ready assets that fuel AI innovation. You'll contribute to improving the cost, scale, and efficiency of data systems while helping define the roadmap for dataset development. The environment is fast-moving, highly collaborative, and deeply technical, with strong ownership and autonomy. This is a chance to shape foundational infrastructure used by millions of users globally.
Accountabilities
You will be responsible for building, maintaining, and scaling large-scale data ingestion and acquisition systems that support AI model training and product development. You will design and extend cloud-based infrastructure, optimize data pipelines, and ensure efficient processing of high-volume datasets across distributed systems. You will collaborate closely with AI scientists and engineering teams to improve data quality, reduce cost, and increase throughput for training workflows. You will also identify and integrate new external data sources, including audio and web-based datasets, into production pipelines. Additionally, you will help define dataset strategy and contribute to architectural decisions that support long-term scalability and reliability of infrastructure systems.
- Build and maintain scalable data ingestion and processing pipelines
- Extend cloud infrastructure (GCP) using Infrastructure-as-Code tools
- Identify and integrate new data sources into acquisition systems
- Collaborate with research and AI teams to improve dataset quality and efficiency
- Optimize systems for cost, throughput, and reliability at scale
- Contribute to architecture and roadmap decisions for data infrastructure
Requirements
The ideal candidate brings strong software engineering experience with a focus on distributed systems, data infrastructure, or backend engineering in production environments. You should have hands-on experience with Python and Linux-based development workflows, along with strong familiarity with cloud platforms such as GCP and infrastructure-as-code tools like Terraform. Experience with Docker, large-scale data pipelines, or web crawling systems is highly valuable. You are comfortable working in fast-paced, ambiguous environments and can manage multiple priorities effectively. Strong communication skills and the ability to collaborate across technical and research-driven teams are essential. A background in Computer Science or a related technical field is expected, along with a proven ability to build reliable and scalable systems.
- 5+ years of software engineering experience
- Strong proficiency in Python and in Linux environments, including bash scripting
- Experience with GCP and Infrastructure-as-Code (Terraform preferred)
- Hands-on experience with Docker and cloud-native development
- Exposure to large-scale data pipelines or web crawling systems (preferred)
- Strong problem-solving and system design skills
- Excellent communication and cross-functional collaboration abilities
- Degree in Computer Science or related technical field (BS/MS/PhD)
Benefits
- Competitive base salary with bonus and equity opportunities
- Fully remote, distributed-first work environment
- High-impact role working on AI systems used at global scale
- Opportunity to shape foundational data infrastructure for ML models
- Collaborative, engineering-driven culture with strong autonomy
- Access to cutting-edge AI and data engineering technologies
- Fast-paced environment with ownership over meaningful technical problems
- Work on a product that improves accessibility and learning experiences worldwide