Distributed Web Crawling Engineer for Large-Scale AI Data
Reflection
Reflection in San Francisco is seeking an engineer to build and operate web-scale systems for data collection. The role involves working closely with AI researchers to optimize crawling infrastructure and improve data quality. The ideal candidate will have a strong background in distributed systems and web data processing. Competitive compensation and benefits are offered.
Vacancy posted 14 hours ago

