Research Crawling Engineer

Wynd Labs

As a Research Crawling Engineer, you will design and operate large-scale web data acquisition systems for research and model development. Your work will span distributed systems, scraping infrastructure, and data pipelines.

Responsibilities:

  • Build and maintain large-scale web crawlers across diverse domains
  • Design high-throughput, fault-tolerant systems for data collection (millions to billions of URLs/day)
  • Handle anti-bot systems, rate limits, and dynamic/JS-heavy sites
  • Develop pipelines for cleaning, deduplication, filtering, and normalization
  • Construct and maintain datasets for research and model training
  • Monitor crawl performance, coverage, and data quality; iterate quickly
  • Collaborate with research teams to align data collection with modeling needs
  • Optimize infrastructure for cost, latency, and reliability

Requirements:

  • Strong programming experience in one or more of: Go, Rust, Python, Java, or C++
  • Experience building web crawlers or large-scale data pipelines
  • Solid understanding of networking and browser behavior
  • Familiarity with distributed systems and parallel processing
  • Experience working with large datasets (TB–PB scale preferred)
  • Ability to debug unstable or adversarial environments

Preferred / Bonus:

  • Experience with NLP pipelines or dataset curation for ML
  • Familiarity with LLM pretraining data or retrieval systems
  • Experience with headless browsers (e.g., Chrome DevTools Protocol, Playwright, Puppeteer)
  • Knowledge of proxy systems, IP rotation, and large-scale request orchestration
  • Background in data quality evaluation or benchmarking
  • Experience running workloads on cloud or bare-metal infrastructure

What This Role Involves:

  • Operating at the boundary of scale and reliability
  • Adapting to constantly changing web environments
  • Balancing throughput, coverage, and data quality
  • Owning end-to-end data acquisition pipelines

Evaluation Criteria:

  • Ability to design systems that scale without degrading quality
  • Practical problem-solving under real-world constraints
  • Speed of iteration and ownership
  • Measurable improvements in data coverage, quality, or efficiency

Compensation: Based on experience and demonstrated ability to operate at scale

Example Projects:

  • Build a distributed crawler for a continuously updated, high-quality web project
  • Design a system to classify and filter billions of pages for pretraining
  • Extract structured data from dynamic, JS-heavy sites at scale
  • Improve deduplication and quality scoring across multimodal datasets

Why Work With Us:

  • Opportunity. We are at the forefront of developing a web-scale crawler and knowledge graph that improves access to public web data and extends the value of AI to the people.
  • Culture. We're a lean team with a high bar. We come to work not to be comfortable, but to find out what we're capable of and to do work that matters. We're not calling for people who keep things moving. We're calling for people who make everyone around them better. We prioritize low ego and high output. This is a fully remote team.
  • Compensation. You'll receive a competitive salary, benefits and equity package.

Vacancy posted 12 hours ago
Similar jobs that could be interesting for you
Based on the Research Crawling Engineer in United States vacancy
  • $150k - $225k

     ...Research Crawling Engineer The employer is a decentralized, Solana-based web-scraping network that allows users to monetize their unused internet bandwidth. By installing a browser extension, users securely share bandwidth to help AI companies crawl the web for public... 
    Suggested
    Permanent employment
    Contract work
    Remote work

    Startup Talents

    United States
    2 days ago
  • $100k - $130k

     ...global scale. Additionally, the team has engineered sophisticated pipelines for the...  ...facilitating dataset creation for frontier research labs. The organization operates as a...  ...speed and direct execution. As a Research Crawling Engineer , the successful candidate will... 
    Suggested
    Full time
    Remote work

    MLabs

    United States
    3 days ago
  • $100k - $130k

     ...Job Description Job Description Career Renew is recruiting for one of its clients a Research Crawling Engineer - this is a fully remote role and candidates can be based anywhere, as long as there is a 6 hours overlap with EST hours. Salary range: 100-130K USD yearly... 
    Suggested
    Remote work

    Career Renew

    Miami, FL
    13 days ago
  •  ...Research Engineer, Foundation Models About the Opportunity We are seeking a Research Engineer to help advance the next generation...  ...Data Engineering, Data Pipelines, ETL, Data Processing, Web Crawling, Data Collection, Feature Engineering, MLOps, ML Systems, Scalable... 
    Suggested
    Visa sponsorship
    Relocation package
    Flexible hours

    Acceler8 Talent

    Sonoma, CA
    11 hours ago
  • $160k - $240k

     ...Research Engineer - Evals You'll build the evaluation systems that tell us whether Firecrawl actually works. That sounds simple. It isn...  ...whether Firecrawl's outputs are actually good - across scrape, crawl, extract, and map. That means defining metrics, building... 
    Suggested
    Full time
    Temporary work
    Remote work

    Firecrawl

    United States
    1 day ago
  • $180k - $290k

     ...Research Engineer - Search/IR Research Engineer (Focused on Search/IR) You'll own the search and information retrieval systems at...  .... You'll build systems that keep our index fresh without re-crawling everything, deduplicate content intelligently, and handle incremental... 
    Full time
    Temporary work
    Remote work

    Firecrawl

    United States
    2 days ago
  •  ...Research Engineer I, II, III, or Senior Job no: 510665 Position type: Full-Time 12-Month Department: 193603 - Instit...  ...walking, climbing or balancing, stooping/kneeling/crouching/crawling and lifting up to 50 pounds. Vision requirements: Ability... 
    Full time
    Work at office
    Local area
    Immediate start

    Mississippi State University

    Vicksburg, MS
    1 day ago
  •  ...Research Engineer The Research Engineer is an entry-level position in the field of research and development. They work under the guidance...  ...Occasionally required to stand, walk and stoop, kneel, crouch, or crawl. Must frequently lift and/or move up to 10 pounds and... 
    Full time
    Contract work
    Temporary work
    For contractors
    Work at office
    Immediate start

    CHICKASAW NATION INDUSTRIES INC

    Kinsey, AL
    12 hours ago
  • Exa is building a search engine from scratch to serve every AI application. We build massive‑scale infrastructure to crawl the web, train state‑of‑the‑art embedding models to index...  ...of thousands of machines. Generalist Research Engineers work across our search and retrieval... 
    H1b

    Exa

    San Francisco, CA
    1 day ago
  •  ...built with, by, and for AIs. Our work spans innovations across crawling, indexing, ranking, retrieval, and reasoning systems. Our...  ...conviction bets - Try and fail. But succeed an unfair amount. Job Research engineer. You will enhance our core research product to train and... 

    Parallel Web Systems

    Palo Alto, CA
    12 hours ago
  • $150k

     ...of Foundation Models We are a dedicated research lab for building, understanding, using,...  ...class researchers, data scientists, and engineers, tackling the most fundamental and impactful...  ...experience with web scraping and crawling frameworks (e.g., scrapy, selenium, playwright... 
    Visa sponsorship

    Institute of Foundation Models

    Sunnyvale, CA
    2 days ago
  • A technology company is seeking a Generalist Research Engineer in San Francisco to enhance their search engine capabilities. This role offers...  ...performance, enhancing ML models, and optimizing web crawling speed. Candidates must have strong attention to detail and a... 

    Exa

    San Francisco, CA
    3 days ago
  • $98.4k - $164k

     ...Job Description Summary As an FPGA Design Research Engineer, you will have the opportunity to architect and develop state-of-the-art embedded systems for real-time controls and industrial communication applications. You will lead and contribute to R&D projects that... 
    Full time
    Contract work
    Work at office
    Work visa
    Relocation package

    GE Vernova

    Niskayuna, NY
    1 day ago
  • $89.3k - $148.7k

     ...Job Description Summary As a Power Electronics Research Engineer in Vernova R&D organization, you will be involved in the development of new power electronics concepts. Your work will include power electronics system modeling and simulation, design and testing of power... 
    Full time
    Contract work
    Work at office
    Relocation package

    GE Vernova

    Niskayuna, NY
    3 days ago
  • $80k - $90k

     ...the general direction of the Director of Engineering, the Innovations Engineer is responsible...  ...& Technology Scouting (Primary Focus) Researches and evaluates new technologies, materials...  ...required to stoop, kneel, crouch, or crawl. The employee must regularly lift and/or... 

    Lapp USA, Inc.

    Florham Park, NJ
    2 hours ago
  • RESEARCH ENGINEER - SR. RESEARCH ENGINEER - Computational Thermofluid Engineer 18-01568 Who We Are: The Propulsion & Energy Machinery Section performs engineering R&D in the fields of gas turbine combustion, air-breathing propulsion, industrial heat and power, and liquid... 

    Southwest Research Institute

    San Antonio, TX
    1 day ago
  •  ...Description: Established nearly two centuries ago, FM is a leading mutual insurance company whose capital, scientific research capability and engineering expertise are solely dedicated to property risk management and the resilience of its policyholder-owners. These... 
    Full time
    Flexible hours

    FM

    Norwood, MA
    12 hours ago
  • $150k - $250k

     ...position is listed on behalf of a partner company, who manages all applications and next steps. Our partner is looking for a Research Engineer based in the United States. This role sits at the intersection of applied research and production-grade machine learning engineering... 
    Remote job
    Full time
    Work at office
    Home office

    jobgether

    United States
    4 days ago
  • Senior Research Engineer, Training Data Infrastructure in Foundation Models Cupertino, California, United States - Software and Services Our...  ...text and multimodal data from diverse sources, including web crawls and third-party partnerships. Repository Optimization:... 

    Apple Inc.

    Cupertino, CA
    2 days ago
  •  ...This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Research Scientist / Research Engineer in United States. This role sits at the intersection of cutting-edge machine learning research and real-world product impact,... 
    Remote job
    Full time
    Flexible hours

    jobgether

    United States
    6 days ago
  •  ...Internship Opportunity We are seeking Electrical Engineering, Computer Science, and Computer Engineering students for an internship in the Strategic Systems Operations Division of the Applied Research Laboratory (ARL) at Penn State. This position will be onsite at State... 
    Part time
    Internship
    Relocation

    Penn State University

    Reston, VA
    12 hours ago
  • $89.3k

     ...Directorates within the Lab, focused on a specific area of scientific research or other function, with its own leadership team and dedicated...  ...for building applications. BS&DG is seeking a Research Engineer II - AI for Building Energy Systems . The successful candidate... 
    For contractors
    Work at office
    Local area
    Relocation package
    Flexible hours

    Pacific Northwest National Laboratory

    Richland, WA
    4 days ago
  • $158k - $269k

     ...Research Engineer In Calibration Waabi, founded by AI visionary Raquel Urtasun, is the leader in Physical AI. With a world-class team, we're unlocking the next era of autonomous transportation with technology that's powering commercial autonomous trucks and robotaxis... 
    Full time
    Work at office
    Remote work
    Work from home
    Flexible hours

    Waabi

    United States
    12 hours ago
  • $350k

     ...want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. About the role The... 
    Work at office
    Remote work
    Visa sponsorship
    Flexible hours

    Anthropic

    United States
    4 hours ago
  •  ...want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. About the role The RL... 
    Work at office
    Remote work
    Visa sponsorship
    Flexible hours

    Anthropic

    United States
    4 hours ago
  •  ...will move faster than anyone else. They will attract the world's most capable talent. They will be on the forefront of applied research, engineering, infrastructure and deployment at scale. They will continue to scale their training to larger & more capable models. They... 
    Remote work
    Home office
    Flexible hours

    Poolside

    United States
    12 hours ago
  •  ...Research Engineer II The Alaka`ina Foundation Family of Companies (FOCs) has a potential need for a Research Engineer II to provide research support administration services for our government customer in San Antonio, TX. Description of Responsibilities: Resolve... 
    For contractors

    Alakaina Family of Companies

    San Antonio, TX
    12 hours ago
  •  ...Research Engineer III – Ai For Building Energy Systems The Electricity Infrastructure and Buildings Division, part of the Energy and Environment Directorate, is accelerating the transition to an efficient, resilient, and secure energy system through basic and applied... 
    Work experience placement

    PNNL

    Richland, WA
    12 hours ago
  •  ...SSAB range from machine operators and sales people to advanced engineers and corporate professionals in HR, finance and more. SSAB...  ...practices. The incumbent will work in the SSAB state-of-the-art Research and Development Facility located in Muscatine, Iowa. The position... 
    Full time
    Work at office
    Monday to Friday
    Flexible hours
    Day shift

    SSAB

    Montpelier, IA
    3 days ago
  •  ...Job Title Research Engineer IV Agency Texas A&M Engineering Department Materials Science & Engineering Proposed Minimum Salary Commensurate Job Location College Station, Texas Job Type Staff Job Description Why work... 
    Flexible hours

    The Texas A&M University System

    College Station, TX
    3 days ago
