Research Crawling Engineer
$150k - $225k
Startup Talents
Job Description
The employer is a decentralized, Solana-based web-scraping network that allows users to monetize their unused internet bandwidth. By installing a browser extension, users securely share bandwidth to help AI companies crawl the web for public data, receiving Points (convertible to crypto tokens) as compensation. They also operate a massive distributed crawler, giving them unique access to high-quality public web data at global scale. They are hiring a Research Crawling Engineer (Full remote - USA/EU, 6-hour overlap with EST).
You will join a company at the forefront of developing a web-scale crawler and knowledge graph that improves access to public web data and extends the value of AI to the people. As a Research Crawling Engineer, you will design and operate large-scale web data acquisition systems for research and model development. Your work will span distributed systems, scraping infrastructure, and data pipelines.
This Role Involves:
- Operating at the boundary of scale and reliability
- Adapting to constantly changing web environments
- Balancing throughput, coverage, and data quality
- Owning end-to-end data acquisition pipelines
MISSIONS
- Design high-throughput, fault-tolerant systems for data collection (millions to billions of URLs/day)
- Handle anti-bot systems, rate limits, and dynamic/JS-heavy sites
- Develop pipelines for cleaning, deduplication, filtering, and normalisation
- Construct and maintain datasets for research and model training
- Monitor crawl performance, coverage, and data quality; iterate quickly
- Collaborate with research teams to align data collection with modeling needs
- Optimize infrastructure for cost, latency, and reliability
- Build a distributed crawler for a continuously updated, high-quality web project
- Design a system to classify and filter billions of pages for pretraining
- Extract structured data from dynamic, JS-heavy sites at scale
- Improve deduplication and quality scoring across multimodal datasets
Requirements
- Strong programming experience in one or more of: Go, Rust, Python, Java, or C++
- Experience working for reputable companies
- Experience building and maintaining large-scale web crawlers or large-scale data pipelines
- Experience designing high-throughput, fault-tolerant systems for data collection (millions to billions of URLs/day)
- Experience handling anti-bot systems, rate limits, and dynamic/JS-heavy sites
- Experience constructing and maintaining datasets for research and model training
- Solid understanding of networking and browser behavior
- Familiarity with distributed systems and parallel processing
- Experience working with large datasets (TB-PB scale preferred)
- Ability to debug unstable or adversarial environments
- Experience with NLP pipelines or dataset curation for ML
- Familiarity with LLM pretraining data or retrieval systems
- Experience with headless browsers (e.g., Chrome DevTools Protocol, Playwright, Puppeteer)
- Knowledge of proxy systems, IP rotation, and large-scale request orchestration
- Background in data quality evaluation or benchmarking
- Experience running workloads on cloud or bare-metal infrastructure
- Ability to design systems that scale without degrading quality
- Practical problem-solving under real-world constraints
- Speed of iteration and ownership
- Measurable improvements in data coverage, quality, or efficiency
- Contract: Permanent role (Full remote - USA or 6-hour overlap with EST).
- Salary: $150k to $225k, based on experience and demonstrated ability to operate at scale, plus equity package / tokens
- Recruiter / HR Call
- Technical Interview
- CEO Interview
- Final Interview
Vacancy posted 3 days ago

