Member of Technical Staff - Research
Vals AI
Researcher And Research Engineer Role
We are looking for exceptional researchers and research engineers to design and build the next generation of AI benchmarks. You will create high-impact, challenging evaluations that push the boundaries of what we can measure in foundation models. This role is perfect for someone with deep research expertise who wants to see their work directly influence how the world evaluates AI systems.
You will lead the design and development of novel benchmarks that assess real-world capabilities of LLMs. Our benchmarks shape how foundation models are developed and how generative AI applications are built. We work with every major foundation model lab, along with leading financial institutions and the application-layer companies pushing the frontier forward. Our work has been featured by the Wall Street Journal, Washington Post, and Bloomberg.
We are building the standard for evaluating the ability of LLMs to perform real-world tasks. You will be at the forefront of defining what that standard looks like.
What You'll Do
- Design and develop novel, high-impact benchmarks that assess challenging real-world capabilities
- Conduct research to ensure our benchmarks are valid, reliable, and meaningful
- Collaborate with foundation model labs and enterprises to understand evaluation needs
- Analyze model performance across benchmarks and communicate findings
- Publish research findings and contribute to the broader evaluation research community
- Work closely with the infrastructure team to implement your benchmark designs at scale
- Stay current with the latest developments in LLM capabilities and evaluation methodologies
Requirements
Advanced research experience: Master's degree or PhD in Computer Science, NLP, Machine Learning, or related field. Undergrads with very strong research backgrounds may also be considered.
Publication track record: Published papers in reputable venues (NeurIPS, ICML, ACL, EMNLP, etc.) with focus on NLP, ML evaluation, or benchmarking
Research methodology: Strong understanding of experimental design, statistical analysis, and evaluation frameworks
Technical skills: Proficiency in Python for research and experimentation
Communication: Ability to clearly communicate complex research ideas to both technical and non-technical audiences
Collaboration: Experience working in research teams and integrating feedback
Portfolio: Demonstrated track record of impactful research work
Location: We are an in-person team based in San Francisco. We will support your relocation or transportation as needed.
Nice to Haves
Experience specifically in LLM evaluation or benchmarking research
Familiarity with foundation model architectures and capabilities
Experience working with industry partners or in applied research settings
Background in areas like human-computer interaction, psychology, or domain-specific evaluation
Experience at early-stage startups or research labs
Contributions to open-source evaluation tools or datasets
What We Offer
Highly competitive salary and meaningful ownership. Excellence is well rewarded.
Relocation and transportation support
Health/dental insurance coverage
Lunch and dinner provided, free snacks/coffee/drinks
Unlimited PTO
Opportunity to publish and present your work
About Us
Founding team: The core methodology behind this platform comes from NLP evaluation research we did at Stanford. We raised a $5M seed from some of the top institutional and angel investors in the valley. Our team has prior work experience at NVIDIA, Meta, Microsoft, Palantir and HRT. Collectively, our published work has over 300 citations.
Tech stack: Our frontend is built in React with TSX. We use Django as our backend framework. All of the infra is on AWS.
What We're Looking For
Intelligence over credentials: Raw talent and research ability are more important than where you got your degree. Academic pedigree is valuable only insofar as it is a proxy for research excellence.
Ownership: We don't have the scale or time to actively "manage" every project. Working in a small, talent-dense team, we expect everyone to show initiative to build where it's needed, not where it's asked. We strive for autonomy over consensus.
Intensity: The LLM landscape is constantly changing. Foundation model labs are continuously pushing the frontier, creating new capabilities that demand new evaluation approaches. The companies that will emerge as leaders are being built now. Those that win will have an incredibly high speed of execution.
Solution-oriented mindset: We're looking for researchers who see each evaluation challenge as an opportunity to design an innovative solution, not as an insurmountable problem.

