Scientific Evals
$160k - $300k
Edison Scientific
About
Edison Scientific builds and commercializes AI agents for science. Scientific discovery moves too slowly, and autonomous AI agents are how we intend to fix that. We're assembling a team of top researchers and engineers across AI and biology to build an AI scientist.
Role
We are seeking an ambitious, scientifically grounded person to join our team focused on developing rigorous benchmarks and training datasets that advance AI capabilities in biology. This role sits at the intersection of biology, data curation, and machine learning, and is ideal for someone with deep scientific training who is excited to shape how frontier AI systems learn to do science.
This role is on-site at our San Francisco office in the Dogpatch neighborhood. Our office is a converted warehouse with high ceilings, open space, and a team that genuinely believes in what they're building. This position is part of the Evals team.
Responsibilities
- Design benchmarks that capture the complexity of real biological research, drawing on your domain expertise to identify what makes scientific reasoning hard. This will include open-ended scientific benchmarks and building on prior work like LAB-Bench and BixBench.
- Curate and vet biological datasets to ensure scientific rigor.
- Analyze model outputs, identify failure modes, and contribute to iterative improvements in both datasets and evaluation criteria.
- Collaborate with AI/ML researchers to translate scientific intuition into training signal, helping AI systems learn not just facts but how scientists think.
- Coordinate operations and manage workflows, including working with domain experts, tracking task progress, and maintaining documentation.
Qualifications
- Graduate-level training in biology, biochemistry, computational biology, or a related field, with hands-on research experience.
- Working knowledge of machine learning concepts, particularly deep learning and large language models.
- Comfortable with Python and building workflows for data processing, analysis, and experimentation.
- Strong scientific taste and the ability to identify what distinguishes expert-level reasoning from surface-level pattern matching.
- Detail-oriented and willing to take on high-value but occasionally tedious work.
- Energized by ambiguous, open-ended problems that require creativity, collaboration, and first-principles thinking to solve.
- Organized and communicative, able to manage multiple workstreams and coordinate across teams.
Nice to have
- Prior experience creating evaluation datasets or annotation guidelines, or working on human-in-the-loop data pipelines.
- Experience with bioinformatics pipelines, biological databases, or sequence analysis tools.
- Hands-on experience fine-tuning or evaluating large language models, or familiarity with RLHF and preference-based training.
- Publications or research experience in areas relevant to AI for science.
Why join us?
- Competitive salary and equity
- Full healthcare coverage - we pay 100% of premiums for you and your dependents
- Support for growing families, including a yearly new parent stipend and fertility coverage through Carrot
- 401(k) company matching
- $300 health and wellness benefit
- Lunch is on us every day you're in the office, and dinner is on us when you're working late
- Regular team offsites and company events
- A fast-moving, mission-driven culture where smart people do their best work and actually enjoy doing it
Vacancy posted 8 hours ago


