Member of Technical Staff - Evaluations
Reflection AI, Inc
Our Mission
Reflection's mission is to build open superintelligence and make it accessible to all. We're developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic, and beyond.
About the Role
- Conduct critical comparative analysis to advance our understanding of model capabilities.
- Build and refine evaluation systems and processes that create tight feedback loops between data, evals, and model behavior.
- Develop generalizable evaluation frameworks that capture what matters for reasoning, alignment, and usefulness.
- Collaborate closely with pre-training, post-training, and applied teams to translate insights into model improvements.
- Push the boundaries of what's measurable, from synthetic evals to human feedback and real-world interaction data.
About You
- Strong statistical analysis and experimental design skills to rigorously measure model improvements.
- Familiarity with LLM evaluation methodologies: static benchmarks, human preference evals, and/or agentic tasks.
- High agency; you thrive in a fast-paced startup environment and have a bias for impact over process.
- Excited to work in a new frontier lab, defining how we measure and accelerate progress toward more capable models.
- Collaborative, detail-oriented, and motivated by building the feedback loops that make models truly improve.
What We Offer
- Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.
- Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.
- Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.
- Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.
- Opportunities to connect with teammates: lunch and dinner are provided daily. We have regular off-sites and team celebrations.
Vacancy posted 4 days ago

