Member of Technical Staff, Evals
$200k
Magic
Overview

Magic’s mission is to build safe AGI that accelerates humanity’s progress on the world’s most important problems. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than humans can alone. Our approach combines frontier-scale pre-training, domain-specific RL, ultra-long context, and inference-time compute to achieve this goal.

About The Role

Evals builds the internal platform that teams across Magic use to evaluate the performance of internal and external models. The team supports pre-training, post-training, data, inference, and product, and sits on the critical path of many of the company's most important decisions.

As a Member of Technical Staff on Evals, you will build both the platform and the evaluations themselves. You'll develop infrastructure for large-scale evaluations, data ablations, and dataset quality analysis, while designing and validating the methodologies used to measure model performance.

Sweating the details matters on this team. Many benchmarks, papers, and open-source evaluation frameworks contain subtle bugs or flawed assumptions that lead to misleading conclusions. We care deeply about correctness, reproducibility, and measurement quality.

Evals are essential to the success of the company. By building trustworthy evaluation systems, you will help Magic make better research decisions, build better datasets, and ship better products.

What You'll Work On

- Build and maintain the internal evals platform used across Magic
- Design, implement, and validate eval tasks for pre-training, post-training, reinforcement learning, inference, and product systems
- Develop infrastructure for running large-scale evaluations
- Build systems to measure dataset quality and identify opportunities to improve training data
- Improve evaluation correctness, reproducibility, and reliability
- Audit and improve upon public benchmarks, evaluation methodologies, and open-source implementations
- Partner with research, data, inference, and product teams to define metrics that accurately reflect model quality
- Build tooling and frameworks that enable teams across Magic to make decisions based on trustworthy measurements

What We're Looking For

- Experience building production systems, internal platforms, or developer infrastructure
- Experience working with machine learning systems, evaluation frameworks, data infrastructure, or research tooling
- Track record of owning technical projects end-to-end
- Skepticism toward results that cannot be reproduced, validated, or explained
- Ability to reason critically about benchmarks, metrics, and experimental methodology
- Experience designing, implementing, or operating systems that run at scale
- Comfortable navigating ambiguity and determining whether a measurement is actually capturing the behavior it claims to measure
- Excitement about helping researchers and engineers make better decisions through trustworthy measurements

Compensation, Benefits, And Perks (US)

- Annual salary range between $200K - $550K depending on experience
- Equity is a significant part of total compensation, in addition to salary
- 401(k) plan with 6% salary matching
- Generous health, dental, and vision insurance for you and your dependents
- Unlimited paid time off
- Visa sponsorship and relocation support for candidates moving to San Francisco
- A small, fast-moving, highly collaborative team working on frontier AI systems

Magic strives to be the place where high-potential individuals can do their best work. We value quick learning and grit just as much as skill and experience.

Our culture

- Integrity. Words and actions should be aligned
- Hands-on. At Magic, everyone is building
- Teamwork. We move as one team, not N individuals
- Focus. Safely deploy AGI. Everything else is noise
- Quality. Magic should feel like magic

Compensation Range: $200K - $550K

