Staff Research Engineer, Model Efficiency
Cohere
Who are we?

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

Why this role?

Large Language Models (LLMs) continue to push the boundaries of what AI systems can do, but inference is still the bottleneck. The Model Efficiency team is responsible for pushing the limits of LLM inference efficiency across our foundation models. We explore and ship breakthroughs across the model execution stack, including:

- model architecture and MoE routing optimization
- decoding and inference-time algorithm improvements
- software/hardware co-design for GPU acceleration
- performance optimization without compromising model quality

We have offices in Toronto, Montreal, San Francisco, New York, Paris, Seoul and London. We embrace a remote-friendly environment, and as part of this approach, we strategically distribute teams based on interests, expertise, and time zones to promote collaboration and flexibility. You'll find the Model Efficiency team concentrated in the EST and PST time zones; these are our preferred locations.

As a Staff Research Engineer, you will develop, prototype, and deploy techniques that materially improve how fast and efficiently our models run in production.
You may be a good fit for the Model Efficiency team if you:

- Have a PhD in Machine Learning or a related field
- Understand LLM architecture and how to optimize LLM inference given resource constraints
- Have significant experience with one or more techniques that enhance model efficiency
- Have strong software engineering skills
- Have an appetite to work in a fast-paced, high-ambiguity start-up environment
- Have publications at top-tier conferences and venues (ICLR, ACL, NeurIPS)
- Have a passion for mentoring others

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply!

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Full-Time Employees At Cohere Enjoy These Perks

- An open and inclusive culture and work environment
- Work closely with a team on the cutting edge of AI research
- Weekly lunch stipend, in-office lunches & snacks
- Full health and dental benefits, including a separate budget to take care of your mental health
- 100% Parental Leave top-up for up to 6 months
- Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
- Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
- ✈️ 6 weeks of vacation (30 working days!)


