Senior ML Systems Engineer, Frameworks & Tooling
Cohere
Who are we?
Cohere is the leading security-first enterprise AI company. We build cutting-edge foundation AI models and end-to-end products that are designed to solve real-world business problems. We're training and deploying frontier models for enterprises who are building AI systems. We believe that our work is instrumental to the widespread adoption of AI and we are looking for folks that want to be part of that.
We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers.
Cohere is a team of researchers, engineers, designers, and more, who are all passionate about their craft. We are a global technology company co-headquartered in Toronto and San Francisco, with key offices in London, New York City, Montreal, Seoul, Germany and Paris. Join us!
We're looking for a senior engineer to help build, maintain and evolve the training framework that powers our frontier-scale language models. This role sits at the intersection of large-scale training, distributed systems, and HPC infrastructure. You will design and maintain the core components that enable fast, reliable, and scalable model training - and build the tooling that connects research ideas to thousands of GPUs. If you enjoy working across the full stack of ML systems, this role gives you the opportunity and autonomy to have massive impact.
What You'll Work On
- Build and own the training framework responsible for large-scale LLM training.
- Design distributed training abstractions (data/tensor/pipeline parallelism, FSDP/ZeRO strategies, memory management, checkpointing).
- Improve training throughput and stability on multi-node clusters (e.g., GB200/300, AMD, H200/100).
- Develop and maintain tooling for monitoring, logging, debugging, and developer ergonomics.
- Collaborate closely with infra teams to ensure our cluster, container environments, and hardware configurations support high-performance training.
- Investigate and resolve performance bottlenecks across the ML systems stack.
- Build robust systems that ensure reproducible, debuggable, large-scale runs.
- Strong engineering experience in large-scale distributed training or HPC systems.
- Deep familiarity with JAX internals, distributed training libraries, or custom kernels/fused ops.
- Experience with multi-node cluster orchestration (Slurm, Ray, Kubernetes, or similar).
- Comfort debugging performance issues across CUDA/NCCL, networking, IO, and data pipelines.
- Experience working with containerized environments (Docker, Singularity/Apptainer).
- A track record of building tools that increase developer velocity for ML teams.
- Excellent judgment around trade-offs: performance vs complexity, research velocity vs maintainability.
- Strong collaboration skills - you'll work closely with infra, research, and deployment teams.
- Experience with training LLMs or other large transformer architectures.
- Contributions to ML frameworks (PyTorch, JAX, DeepSpeed, Megatron, xFormers, etc.).
- Familiarity with evaluation and serving frameworks (vLLM, TensorRT-LLM, custom KV caches).
- Experience with data pipeline optimization, sharded datasets, or caching strategies.
- Background in performance engineering, profiling, or low-level systems.
- You'll work on some of the most challenging and consequential ML systems problems today.
- You'll collaborate with a world-class team working fast and at scale.
- You'll have end-to-end ownership over critical components of the training stack.
- You'll shape the next generation of infrastructure for frontier-scale models.
- You'll build tools and systems that directly accelerate research and model quality.
- Build a high-performance data loading and caching pipeline.
- Implement performance profiling across the ML systems stack.
- Develop internal metrics and monitoring for training runs.
- Build reproducibility and regression testing infrastructure.
- Develop a performant fault-tolerant distributed checkpointing system.
- Cohere is remote-friendly. We have offices in Toronto, San Francisco, New York City, London, Paris, Montreal, and more coming soon.
- For those in the office: a daily lunch program, plenty of snacks, and regular community and social events.
- For those not near an office: a co-working benefit so you can work alongside others in your city.
If any of the above doesn't line up exactly with your experience, we still encourage you to apply.
We strive to create an inclusive work environment for all; we welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs. We may use AI-enabled tools to screen and assess applicants against the criteria for this position. This helps our recruiters identify potentially qualified candidates, but it doesn't limit the applications our recruiters may review or consider.
Vacancy posted 5 days ago




