Distributed Machine Learning Engineer
$150k
Institute of Foundation Models
About the Institute of Foundation Models
We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to advance research, nurture the next generation of AI builders, and drive transformative contributions to a knowledge-driven economy.
As part of our team, you'll have the opportunity to work on the core of cutting-edge foundation model training, alongside world-class researchers, data scientists, and engineers, tackling the most fundamental and impactful challenges in AI development. You will participate in the development of groundbreaking AI solutions that have the potential to reshape entire industries. Strategic and innovative problem-solving skills will be instrumental in establishing MBZUAI as a global hub for high-performance computing in deep learning, driving impactful discoveries that inspire the next generation of AI pioneers.
The Role
The Distributed ML Engineer will work at the forefront of optimizing performance for machine learning software stacks, especially for training and inference, and will support the team in developing new, cutting-edge systems. The ideal candidate will have a strong background in parallel computing and hands-on experience in system-level coding, debugging methodologies, and large-scale machine learning.
Key Responsibilities
- Understand, analyze, profile, and optimize deep learning workloads on state-of-the-art hardware and software platforms, and provide guidance to the team on improving their efficiency at different levels of optimization
- Design and implement performance benchmarks and testing methodologies to evaluate application performance
- Build tools to automate workload analysis, workload optimization, and other critical workflows
- Triage system issues and identify bottlenecks and inefficiencies by analyzing the sources of issues and their impact on hardware and network, and propose solutions to enhance GPU utilization
- Support the team to develop appropriate kernels and systems for new model architectures and algorithms
- Participate in or lead design reviews with peers and stakeholders to decide among available technologies.
- Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency).
- Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback.
- Represent MBZUAI at industry conferences and events, showcasing the institution's cutting-edge HPC and deep learning capabilities and establishing MBZUAI as a global leader in AI research and innovation.
- Perform all other duties as reasonably directed by the line manager that are commensurate with these functional objectives.
Qualifications
- Ph.D. in CS, EE, or CSEE with 1+ years of working experience, OR
- Master's in CS, EE, or CSEE, or equivalent experience, with 2+ years of working experience
Vacancy posted 2 days ago


