AI Infrastructure Engineer
42dot
At 42dot, our AI Infrastructure Engineer manages the high-performance AI infrastructure orchestrating thousands of GPUs across multiple data centers. You will contribute to the scaling, monitoring, and operational optimization required to maintain a robust and world-class computing environment.
Responsibilities
- Operate and maintain a large-scale GPU cluster consisting of thousands of GPUs across multiple data centers using Kubernetes and Slurm.
- Monitor and diagnose failures across the GPU hardware and software stacks to ensure high availability and rapid recovery.
- Develop automation tools and scripts using Python or Shell to streamline repetitive infrastructure management tasks and improve operational efficiency.
- Manage GPU resource quotas and provide technical support to ML researchers to ensure optimal utilization of computing resources.
- Participate in the architectural design and performance tuning of distributed training environments for large-scale autonomous driving models.
Qualifications
- Strong proficiency in Linux operating systems, including a solid understanding of kernel operations, process management, and system security.
- Practical experience with containerization technologies (Docker) and orchestration (Kubernetes), including building, managing, and troubleshooting containerized environments.
- Solid understanding of networking fundamentals, including TCP/IP, and the ability to perform basic network troubleshooting.
- Ability to write clean and maintainable scripts in Python or Shell for automation and system administration.
- Logical approach to problem-solving with the persistence to identify and resolve root causes in complex, large-scale systems.
- Strong communication skills to effectively collaborate with cross-functional teams and external partners.
Interview Process
- Resume Screening → Coding Test → Virtual Interview (approximately 1 hour) → Onsite or Virtual Interview (approximately 3 hours) → Final Offer
- Please note that the interview process may vary depending on the position and is subject to change based on scheduling and other circumstances.
- Interview schedules and results will be communicated individually via the email address provided in your application.
Additional Information
- Please upload all required documents in PDF format.
- Veterans and applicants eligible for employment protection will receive preferential consideration in accordance with applicable laws and regulations.
- In compliance with the Act on Employment Promotion and Vocational Rehabilitation for Persons with Disabilities, registered individuals with disabilities will receive preferential consideration.
- 42dot does not accept unsolicited resumes from search firms. We will not pay any fees for resumes submitted without prior agreement.
- A 3-month probationary period may apply.
Vacancy posted 1 day ago

