AI Infrastructure Engineer

vCluster

AI Infrastructure Specialist

As vCluster's AI Infrastructure Specialist, you will work directly with customers at the earliest and most critical stage of their journey: from bare metal GPU nodes through to a production-ready deployment. This is not a traditional professional services role; you operate pre-sale as part of a proof-of-value engagement scoped to reach production. You will be one of the first team members a neocloud or AI Factory engages with at technical depth, and the playbooks you develop will scale the motion for the next hire and customer.

vCluster is gaining rapid traction with GPU AI Clouds and enterprises building AI Factories: organizations that need to offer Kubernetes as a managed service on bare metal GPU infrastructure, and need to do it fast. This role exists to make that happen.

Role Responsibilities

Lead Technical Deployments: Drive end-to-end technical deployments for GPU neocloud and AI Factory customers, from initial bare metal configuration to a validated vCluster environment.
Infrastructure Optimization: Configure and troubleshoot bare metal GPU node infrastructure, including CNI configuration, GPU Operator setup, distributed storage backends, and RDMA/InfiniBand.
Validation: Deploy and validate Kubernetes and vCluster to provide GPU-powered managed K8s.
Knowledge Transfer: Work alongside customer teams to build self-sufficiency, ensuring they can operate and grow the platform independently.
Scaling through Documentation: Document reusable playbooks and deployment architectures so your learnings become the next customer's head start.
Feedback Loop: Collaborate with Engineering and Product to surface recurring infrastructure challenges, acting as a direct feedback loop from the field into the roadmap.
Strategic Partnering: Join Sales in the pre-sales process where deep infrastructure work is required to achieve a meaningful proof of value.

This role could be a fit for you if you bring:

Production K8s Mastery: 5+ years of experience deploying and operating Kubernetes in production, ideally on bare metal or in high-complexity environments.
GPU Fluency: Practical knowledge of the NVIDIA GPU Operator, CUDA tooling, and systems-level configuration for GPU nodes.
Networking Fundamentals: Deep understanding of CNI plugins, overlay networks, load balancing, and connectivity diagnosis in layered environments.
Storage Expertise: Experience with persistent volume configuration, CSI drivers, and distributed storage systems like Ceph, Rook, Weka, or Longhorn.
Operational Agility: Comfort operating in ambiguous, fast-moving environments where you are often writing the playbook in real time.
Modern Tech Mindset: You thrive in environments that reject legacy tech and prefer a modern stack where you can solve a variety of problems, from pipelines to internal services.

Bonus points for:

Automation Skills: Experience writing automation scripts with Bash, Python, or Go.
Kubernetes Depth: Relevant certifications such as CKA (Certified Kubernetes Administrator) or experience writing Kubernetes Operators.
AI/ML Familiarity: Experience with inference serving, GPU scheduling, and the tooling around LLM deployment.
Documentation: Experience building AI automation into documentation workflows to contribute to a shared knowledge base.

About vCluster Labs

We are a venture-backed tech startup striving to be the leading force in enabling platform engineers. We have raised more than $30M from top-tier VCs such as Khosla Ventures (first investor in OpenAI, GitLab, Stripe, DoorDash) and are in a hyper-growth phase, looking for motivated people to complement our team. Our headquarters are in San Francisco (Salesforce Tower), but our team is distributed around the globe and we have a remote-first work culture.

We're the company behind vCluster, an open-source technology for virtualizing Kubernetes (10k+ GitHub stars). Open source is part of our DNA.

The adoption of our commercial product based on vCluster has grown extremely fast (multi-million dollar revenue) and our customer base includes some of the biggest companies in the world, including 6 Global Fortune 500 companies as well as some of the fastest-growing tech unicorns.

Benefits

Competitive Salary: We offer a competitive compensation package, including equity.
Platinum-Level Insurance: Health, dental, vision, and life insurance, including plans for you and eligible dependents (benefits vary depending on country).
Flexible Working Schedule: Have a doctor's appointment or need to head to the supermarket to get groceries at 2pm? We won't have an issue with that. To us, results matter more than clocking in and out at the same time every day.
Workplace Flexibility: We're very flexible about where you work. We know things can change in life and we're happy to adjust the work environment for you along the way.

Culture & Values

Open Source, Open Mind: We actively contribute to and maintain open-source projects. Internally, we foster meritocracy: the strongest ideas win, no matter who or where they come from.
Build Tomorrow's Standards, Intentionally: We don't just ship software; we define the state of the art of tomorrow. We are fearless in tearing down old approaches to build something better, but we are disciplined in how we do it because we know our users rely on our technology to run mission-critical infrastructure platforms.
Create Wow: We measure success by the experience we generate, both inside and outside the company. For our customers, this means impressive speed and intuitive experiences. For our team, this means going the extra mile to support one another and to continuously drive each other to new heights.
Own the Outcome: We understand that our responsibility doesn't end when a task is checked off; it ends when the value is delivered. We connect our daily individual actions to the broader success of the company and our customers.

Compensation Range: $150K - $200K

Vacancy posted 3 days ago
