
Heterogeneous AI Infra & Cluster Engineer

Gimlet Labs

Gimlet Labs in San Francisco is seeking an Infrastructure Platform Engineer to design and operate complex clusters for AI inference. This hands-on role involves working with diverse hardware architectures and orchestration systems to create scalable infrastructure solutions. The ideal candidate has strong Linux and Kubernetes experience, coupled with automation and debugging skills. Help shape AI infrastructure systems that will positively impact future workloads.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Heterogeneous AI Infra & Cluster Engineer in San Francisco, CA vacancy
  • Linuxcareers is seeking an Infrastructure/Cluster Engineer to design and operate large-scale clusters that enable AI inference at scale. The role focuses on managing diverse hardware architectures and building robust infrastructure. The ideal candidate will possess deep... 
    Suggested

    Linuxcareers

    San Francisco, CA
    5 days ago
  •  ...interact with the web by building AI agents that can reliably do...  ...Responsibilities: Scale infra for post-training of multimodal...  ...Work closely with product engineers to translate cutting‑edge AI capabilities...  ...with ML infrastructure (GPU clusters) and supporting networking (... 
    Suggested
    Work at office
    Relocation
    Visa sponsorship

    Yutori

    San Francisco, CA
    2 days ago
  • Neura Market is seeking an HPC Engineer to build and configure large-scale HPC clusters for AI workloads. This role requires working 4 days a week onsite in San Francisco/Bellevue, where you will collaborate closely with teams to troubleshoot and improve systems. The ideal... 
    Suggested

    Neura Market

    San Francisco, CA
    2 days ago
  • $293k

     ...are seeking a Tokens-as-a-Service (TaaS) Engineer to help build the systems that convert...  ...analysis, and model porting across heterogeneous infrastructure environments. Build tooling...  ...Preferred Skills Experience with GPU clusters, AI infrastructure, performance benchmarking... 
    Suggested

    Slope

    San Francisco, CA
    4 days ago
  • Senior Infrastructure Engineer - Bland As a Senior Infrastructure Engineer at Bland, responsibilities...  ...in regulated industries. Lead - AI/ML Stack Infrastructure Lead the team...  ...designing and operating production Kubernetes clusters optimized for AI/ML workloads with GPU... 
    Suggested
    Temporary work

    AI Chopping Block, Inc.

    San Francisco, CA
    4 days ago
  •  ...accelerate the progress of AI applications out into...  ...their laptop to the cluster without needing to be a...  ...Senior Site Reliability Engineer to join the Infrastructure...  ...laptop. As part of the Infra team, we build the...  ...management systems for heterogeneous compute clusters Develop... 

    Anyscale

    San Francisco, CA
    1 day ago
  • Nohi, based in San Francisco, is seeking a Senior Full-Stack Engineer to lead the architecture and implementation of core systems such as products, orders, and payments. You will drive the end-to-end development, ensuring reliability across both frontend and backend. The... 

    Nohi

    San Francisco, CA
    1 day ago
  • A leading AI technology firm in San Francisco is seeking an AI Infra Engineer to enhance their infrastructure. The successful candidate will design and maintain Kubernetes clusters and manage Slurm for distributed training. Important skills include extensive experience... 

    Perplexity

    San Francisco, CA
    5 days ago
  • $190k - $270k

    AI Chopping Block, Inc. is seeking an AI Infrastructure Engineer in San Francisco. This role requires maintaining user-facing services and production systems, specializing in systems while ensuring their reliability and scalability. Candidates should have 5+ years of experience... 

    AI Chopping Block, Inc.

    San Francisco, CA
    2 days ago
  • $320k - $405k

     ...interpretable, and steerable AI systems. We want AI to be safe...  ...group of committed researchers, engineers, policy experts, and business...  ...Infrastructure Engineer, Node Infra About the role Anthropic's Infrastructure...  ..., stand up and scale clusters from thousands to hundreds of... 
    Visa sponsorship

    Menlo Ventures

    San Francisco, CA
    5 days ago
  • $250k

    LeoForce is seeking a Senior Software Engineer in San Francisco to design agentic systems and improve AI-native workflows. This hybrid role allows flexibility with 2-3 days in the office each week. Ideal candidates will have 2-8 years of software engineering experience... 
    Work at office

    LeoForce

    San Francisco, CA
    1 day ago
  • $202.5k - $247.5k

     ...sharing localhost or running AI workloads in production. We’re...  ...below worth your time. About the Infra Platform Team The Infra...  ...team builds the systems ngrok engineers rely on to build, deploy, and...  ...environments that run a full Kubernetes cluster of the ngrok stack, closely... 
    Permanent employment
    Full time
    Work at office
    Local area
    Remote work
    Home office
    Flexible hours

    jobr.pro

    San Francisco, CA
    2 days ago
  • $335k

    OpenAI in San Francisco seeks a System Engineer to architect and operationalize essential infrastructure for AI systems. The role demands 7+ years in systems engineering...  ...experience debugging and a solid grasp of clustering and scaling in production environments. Offers... 
    Relocation package

    OpenAI

    San Francisco, CA
    4 days ago
  •  ...seeking an experienced Infrastructure Security Engineer to design and implement security in a...  ...the long-term security roadmap for advanced AI systems. Your role includes auditing, hardening, and securing Kubernetes clusters, implementing centralized security controls,... 

    Xcede

    San Francisco, CA
    3 days ago
  •  ...A technology company specializing in AI infrastructure is seeking a Member of Technical...  ...building compiler pipelines for heterogeneous hardware while ensuring performance and...  ...ideal candidate should have solid software engineering skills and experience in compiler systems... 

    Gimlet Labs

    San Francisco, CA
    4 days ago
  • A leading AI research organization in San Francisco is looking for a Senior Mechanical Infrastructure Engineer to develop reliable and efficient thermal systems for high-density AI data centers. The ideal candidate should have over 10 years of experience in mechanical... 

    OpenAI

    San Francisco, CA
    1 day ago
  •  ...for the world's most dynamic AI companies, like Cursor, Notion...  ...and help build the platform engineers turn to to ship AI products....  ...operating system for distributed, heterogeneous AI hardware. We believe that...  ...performance on bleeding‑edge clusters (H100/H200, B200/B300, GB200/... 
    Flexible hours

    Baseten

    San Francisco, CA
    4 days ago
  • OutSystems, Inc. is looking for a Site Reliability Engineer to join their team in San Francisco, CA. The ideal candidate will lead the onboarding of services and teams to reliability tenets while establishing SLOs and SLAs. Proficiency in Python and experience with Kubernetes... 
    Flexible hours

    OutSystems, Inc.

    San Francisco, CA
    2 days ago
  • Happyrobot Inc. is looking for an Infrastructure Engineer in San Francisco, California. This role involves leading the stability and observability...  ...as familiarity with monitoring tools. Join us at a high-growth AI startup backed by top investors, where you will have ownership... 

    Happyrobot Inc.

    San Francisco, CA
    2 days ago
  • $190k - $270k

    AI Chopping Block, Inc. is looking for an AI Infrastructure Engineer based in San Francisco, California. The role involves ensuring production systems run smoothly, building infrastructure with tools like Ansible and Kubernetes, and implementing operational processes.... 

    AI Chopping Block, Inc.

    San Francisco, CA
    2 days ago
  • AI Talent Now in San Francisco is looking for an engineer to take ownership of projects that innovate AI evaluation techniques. You will design and build scalable systems, collaborate with various teams, and ensure code quality through reviews and documentation. This role... 

    AI Talent Now

    San Francisco, CA
    2 days ago
  • Sr. Site Reliability Engineer Job type: Full Time · Department: Platform...  ...) Optura is healthcare’s AI orchestration platform. We...  ...operating it at scale across heterogeneous environments Multi‑cloud fluency...  ...tooling (Replicated, Cluster API, Talos, Rancher, OpenShift... 
    Full time
    Remote work

    Neara

    San Francisco, CA
    3 days ago
  • $190k - $270k

    AI Chopping Block, Inc. is looking for an AI Infrastructure Engineer to ensure smooth operation of user-facing services and production systems. You will handle infrastructure with tools like Ansible, Terraform, and Kubernetes while participating in on-call rotations for... 

    AI Chopping Block, Inc.

    San Francisco, CA
    2 days ago
  • $225k

    Australia-Employment is seeking a Senior Software Engineer in San Francisco to design and implement AI-native workflows and infrastructure. This role offers a competitive salary between $225,000 and $450,000 per year. You will work on diverse challenges in a hybrid environment... 

    Australia-Employment

    San Francisco, CA
    3 days ago
  • Senior Site Reliability Engineer - AI Infrastructure Location: Global Remote / San Francisco • Full-Time About Andromeda Andromeda Cluster was founded by Nat Friedman and Daniel Gross to...  ...). Own capacity planning across heterogeneous GPU fleets optimized for training throughput... 
    Full time
    Remote work

    Cortes 23

    San Francisco, CA
    2 days ago
  • $150k

    Tzafon is seeking a skilled engineer to enhance their machine intelligence systems in San Francisco. As part of the team, you'll be responsible for building evaluation infrastructure, designing data pipelines, and implementing fine-tuning processes. Ideal candidates have... 

    Tzafon

    San Francisco, CA
    4 days ago
  •  ...building the next hyperscaler for AI agents. About the role You...  ...tightening the loop between infra change and production behavior...  ...looking for an infrastructure engineer who actually wants to live in...  ...- You've operated Kubernetes clusters past the tutorial stage: real... 
    Live in
    Work from home

    E2B

    San Francisco, CA
    3 days ago
  •  ...Technical Staff to contribute to model training pipelines and produce state-of-the-art models. Candidates should possess strong software engineering skills, especially in Python and ML frameworks like JAX and Pytorch. Experience with distributed training infrastructures is... 
    Remote job

    Jaide Health

    San Francisco, CA
    5 days ago
  • A cutting-edge AI infrastructure startup in San Francisco is seeking a Senior Multidisciplinary Engineer to innovate how physical structures are built. The ideal candidate will leverage deep engineering expertise across mechanical, electrical, and systems engineering to... 

    Jack & Jill/External ATS

    San Francisco, CA
    2 days ago
  • CoffeeSpace, an AI infrastructure startup in San Francisco, is seeking a Senior Founding Engineer to shape backend systems and data evaluation for frontier AI. You will collaborate directly with founders and have impactful ownership in a fast-growing environment. The ideal... 

    CoffeeSpace

    San Francisco, CA
    2 days ago
