Tech Lead, AI Compute Infrastructure

HeyGen

Los Angeles, Palo Alto, San Francisco, Toronto, Singapore

About HeyGen

At HeyGen, our mission is to make visual storytelling accessible to all. Over the last decade, visual content has become the preferred method of information creation, consumption, and retention. But the ability to create such content, in particular videos, continues to be costly and challenging to scale. Our ambition is to build technology that equips more people with the power to reach, captivate, and inspire audiences.

We are seeking a seasoned Technical Leader to build and scale the foundational compute infrastructure that powers our state-of-the-art AI models—from multimodal training data pipelines to high-throughput, low-latency video generation.

Responsibilities

You will be the core engineer responsible for building the robust, efficient, and scalable platform that enables our research and production teams to rapidly iterate on HeyGen's generative video models. Your contributions will directly impact model performance, developer productivity, and the final quality of every AI-generated video.

  • Optimize GPU Utilization: Design and implement mechanisms to aggressively optimize GPU and cluster utilization across thousands of devices for inference, training, data processing, and large-scale deployment of our state-of-the-art video generation models.

  • Develop Large-Scale AI Job Framework: Build highly scalable, reliable frameworks for launching and managing massive, heterogeneous compute jobs, including multi-modal high-volume data ingestion/processing, distributed model training, and continuous evaluation/benchmarking.

  • Enhance Observability: Develop world-class observability, tracing, and visualization tools for our compute cluster to ensure reliability and diagnose performance bottlenecks (e.g., memory, bandwidth, communication).

  • Accelerate Pipelines: Collaborate closely with AI researchers and AI engineers to integrate innovative acceleration techniques (e.g., custom CUDA kernels, distributed training libraries) into production-ready, scalable training and inference pipelines.

  • Infrastructure Management: Champion the adoption and optimization of modern cloud and container technologies (Kubernetes, Ray) for elastic, cost-efficient scaling of our distributed systems.

Minimum Requirements
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

  • 5+ years of full-time industry experience in large-scale MLOps, AI infrastructure, or HPC systems.

  • Experience with data frameworks and standards such as Ray, Apache Spark, and LanceDB.

  • Strong proficiency in Python and a high-performance language such as C++ for developing core infrastructure components.

  • Deep understanding and hands-on experience with modern orchestration and distributed computing frameworks such as Kubernetes and Ray.

  • Experience with core ML frameworks such as PyTorch, TensorFlow, or JAX.

Preferred Qualifications
  • Master's or PhD in Computer Science or a related technical field.

  • Demonstrated Tech Lead experience, driving projects from conceptual design through to production deployment across cross-functional teams.

  • Prior experience building infrastructure specifically for Generative AI models (e.g., diffusion models, GANs, or large language models) where cost and latency are critical.

  • Proven background in building and operating large-scale data infrastructure (e.g., Ray, Apache Spark) to manage petabytes of multi-modal data (video, audio, text).

  • Expertise in GPU acceleration and deep familiarity with low-level compute programming, including CUDA, NCCL, or similar technologies for efficient inter-GPU communication.

What HeyGen Offers
  • Competitive salary and benefits package.
  • Dynamic and inclusive work environment.
  • Opportunities for professional growth and advancement.
  • Collaborative culture that values innovation and creativity.
  • Access to the latest technologies and tools.

HeyGen is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Vacancy posted 1 day ago
