Platform Support Engineer (US)

$115k - $140k

Lightning AI

San Francisco, California, United States; Seattle, Washington, United States

Who We Are

Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing, training, and deploying AI systems—designed to take ideas from research to production with less friction.

Through our merger with Voltage Park, a neocloud and AI Factory, Lightning AI combines developer-first software with cost-efficient, large-scale compute. Teams get the tools they need for experimentation, training, and production inference, with security, observability, and control built in.

We serve solo researchers, startups, and large enterprises. Lightning AI operates globally with offices in New York City, San Francisco, Seattle, and London, and is backed by Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.

What We're Looking For

Lightning AI is looking to hire a Platform Support Engineer to join our US Customer Experience team, supporting ML engineers running large-scale training and inference workloads across cloud infrastructure, Kubernetes, and GPU platforms in production environments.

This role sits at the intersection of ML systems, cloud infrastructure, Kubernetes, and customers. You'll support engineers training models, deploying inference systems, and scaling GPU workloads in production. You are not a ticket router or traditional support engineer. You are a technical partner to ML teams, helping diagnose failures, improve reliability, and guide customers through complex distributed systems problems.

The problems range from Kubernetes scheduling and GPU orchestration to distributed PyTorch failures, inference latency, networking bottlenecks, storage performance, and platform reliability. You'll gain exposure to a wide variety of real-world AI workloads across industries and help shape the infrastructure powering the next generation of ML applications.

What You'll Do

Work Directly With ML Engineers

  • Partner directly with customer engineering teams running training and inference workloads in production
  • Help customers diagnose and resolve complex distributed systems and ML infrastructure issues
  • Act as a technical advisor during high-impact incidents and platform degradation events
  • Translate infrastructure-level issues into actionable guidance for ML engineers
  • Build credibility with customers through strong technical reasoning and clear communication

Debug ML Infrastructure & Distributed Workloads

  • Investigate failures involving distributed training, Kubernetes orchestration, GPU allocation, networking, and storage systems
  • Troubleshoot PyTorch, CUDA, NCCL, and inference-serving issues
  • Analyze logs, metrics, traces, and system behavior to isolate root causes
  • Debug containerized workloads running across Kubernetes and bare-metal GPU environments
  • Support customers scaling workloads across multi-node GPU systems
  • Diagnose performance bottlenecks involving compute, memory, networking, or storage

Improve Reliability & Platform Operations

  • Identify recurring patterns across customer issues and drive long-term reliability improvements
  • Contribute to post-incident reviews and operational improvements
  • Build internal tooling, automation, documentation, and runbooks
  • Partner closely with infrastructure, networking, and platform engineering teams
  • Help improve observability, operational visibility, and troubleshooting workflows
  • Improve the customer experience through better processes and technical guidance

What This Role Is Not

To set clear expectations:

  • This is not a traditional help desk or ticket-routing support role
  • This is not purely customer success or account management
  • This is not a backend engineering role
  • This is not a passive escalation position

This role is for engineers who enjoy solving difficult technical problems while working closely with other engineers.

What You'll Need

Required Qualifications

Infrastructure & Systems

  • Strong software engineering and systems troubleshooting background
  • Experience with Kubernetes and containerized environments
  • Linux systems knowledge, including networking, storage, process management, and performance tuning
  • Experience with cloud infrastructure and distributed systems
  • Experience with observability and debugging tools such as Prometheus, Grafana, or OpenTelemetry

ML Infrastructure Experience

  • Hands-on experience operating machine learning workloads in production or research environments
  • Experience with distributed ML systems and tooling such as PyTorch, CUDA, or NCCL
  • Familiarity with GPU infrastructure and orchestration
  • Experience troubleshooting performance, reliability, or scaling issues in ML infrastructure
  • Understanding of the operational challenges involved in running ML systems at scale

Collaboration

  • Strong communication skills and ability to work directly with highly technical customers and engineering teams
  • Comfortable operating in fast-moving, highly ambiguous environments
  • Enjoys solving complex technical problems collaboratively

Ideal Experience

  • Experience with large-scale model training or distributed inference systems
  • Familiarity with Ray, Kubeflow, Slurm, or similar distributed scheduling platforms
  • Experience with InfiniBand, RDMA, or high-performance networking
  • Experience operating bare-metal infrastructure
  • Familiarity with storage systems commonly used in ML environments
  • Experience working at an AI infrastructure, cloud, MLOps, or developer tooling company
  • Contributions to platform engineering, developer infrastructure, or operational tooling projects
  • Experience writing automation, tooling, or scripts in Python or similar languages

This role is hybrid out of our Seattle or San Francisco offices, with an in-office requirement of at least 2 days per week and occasional team and company offsites. The role follows a Monday–Friday schedule, with working hours from 8:00 AM to 5:00 PM PST. We are not able to provide visa sponsorship for this role at this time.

We are committed to offering competitive compensation that reflects the value each team member brings to our mission. Final offers are based on factors such as experience, skills, geographic location, and role expectations. In addition to base salary, our total rewards package for eligible roles includes a discretionary bonus, a meaningful equity component, and comprehensive benefits.

The anticipated annual base salary range for this role is:

$115,000 - $140,000 USD

Benefits and Perks

We offer a comprehensive and competitive benefits package designed to support our employees' health, well-being, and long-term success. Benefits may vary by location, team, and role.

Benefits include:

  • Comprehensive medical, dental and vision coverage (U.S.); Private medical and dental insurance (U.K.)
  • Retirement and financial wellness support (U.S.); Pension contribution (U.K.)
  • Generous paid time off, plus holidays
  • Paid parental leave
  • Professional development support
  • Wellness and work-from-home stipends
  • Flexible work environment

At Lightning AI, we are committed to fostering an inclusive and diverse workplace. We believe that diverse teams drive innovation and create better products. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected characteristic. We are dedicated to building a culture where everyone can thrive and contribute to their fullest potential.

Vacancy posted 20 hours ago