Platform Support Engineer
$115k - $140k
Neura Market
Who We Are

Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing, training, and deploying AI systems, designed to take ideas from research to production with less friction.

Through our merger with Voltage Park, a neocloud and AI Factory, Lightning AI combines developer-first software with cost-efficient, large-scale compute. Teams get the tools they need for experimentation, training, and production inference, with security, observability, and control built in.

We serve solo researchers, startups, and large enterprises. Lightning AI operates globally with offices in New York City, San Francisco, Seattle, and London, and is backed by Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.

What We’re Looking For

We’re looking for engineers who understand the realities of running machine learning workloads at scale. This role sits at the intersection of ML systems, cloud infrastructure, Kubernetes, and customers. You’ll support engineers training models, deploying inference systems, and scaling GPU workloads in production.

You are not a ticket router or traditional support engineer. You are a technical partner to ML teams, helping diagnose failures, improve reliability, and guide customers through complex distributed systems problems.

The problems range from Kubernetes scheduling and GPU orchestration to distributed PyTorch failures, inference latency, networking bottlenecks, storage performance, and platform reliability. You’ll gain exposure to a wide variety of real-world AI workloads across industries and help shape the infrastructure powering the next generation of ML applications.

What You'll Do

Work Directly With ML Engineers
- Partner directly with customer engineering teams running training and inference workloads in production
- Help customers diagnose and resolve complex distributed systems and ML infrastructure issues
- Act as a technical advisor during high-impact incidents and platform degradation events
- Translate infrastructure-level issues into actionable guidance for ML engineers
- Build credibility with customers through strong technical reasoning and clear communication

Debug ML Infrastructure & Distributed Workloads
- Investigate failures involving distributed training, Kubernetes orchestration, GPU allocation, networking, and storage systems
- Troubleshoot PyTorch, CUDA, NCCL, and inference-serving issues
- Analyze logs, metrics, traces, and system behavior to isolate root causes
- Debug containerized workloads running across Kubernetes and bare-metal GPU environments
- Support customers scaling workloads across multi-node GPU systems
- Diagnose performance bottlenecks involving compute, memory, networking, or storage

Improve Reliability & Platform Operations
- Identify recurring patterns across customer issues and drive long-term reliability improvements
- Contribute to post-incident reviews and operational improvements
- Build internal tooling, automation, documentation, and runbooks
- Partner closely with infrastructure, networking, and platform engineering teams
- Help improve observability, operational visibility, and troubleshooting workflows
- Improve the customer experience through better processes and technical guidance

What This Role Is Not

To set clear expectations:
- This is not a traditional help desk or ticket-routing support role
- This is not purely customer success or account management
- This is not a backend engineering role
- This is not a passive escalation position

This role is for engineers who enjoy solving difficult technical problems while working closely with other engineers.

What You’ll Need

Required Qualifications

Infrastructure & Systems
- Strong software engineering and systems troubleshooting background
- Experience with Kubernetes and containerized environments
- Linux systems knowledge, including networking, storage, process management, and performance tuning
- Experience with cloud infrastructure and distributed systems
- Experience with observability and debugging tools such as Prometheus, Grafana, or OpenTelemetry

ML Infrastructure Experience
- Hands-on experience operating machine learning workloads in production or research environments
- Experience with distributed ML systems and tooling such as PyTorch, CUDA, or NCCL
- Familiarity with GPU infrastructure and orchestration
- Experience troubleshooting performance, reliability, or scaling issues in ML infrastructure
- Understanding of the operational challenges involved in running ML systems at scale

Collaboration
- Strong communication skills and ability to work directly with highly technical customers and engineering teams
- Comfortable operating in fast-moving, highly ambiguous environments
- Enjoys solving complex technical problems collaboratively

Nice-to-Haves
- Experience with large-scale model training or distributed inference systems
- Familiarity with Ray, Kubeflow, Slurm, or similar distributed scheduling platforms
- Experience with InfiniBand, RDMA, or high-performance networking
- Experience operating bare-metal infrastructure
- Familiarity with storage systems commonly used in ML environments
- Experience working at an AI infrastructure, cloud, MLOps, or developer tooling company
- Contributions to platform engineering, developer infrastructure, or operational tooling projects
- Experience writing automation, tooling, or scripts in Python or similar languages

This role is hybrid out of our Seattle or San Francisco offices, with an in-office requirement of at least 2 days per week and occasional team and company offsites.
The role follows a Monday–Friday schedule, with working hours from 8:00 AM to 5:00 PM PST.

We are not able to provide visa sponsorship for this role at this time.

We are committed to offering competitive compensation that reflects the value each team member brings to our mission. Final offers are based on factors such as experience, skills, geographic location, and role expectations. In addition to base salary, our total rewards package for eligible roles includes a discretionary bonus, a meaningful equity component, and comprehensive benefits.

The anticipated annual base salary range for this role is: $115,000 - $140,000 USD

Benefits and Perks

We offer a comprehensive and competitive benefits package designed to support our employees’ health, well-being, and long-term success. Benefits may vary by location, team, and role. Benefits include:
- Comprehensive medical, dental, and vision coverage (U.S.); private medical and dental insurance (U.K.)
- Retirement and financial wellness support (U.S.); pension contribution (U.K.)
- Generous paid time off, plus holidays
- Paid parental leave
- Professional development support
- Wellness and work-from-home stipends
- Flexible work environment

At Lightning AI, we are committed to fostering an inclusive and diverse workplace. We believe that diverse teams drive innovation and create better products. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected characteristic. We are dedicated to building a culture where everyone can thrive and contribute to their fullest potential.

