Member of Technical Staff (AI Infrastructure Engineer)
Perplexity
We are looking for an AI Infra engineer to join our growing team. We work with Kubernetes, Slurm, Python, C++, and PyTorch, primarily on AWS. As an AI Infrastructure Engineer, you will partner closely with our Inference and Research teams to build, deploy, and optimize our large-scale AI training and inference clusters.

Responsibilities
- Design, deploy, and maintain scalable Kubernetes clusters for AI model inference and training workloads
- Manage and optimize Slurm-based HPC environments for distributed training of large language models
- Develop robust APIs and orchestration systems for both training pipelines and inference services
- Implement resource scheduling and job management systems across heterogeneous compute environments
- Benchmark system performance, diagnose bottlenecks, and implement improvements across both training and inference infrastructure
- Build monitoring, alerting, and observability solutions tailored to ML workloads running on Kubernetes and Slurm
- Respond swiftly to system outages and collaborate across teams to maintain high uptime for critical training runs and inference services
- Optimize cluster utilization and implement autoscaling strategies for dynamic workload demands

Qualifications
- Strong expertise in Kubernetes administration, including custom resource definitions, operators, and cluster management
- Hands-on experience with Slurm workload management, including job scheduling, resource allocation, and cluster optimization
- Experience with deploying and managing distributed training systems at scale
- Deep understanding of container orchestration and distributed systems architecture
- High-level familiarity with LLM architecture and training processes (Multi-Head Attention, Multi/Grouped-Query, distributed training strategies)
- Experience managing GPU clusters and optimizing compute resource utilization

Required Skills
- Expert-level Kubernetes administration and YAML configuration management
- Proficiency with Slurm job scheduling, resource management, and cluster configuration
- Python and C++ programming with a focus on systems and infrastructure automation
- Hands-on experience with ML frameworks such as PyTorch in distributed training contexts
- Strong understanding of networking, storage, and compute resource management for ML workloads
- Experience developing APIs and managing distributed systems for both batch and real-time workloads
- Solid debugging and monitoring skills with expertise in observability tools for containerized environments

Preferred Skills
- Experience with Kubernetes operators and custom controllers for ML workloads
- Advanced Slurm administration, including multi-cluster federation and advanced scheduling policies
- Familiarity with GPU cluster management and CUDA optimization
- Experience with other ML frameworks like TensorFlow or distributed training libraries
- Background in HPC environments, parallel computing, and high-performance networking
- Knowledge of infrastructure as code (Terraform, Ansible) and GitOps practices
- Experience with container registries, image optimization, and multi-stage builds for ML workloads

Required Experience
- Demonstrated experience managing large-scale Kubernetes deployments in production environments
- Proven track record with Slurm cluster administration and HPC workload management
- Previous roles in SRE, DevOps, or Platform Engineering with a focus on ML infrastructure
- Experience supporting both long-running training jobs and high-availability inference services
- Ideally, 3-5 years of relevant experience in ML systems deployment with a specific focus on cluster orchestration and resource management
$100k - $300k


