Member of Technical Staff (AI Infrastructure Engineer)
Perplexity
We are looking for an AI Infra engineer to join our growing team. We work with Kubernetes, Slurm, Python, C++, and PyTorch, primarily on AWS. As an AI Infrastructure Engineer, you will partner closely with our Inference and Research teams to build, deploy, and optimize our large-scale AI training and inference clusters.

Responsibilities
- Design, deploy, and maintain scalable Kubernetes clusters for AI model inference and training workloads
- Manage and optimize Slurm-based HPC environments for distributed training of large language models
- Develop robust APIs and orchestration systems for both training pipelines and inference services
- Implement resource scheduling and job management systems across heterogeneous compute environments
- Benchmark system performance, diagnose bottlenecks, and implement improvements across both training and inference infrastructure
- Build monitoring, alerting, and observability solutions tailored to ML workloads running on Kubernetes and Slurm
- Respond swiftly to system outages and collaborate across teams to maintain high uptime for critical training runs and inference services
- Optimize cluster utilization and implement autoscaling strategies for dynamic workload demands

Qualifications
- Strong expertise in Kubernetes administration, including custom resource definitions, operators, and cluster management
- Hands-on experience with Slurm workload management, including job scheduling, resource allocation, and cluster optimization
- Experience with deploying and managing distributed training systems at scale
- Deep understanding of container orchestration and distributed systems architecture
- High-level familiarity with LLM architecture and training processes (Multi-Head Attention, Multi/Grouped-Query Attention, distributed training strategies)
- Experience managing GPU clusters and optimizing compute resource utilization

Required Skills
- Expert-level Kubernetes administration and YAML configuration management
- Proficiency with Slurm job scheduling, resource management, and cluster configuration
- Python and C++ programming with a focus on systems and infrastructure automation
- Hands-on experience with ML frameworks such as PyTorch in distributed training contexts
- Strong understanding of networking, storage, and compute resource management for ML workloads
- Experience developing APIs and managing distributed systems for both batch and real-time workloads
- Solid debugging and monitoring skills with expertise in observability tools for containerized environments

Preferred Skills
- Experience with Kubernetes operators and custom controllers for ML workloads
- Advanced Slurm administration, including multi-cluster federation and advanced scheduling policies
- Familiarity with GPU cluster management and CUDA optimization
- Experience with other ML frameworks like TensorFlow or distributed training libraries
- Background in HPC environments, parallel computing, and high-performance networking
- Knowledge of infrastructure as code (Terraform, Ansible) and GitOps practices
- Experience with container registries, image optimization, and multi-stage builds for ML workloads

Required Experience
- Demonstrated experience managing large-scale Kubernetes deployments in production environments
- Proven track record with Slurm cluster administration and HPC workload management
- Previous roles in SRE, DevOps, or Platform Engineering with a focus on ML infrastructure
- Experience supporting both long-running training jobs and high-availability inference services
- Ideally, 3-5 years of relevant experience in ML systems deployment with a specific focus on cluster orchestration and resource management

