Senior, Staff Backend Engineer - Distributed System

SproutsAI

Actively hiring
Job type: Full-time
Workplace type: Hybrid
Experience: 5 years

About Us

At Zettabyte, we’re on a mission to make AI compute ubiquitous, seamless, and limitless. We’re building a cloud where AI just works, anywhere, anytime. “AI Power. Everywhere.” Be part of the team designing the infrastructure for the AI-first world.

Why this role exists

We need a Backend Engineer to build the systems that orchestrate GPU clusters for AI workloads. You'll create APIs that handle GPU allocation, memory management, compute scheduling, and multi-tenant isolation, challenges unique to AI infrastructure that go far beyond typical backend engineering.

As part of our backend team, you'll solve problems like:
  • How do we efficiently share expensive GPU resources across users?
  • How do we handle GPU memory constraints for large AI models?
  • How do we ensure quality of service when workloads compete for compute?

This is an opportunity to build infrastructure where every API call could allocate thousands of dollars' worth of compute per hour, and where your optimizations directly impact whether AI startups can afford to train their models.

What you'll do
  • Design APIs that abstract complex GPU operations into simple developer experiences
  • Build scheduling algorithms that maximize GPU utilization while ensuring SLA compliance
  • Develop resource management systems for the GPU lifecycle: provisioning, allocation, scheduling, and release
  • Create usage tracking and billing systems for GPU-hours, memory usage, and compute utilization
  • Implement monitoring for GPU-specific metrics, health checks, and automatic failure recovery
  • Build multi-tenancy systems with resource isolation, quota management, and fair scheduling
  • Optimize cold starts for model serving and implement efficient model loading strategies
  • Collaborate with frontend engineers to expose complex infrastructure through intuitive interfaces
  • Leverage AI-assisted coding tools (GitHub Copilot, Claude Code, Cursor IDE, etc.) to boost productivity and code quality

You'll thrive here if you
  • 5+ years of backend engineering experience with distributed systems
  • Strong proficiency in Go, Python, or similar backend languages
  • Experience with resource scheduling, orchestration, and API design (REST, GraphQL, gRPC)
  • Understanding of hardware constraints and system optimization
  • Linux systems knowledge and containerization experience (Docker, Kubernetes)
  • Comfortable working with expensive resources where efficiency directly impacts costs
  • Excited about solving novel problems in AI infrastructure (not just another CRUD app)
  • Startup mindset: comfortable with ambiguity and rapid iteration
  • GPU or HPC cluster management experience
  • Understanding of ML/AI workload patterns and requirements
  • Experience with high-value resource allocation systems
  • Background in performance optimization for compute-intensive workloads
  • Familiarity with GPU virtualization and sharing technologies
  • Experience building billing or metering systems

Details
  • We provide competitive salary and equity based on your experience and skillset
  • This is a hybrid role: 3 days in office, 2 days WFH
  • Must be located in Palo Alto
  • Applicants must be authorized to work in the United States without need for visa sponsorship

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Senior, Staff Backend Engineer - Distributed System in Palo Alto, CA vacancy
  • $190k - $240k

     ...financial technology company is seeking an experienced backend software engineer to enhance their lifecycle-orchestrator service. The successful...  ...experience, proficiency in API design, and knowledge of distributed systems. The position supports remote work, ensuring flexibility... 
    Senior
    Remote work

    Affirm

    Palo Alto, CA
    16 days ago
  •  ...Job type: Full Time · Department: Backend Engineer · Work type: On-Site About Archetype AI Archetype AI is developing the world's first...  ...passion for building performant, scalable, and resilient distributed systems. You’ll work closely with researchers, ML engineers, and... 
    Senior
    Full time

    Neara

    Palo Alto, CA
    3 hours ago
  •  ...Dormont Manufacturing Co is seeking a Senior Staff Software Engineer to lead the architectural design of a...  ...a scalable and fault-tolerant system, integrating AI to streamline functionality...  ...The ideal candidate has 8+ years in backend development, proficiency in Java, and... 
    Senior

    Dormont Manufacturing Company

    Mountain View, CA
    4 hours ago
  • $192.6k - $305.6k

     ...learning and real-world impact converge at scale. We're hiring a Staff Backend Engineer to build and operate the infrastructure those models depend on. You'll design and operate the distributed systems that power billions of daily decisions, with a focus on the performance... 
    Suggested
    Work at office
    Worldwide
    Relocation package

    Unity

    Mountain View, CA
    5 days ago
  • $224k - $284k

     ...Senior/Staff Backend Engineer Mountain View, CA About Us CloudKitchens helps restaurateurs around...  ...'ll design, implement, and optimize systems that power mission-critical...  ...of RESTful APIs, microservices, and distributed systems. ~ Strong debugging and problem... 
    Senior
    Full time
    Temporary work
    Work at office
    Flexible hours

    CloudKitchens

    Mountain View, CA
    9 days ago
  •  ...Center’s core team to develop and optimize backend services for high-traffic products (App...  ...). You’ll build scalable, reliable systems that serve millions of daily users....  ...multithreading, caching strategies, and distributed systems. ~ Hands-on with Redis, Elasticsearch... 
    Senior
    For contractors

    OPPO US Research Center

    Palo Alto, CA
    7 days ago
  • 247Hire is looking for a Backend Engineer to design and develop sophisticated multi-layered permission role integrations and automated...  ...ideal candidate will have over 6 years of experience with distributed systems and lead small to medium engineering teams. This role... 
    Senior

    247Hire

    Mountain View, CA
    1 day ago
  • $174k - $299k

     ...complex Coupang workflow systems. We need to create a set of...  ...are looking for a visionary Senior Staff Software Engineer to lead the architectural...  ...end‑to‑end design of a distributed workflow engine. Build the...  ...8 years of experience in backend software development Proficiency... 
    Senior
    Temporary work
    Flexible hours

    Dormont Manufacturing Co

    Mountain View, CA
    4 days ago
  •  ...to converse with all of their business systems through natural language to quickly find...  ...workflow automation with Moveworks' Reasoning Engine and natural language capabilities, we...  ...a variety of responsibilities including distributed training and inference pipeline for... 
    Senior
    Work at office
    Remote work
    Flexible hours

    ServiceNow

    Mountain View, CA
    2 days ago
  •  ...Member of Technical Staff - Backend Engineer - Data Systems and APIs About Vinci We’re building a copilot for hardware. Software engineers have powerful...  ...workflows Familiarity with batch processing, distributed systems, or workflow orchestration Experience managing... 

    Vinci4d

    Palo Alto, CA
    1 day ago
  •  ...company is seeking passionate developers to join their dynamic Engineering teams. In this role, you will design and implement...  ...building world-class APIs and contributing to large-scale distributed systems. If you are eager to make a difference and thrive in a collaborative... 
    Senior

    TechDigital Group

    Sunnyvale, CA
    4 hours ago
  •  ...Apple Inc. is seeking a Senior Software Engineer - Distributed Systems in Cupertino, California. This role focuses on building innovative and scalable software solutions for distributed systems. Candidates should have 5+ years of software engineering experience, familiarity... 
    Senior

    Apple

    Cupertino, CA
    1 day ago
  • $210k - $250k

     ...Cacheflow, located in Mountain View, California, is seeking a Staff Software Engineer, Backend (Infrastructure) to lead the development of our web services. The ideal candidate should have over 10 years of experience in architecting large-scale web services, with expertise... 
    Senior

    Cacheflow

    Mountain View, CA
    1 day ago
  • $180k - $220k

     ...black.ai is looking for a Senior Software Engineer, Calibration & Control in Palo Alto, CA. In this...  ...and scientists to develop the control systems for utility-scale quantum computers....  ...experience in Python or C++, with a focus on distributed storage and graph databases. The... 
    Senior

    Black Inc

    Palo Alto, CA
    3 hours ago
  •  ...Senior / Principal Software Engineer – Distributed Systems & Databases January 28, 2025 Xage is the first and only zero trust real-world security company. Powered by the Xage Fabric, the company’s Identity & Access Management, remote access, and dynamic data security... 
    Senior
    Contract work
    Remote work
    Worldwide

    Xage

    Palo Alto, CA
    1 day ago
  • $160.36k - $240.54k

     ...future. About the Role We’re looking for senior engineers to build/scale Nuro's large-scale...  ...infrastructure in the cloud/data center. This system is the foundation of many critical...  ...in building and developing large-scale distributed applications (e.g. Kubernetes). You’re... 
    Senior

    Icehouseventures

    Mountain View, CA
    4 hours ago
  • $142.6k - $261.5k

     ...scientists, designers, and software engineers enable our clients to solve...  ...practices. Knowledgeable in system development lifecycle and...  ...strong communication skills with staff at all levels. You are a self...  ...and interest in cloud and distributed systems architectures... 
    Summer holiday
    Flexible hours

    Ernst & Young Oman

    Palo Alto, CA
    2 days ago
  • $140k - $215k

     ...CrowdStrike Holdings, Inc. is looking for a backend software engineer to join their Ingestion group in Sunnyvale, California. This position...  ...in backend development, preferably with experience in distributed systems and cloud technologies. The role offers high autonomy and... 
    Senior

    CrowdStrike Holdings, Inc.

    Sunnyvale, CA
    4 hours ago
  •  ...CrowdStrike Holdings, Inc. is looking for a Cloud Engineer for the Sensor Control Plane (Cloud...  ...role is focused on building detection systems working in conjunction with on-endpoint...  ...-end system development, expertise in distributed systems, and proficiency in languages such... 

    CrowdStrike Holdings, Inc.

    Sunnyvale, CA
    3 hours ago
  • $197k - $266.5k

    Intuit is seeking an experienced software engineer in Mountain View, California, to lead technology initiatives and drive AI integrations. The successful candidate will have over 7 years of experience in delivering enterprise-class applications and a strong proficiency... 

    ATX Venture Partners

    Mountain View, CA
    5 days ago
  • $166k - $225k

     ...improve their business. Founded by engineers — and customer obsessed — we leap...  ...be building the next generation distributed data storage and processing systems that can outperform specialized...  ...amount of data on cloud storage backends, e.g., AWS S3, Azure Blob Store.... 
    Senior
    Local area
    Worldwide

    Databricks

    Mountain View, CA
    5 days ago
  • $168k - $270.25k

     ...History of using advanced programming skills to build distributed and compute systems, backend services, microservices and cloud technologies. Effective...  ...cloud systems. BS or MS in Computer Science, Computer Engineering or related field (or equivalent experience). 8+ years... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 hours ago
  • A leading AI infrastructure company in California seeks a Member of Technical Staff — Training to design and optimize large-scale distributed training systems for frontier AI models. Candidates should have 5+ years of experience in ML systems and be proficient in Python... 

    RadixArk

    Palo Alto, CA
    5 days ago
  •  ...Overview Staff/Senior Backend Engineer - Sunnyvale, CA. Duration: 6 to 12+ months. Rate: DOE. Responsibilities Provide operations support for backend end-to-end tools. Develop REST APIs and automation solutions. Collaborate with a large backend team (navigate through a... 
    Senior

    Redolent Infotech Pvt. Ltd.

    Sunnyvale, CA
    4 hours ago
  • $168k - $270.25k

     ...Senior Software Engineer, Distributed Systems - NIM Factory locations: US,...  ...including the underlying infrastructure, pipelines, backends, Docker build, test harness, metrics, performance engineering... 
    Senior
    Remote work

    NVIDIA

    Santa Clara, CA
    4 hours ago
  •  ...tailored solutions for every stage of threat prevention. Bruce Senior Solutions Consultant “The company places great importance on...  ..., execution, integrity, and inclusion.” Yasmine Systems Engineering Manager, SE Academy EMEA “Sales at Palo Alto Networks is more... 
    Senior
    Internship

    Palo Alto Networks

    Santa Clara, CA
    4 hours ago
  •  ...A leading robotics company in Palo Alto seeks a Staff/Principal ML Systems Engineer to enhance training performance for their innovative humanoid robots. You will optimize distributed training systems and engage closely with researchers to transform model changes into... 
    Senior

    Rhoda ai

    Palo Alto, CA
    1 day ago
  • At Databricks in Mountain View, we are seeking a Performance Engineer to enhance product performance and scalability. You will collaborate with teams to identify bottlenecks and optimize efficiency across our data infrastructure. The ideal candidate will have a strong... 
    Senior
    Flexible hours

    I did my part and supported the Regular Toilet

    Mountain View, CA
    4 days ago
  • $147.4k - $272.1k

     ...Senior Software Engineer - Distributed Systems Cupertino, California, United States Machine Learning and AI Our team is on a mission to build innovative...  ...a team of engineers who build innovative storage and backend service while tackling interesting challenges in a supportive... 
    Senior
    Relocation

    Apple

    Cupertino, CA
    3 hours ago
  • $188.5k - $282.7k

    Rubrik, Inc. is seeking a Senior Software Engineer for its Atlas Distributed Systems team. You'll design and deliver innovative solutions for cloud storage while guiding architectural trends within our distributed file systems. The ideal candidate has a degree in Computer... 
    Senior

    Rubrik, Inc.

    Palo Alto, CA
    4 days ago
