Staff Infra Engineer - Global GPU ML Inference

The Token Company

The Token Company in San Francisco is seeking a Member of Technical Staff for its infrastructure team. In this role, you will own the cloud systems that serve its compression API and build global low-latency, high-throughput GPU ML inference infrastructure. The ideal candidate will have solid experience in cloud infrastructure, including AWS and Docker, and a proven track record in building production environments. Additional benefits include equity, housing, food, and visa sponsorship.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Staff Infra Engineer - Global GPU ML Inference in San Francisco, CA vacancy
  • A tech startup focusing on AI optimization is seeking engineers in San Francisco to enhance their GPU kernel optimization framework. Candidates should possess...  .... Previous experience in GPU programming and AI/ML research is advantageous. Join a small team committed... 
    Suggested

    Wafer

    San Francisco, CA
    3 days ago
  • B Capital is seeking a skilled engineer for GPU infrastructure in San Francisco. This role involves designing and operating high-performance systems for model inference, synthetic data generation, and reinforcement learning. The ideal candidate has strong GPU systems experience... 
    Suggested

    B Capital

    San Francisco, CA
    3 days ago
  •  ...there. The Opportunity Our Edge Inference team compiles Liquid...  ...require deep understanding of both ML architectures and hardware constraints...  ...kernels for CPU, NPU, and GPU architectures across diverse...  ...Experience Embedded software engineering experience or work on resource... 
    Suggested

    Liquid AI

    San Francisco, CA
    4 days ago
  • Jaide Health is seeking an engineer for their Model Efficiency team in...  ...focuses on building reliable ML systems while enhancing core performance...  ...techniques such as GPU/CUDA optimizations and collaborate...  ...and insights into the LLM inference ecosystem. A commitment to diversity... 
    Suggested
    Remote job

    Jaide Health

    San Francisco, CA
    5 days ago
  •  ..., Fly over AWS when it makes sense, PyTorch over legacy ML frameworks. The Work GPU inference : We run our own ASR models. Real-time transcription : WebSocket...  ...C# Runtime : Bun, Node.js, Django, FastAPI ML : PyTorch Infra : Fly.io, Terraform, AWS (RDS), Redis Protocols : gRPC,... 
    Suggested

    Aqua Voice

    San Francisco, CA
    1 day ago
  • A leading AI research firm in San Francisco is seeking a Member of Technical Staff specialized in Model Efficiency. In this role, you will enhance LLM inference systems by tackling performance issues and collaborating with cross-functional teams. Ideal candidates have over... 
    Remote work

    Cohere

    San Francisco, CA
    2 days ago
  • $192k - $260k

    A leading data and AI company is seeking a Staff Engineer to design and implement core systems for Foundation Model Serving. The ideal candidate...  ...closely across teams to ensure operational excellence in GPU serving workloads. Competitive salary range of $192,000 to $26... 

    Databricks Inc.

    San Francisco, CA
    2 days ago
  • $200k - $250k

     ...Overview Build and operate the ML platform that powers...  ...scalable training, inference, and cost‑efficient operations...  ...ECS, SageMaker, GPU fleets, model serving, autoscaling...  ..., or similar). Prior staff‑level role in a company with a significant AI infra footprint. Experience... 
    Remote work

    AppFolio

    San Francisco, CA
    2 days ago
  • Acceler8 Talent is seeking a Member of Technical Staff focused on ML Systems & Inference in San Francisco, California. This role includes building and...  ...AI workloads. The ideal candidate has strong software engineering roots and experience in inference systems. You will... 

    Acceler8 Talent

    San Francisco, CA
    4 days ago
  •  ...technology company in San Francisco is seeking a Data Platform Engineer to drive architecture and implementation of core systems. The ideal...  ...and demonstrates strong analytical skills in fields such as ML and statistics. Responsibilities include planning technical roadmaps... 

    Hive

    San Francisco, CA
    3 days ago
  • $160k - $300k

     ...product development. We empower global innovators in automotive,...  ...mission is to revolutionize how engineering decisions are made, turning...  ...About the Role As a Senior / Staff Infrastructure Engineer at...  ...distributed systems) Exposure to ML infra Personality & Values:... 
    Work at office
    Visa sponsorship
    Flexible hours

    Apiphany

    San Francisco, CA
    22 hours ago
  • A tech-first company is seeking a Member of Technical Staff to focus on cutting-edge AI research and development. The role involves building and scaling training and inference infrastructure, designing ML kernels, and optimizing performance. Ideal candidates should have... 

    Mirendil

    San Francisco, CA
    2 days ago
  • Crusoe in San Francisco is looking for a Senior Staff Network Operations Engineer to oversee the reliability of its global network. This role entails leading incident responses, defining operational standards, and guiding a team of engineers in maintaining a high-performing... 

    ProducePay

    San Francisco, CA
    3 days ago
  • Crusoe is seeking a Staff Software Infrastructure Engineer in San Francisco to manage cloud infrastructure, develop...  ...critical role requires expertise in GPU troubleshooting, strong Linux skills...  ...make a significant impact on the global energy landscape.

    Crusoe

    San Francisco, CA
    3 days ago
  • A cutting-edge AI research firm in San Francisco is seeking talent to build and optimize GPU infrastructure for large-scale model inference and training workloads. The ideal candidate will have hands-on experience with GPU systems and optimization techniques, actively... 

    Reflection

    San Francisco, CA
    5 days ago
  • Claryo is seeking a Staff Software Engineer with a focus on Computer Vision Deployment based in San Francisco. The successful candidate will develop...  ...include creating and managing distributed cloud GPU infrastructures and building comprehensive computer vision pipelines... 
    Work at office
    3 days per week

    Claryo

    San Francisco, CA
    1 day ago
  • Requirements Deep experience with GPU programming and performance work (CUDA, Triton, CUTLASS, or similar)...  ...laid out for you 3+ years of professional software engineering experience with meaningful work on ML inference or high-performance systems Familiarity with at least... 

    Perplexity AI

    San Francisco, CA
    5 days ago
  • $300 per month

     ...We’re crafting the engine that powers a world...  ...the Role As a Senior Staff Cloud Support...  ...networking, and AI/ML infrastructure, and...  ...AI infrastructure globally. What You’ll Be Working...  ...Troubleshoot NCCL, IB, GPU driver/firmware...  ...workloads (training + inference) with performance... 
    Full time
    Temporary work

    Epoch Biodesign

    San Francisco, CA
    4 days ago
  • We are looking for an AI Infra engineer to join our growing team. We work...  ...partnering closely with our Inference and Research teams to build,...  ...observability solutions tailored to ML workloads running on...  ...strategies) Experience managing GPU clusters and optimizing compute... 

    Perplexity

    San Francisco, CA
    3 days ago
  •  ...candidates with expertise in AI simulation development. The role emphasizes optimizing training efficiency, enhancing GPU performance, and ensuring low-latency inference. Applicants should be proficient in methodologies for gradient checkpointing, Nsight profiling, and job... 

    Embedding VC

    San Francisco, CA
    5 days ago
  • Sail Research in San Francisco is seeking a talented engineer to design and implement robust systems that ensure fast and cost-efficient AI inference at global scale. You will be responsible for building high-performance schedulers and optimizing global routing while focusing... 

    Sail Research

    San Francisco, CA
    2 days ago
  • $197.3k - $313.7k

    Staff ML Engineer, Fine Tuning - Slack...  ...finetuning training pipelines on GPU infrastructure. Brainstorm with Product...  ...Familiarity with model optimization for inference (quantization, pruning, speculative decoding... 
    Work at office

    Salesforce, Inc.

    San Francisco, CA
    2 days ago
  • $190.9k - $232.8k

    A leading data and AI company is seeking a Staff Software Engineer for GenAI inference to lead the architecture and optimization of the inference engine. The role requires expertise in CUDA, GPU programming, and distributed systems design. Ideal candidates will have a strong... 

    Menlo Ventures

    San Francisco, CA
    4 days ago
  • $181.1k - $318.4k

    Apple Inc. is looking for a Staff ML Infrastructure Engineer in San Francisco to lead pre-training initiatives for cutting-edge foundation models in machine learning. The successful candidate will have over 6 years of experience in building scalable backend systems, be... 

    Apple

    San Francisco, CA
    4 days ago
  • $227.2k - $417k

    Software Engineer, ML Infra & Distributed Systems (Staff & Principal) About the Role: As a Software Engineer on the ML Infrastructure team, you will collaborate...  ...Product teams to build world‑class machine learning inference platforms. These platforms power essential services... 
    Full time
    Temporary work
    Local area
    Flexible hours

    Tubi Tv

    San Francisco, CA
    1 day ago
  •  ...a web application that distills complex ML signals, building automation tools that run...  ...for: We’re looking for an experienced engineer to help shape our architecture, strengthen...  ...platform securing digital trust for leading global businesses. Our deep investments in... 

    Sift Science

    San Francisco, CA
    2 days ago
  • A leading streaming service is seeking a Staff Software Engineer to enhance ML infrastructure. The role involves designing scalable systems, mentoring engineers, and collaborating with cross-functional teams. Candidates should have over 8 years of experience in building... 

    Tubi Tv

    San Francisco, CA
    5 days ago
  • $200k - $400k

    Inferact is looking for a Developer Relations Engineer in San Francisco, California, to help developers utilize vLLM for AI inference. This unique role involves teaching technical concepts, creating educational content, and engaging with the AI infrastructure community.... 
    Remote work

    Inferact

    San Francisco, CA
    1 day ago
  • $253k - $308k

    Staff Engineer, Engineering Productivity & AI Quality Harper is an AI-native commercial insurance...  ...productivity, platform, CI/CD, build, test‑infra, or internal tooling that other engineers...  ...not a process or PM role. Production AI/ML systems experience (agent harness, eval... 
    Part time
    Work at office
    Relocation

    Harper Group

    San Francisco, CA
    4 days ago
  •  ...company in San Francisco is seeking a Member of Technical Staff focused on kernels and GPU performance. This role involves optimizing GPU and...  ...various hardware. Ideal candidates have strong software engineering foundations and experience with performance-critical systems... 

    Gimlet Labs

    San Francisco, CA
    5 days ago
