Staff Software Engineer, ML Performance & Systems

$180k - $250k

fal

fal is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at scale without compromise. For developers and enterprises, fal is the foundation that makes generative media not just possible, but practical: a unified platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products.

As generative media reshapes industries across a market projected to grow by hundreds of billions over the next decade, fal is becoming the ecosystem that ambitious teams build on.
About this role:

Key Responsibilities:
  • Help fal maintain its frontier position on model performance for generative media models.
  • Design and implement novel approaches to model serving architecture on top of our in-house inference engine, focusing on maximizing throughput while minimizing latency and resource usage.
  • Develop performance monitoring and profiling tools to identify bottlenecks and optimization opportunities.
  • Work closely with our Applied ML team and customers (frontier labs in the media space) to make sure their workloads benefit from our accelerator.
Requirements:
  • Strong foundation in systems programming with expertise in identifying and fixing bottlenecks.
  • Deep understanding of the cutting-edge ML infrastructure stack (anything from PyTorch, TensorRT, and TransformerEngine to Nsight), including model compilation, quantization, and serving architectures. Ideally, you closely follow developments in all these systems as they happen.
  • A fundamental understanding of the underlying hardware (Nvidia-based systems at the moment) and, when necessary, the ability to go deeper into the stack to fix bottlenecks (for example, custom GEMM kernels with CUTLASS for common shapes).
  • Proficiency in Triton, or willingness to learn it given comparable experience in lower-level accelerator programming.
  • New frontier: multi-dimensional model parallelism (combining multiple parallelism techniques like TP with context parallel / sequence parallel).
  • Familiarity with the internals of Ring Attention, FA3, and FusedMLP implementations.
What we offer at fal:
  • Interesting and challenging work
  • Competitive salary and equity
  • A lot of learning and growth opportunities
  • We offer relocation assistance to San Francisco.
  • Health, dental, and vision insurance (US)
  • Regular team events and offsites
Compensation:
  • $180,000 - $250,000 + equity + comprehensive benefits package
Location:
  • We are currently hiring in downtown San Francisco.
Vacancy posted 17 hours ago