Staff Software Engineer, ML Performance & Systems

$180k - $250k

fal

San Francisco

fal is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at scale without compromise. For developers and enterprises, fal is the foundation that makes generative media not just possible, but practical: a unified platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products.

As generative media reshapes industries across a market projected to grow by hundreds of billions over the next decade, fal is becoming the ecosystem that ambitious teams build on.

Key Responsibilities:

  • Help fal maintain its frontier position on model performance for generative media models.
  • Design and implement novel approaches to model serving architecture on top of our in-house inference engine, focusing on maximizing throughput while minimizing latency and resource usage.
  • Develop performance monitoring and profiling tools to identify bottlenecks and optimization opportunities.
  • Work closely with our Applied ML team and customers (frontier labs in the media space) to ensure their workloads benefit from our accelerator.

Requirements:

  • Strong foundation in systems programming with expertise in identifying and fixing bottlenecks.
  • Deep understanding of the cutting-edge ML infrastructure stack (anything from PyTorch, TensorRT, and TransformerEngine to Nsight), including model compilation, quantization, and serving architectures. Ideally, you closely follow developments in all of these systems as they happen.
  • A fundamental understanding of the underlying hardware (NVIDIA-based systems at the moment) and, when necessary, the ability to go deeper into the stack to fix bottlenecks (e.g., custom GEMM kernels with CUTLASS for common shapes).
  • Proficiency in Triton, or a willingness to learn it backed by comparable experience in lower-level accelerator programming.
  • New frontier: multi-dimensional model parallelism (combining multiple parallelism techniques, such as tensor parallelism (TP) with context parallelism / sequence parallelism).
  • Familiarity with the internals of Ring Attention, FlashAttention-3 (FA3), and FusedMLP implementations.

What We Offer at fal:

  • Interesting and challenging work
  • Competitive salary and equity
  • Many learning and growth opportunities
  • Relocation assistance to San Francisco
  • Health, dental, and vision insurance (US)
  • Regular team events and offsites

Compensation:

  • $180,000 - $250,000 + equity + comprehensive benefits package

Location:

  • We are currently hiring in downtown San Francisco.

Vacancy posted 19 days ago