Senior Software Engineer, LLM Performance

Parasail

SF Bay Area (Hybrid)

Parasail is redefining AI infrastructure by enabling seamless deployment across a distributed network of GPUs, optimizing for cost, performance, and flexibility. Our mission is to empower AI developers with a fast, cost-efficient, and scalable cloud experience—free from vendor lock-in and designed for the next generation of AI workloads.

The Senior Software Engineer, LLM Performance plays a crucial role in delivering a competitive platform by focusing on efficiently scheduling, executing, and managing AI workloads on distributed compute systems. This role is deeply technical, spanning from low-level GPU kernels to distributed AI orchestration and Kubernetes (K8s) deployments. It is about more than optimization; it's about pioneering efficient infrastructure that supports AI's transformative role in reshaping productivity, revolutionizing industries, and addressing some of the world's most challenging problems. You'll ensure that generative AI—including large language models (LLMs), multi-modal models, and diffusion models—operates efficiently at enterprise scale while driving continuous improvements in cost, performance, and sustainability.

Responsibilities
  • Add support for new LLMs, working across the stack from low-level GPU kernels to Kubernetes-based deployments.
  • Contribute to cutting-edge open-source LLM engines such as vLLM or SGLang to extend their capabilities and performance (e.g., use Python to improve API servers or request schedulers).
  • Operate closer to the hardware, focusing on building and integrating solutions to boost performance and hardware utilization. For example, improve attention backends like FlashAttention or FlashInfer by contributing to their development and optimization, or by integrating their solutions into vLLM.
  • Improve LLM performance using advanced algorithmic solutions such as speculative decoding, quantization, or other state-of-the-art techniques. Understand the impact of such techniques on model quality.

Qualifications
  • Expertise in GPU computing, including platforms and frameworks such as CUDA, ROCm, XLA, PyTorch, and JAX.
  • Background in performance analysis and optimization of AI/HPC workloads (e.g., profiling or theoretical analysis of FLOPs and bandwidth).
  • Experience in writing GPU kernels using technologies like CUDA, CUTLASS, Triton.
  • Strength in Python and C++.
  • Demonstrated contributions to open-source projects. Contributions to inference engines such as vLLM are a strong plus.
  • A production-oriented mindset emphasizing robust, scalable code suitable for enterprise-grade applications.
  • A relentless curiosity about cutting-edge AI technologies combined with a passion for solving complex problems.

What You Bring to the Table: We are looking for people who are eager to learn and master the lower-level compute concepts that are critical for the AI revolution. With us, your skills will not only contribute to coding but will also have a significant impact on the scalability and efficiency of AI applications at large. If you're geared up for the challenge of optimizing AI performance and eager to push our technological prowess to new heights, we're excited to welcome you aboard.

Vacancy posted 1 day ago