Software Engineer, Inference

Trypulse

Overview

Pulse is tackling one of the most persistent challenges in data infrastructure: extracting accurate, structured information from complex documents at scale. We have a breakthrough approach to document understanding that combines intelligent schema mapping with fine-tuned extraction models where legacy OCR and other parsing tools consistently fail. We are a small, fast-growing team of engineers in San Francisco powering Fortune 100 enterprises, YC startups, public investment firms, and growth-stage companies. We are backed by tier 1 investors and growing quickly.

What makes our tech special is our multi-stage architecture:
  • Layout understanding with specialized component detection models
  • Low-latency OCR models for targeted extraction
  • Advanced reading-order algorithms for complex structures
  • Proprietary table structure recognition and parsing
  • Fine-tuned vision-language models for charts, tables, and figures

If you are passionate about the intersection of computer vision, NLP, and data infrastructure, your work at Pulse will directly impact customers and shape the future of document intelligence.

What we are looking for
  • 5 days in-office at our San Francisco office
  • Eager to learn and adapt quickly
  • Prior startup or founding experience is a plus

About the Role

Specialize in low-latency, high-throughput inference for OCR and multimodal models. Own profiling, batching, and autoscaling across single-tenant and multi-tenant environments.

Responsibilities
  • Build inference services with smart batching and caching
  • Optimize kernels, tokenization, and model graphs
  • Evaluate vLLM, TensorRT-LLM, and Triton tradeoffs
  • Implement autoscaling and admission control with clear SLOs
  • Own performance dashboards and capacity planning

Requirements
  • 3+ years in performance engineering or ML systems
  • Strong Python, plus C++ or CUDA exposure
  • Experience with GPU profiling and model serving

Nice to have
  • Experience reducing p95 and cost in production ML systems

Sponsorship

Sponsorship available.

Compensation and benefits

Competitive base salary plus equity, performance-based bonus, relocation assistance for Bay Area moves, daily meal stipend, medical, vision, and dental coverage.

Vacancy posted 5 days ago
Similar jobs that could be interesting for you
Based on the Software Engineer, Inference in San Francisco, CA vacancy
  • The Consensus is looking for a Software Engineer to join our Inference Stack team in San Francisco. You will help develop the infrastructure that powers large-scale LLM inference, ensuring scalability and reliability in our systems. This role is ideal for engineers who... 
    Suggested

    The Consensus

    San Francisco, CA
    2 days ago
  •  ...BASETEN Baseten powers mission-critical inference for the world's most dynamic AI companies...  .... Join us and help build the platform engineers turn to to ship AI products. THE ROLE Baseten...  ..., reliability, and ease of use. As a Software Engineer on the Inference Stack team,... 
    Suggested
    Flexible hours

    The Consensus

    San Francisco, CA
    2 days ago
  • About the Team We’re hiring a Developer Productivity engineer to support the company’s Inference Runtime teams. These teams own the systems responsible for serving models reliably, efficiently, and safely across Codex, ChatGPT, API, and internal research workloads. We’... 
    Suggested

    United States Digital Space LLC

    San Francisco, CA
    2 days ago
  • $320k

    About the Role The Cloud Inference team scales and optimizes Claude to serve the massive audiences...  ..., and day‑to‑day operations. Our engineers are extremely high leverage: we...  ...Be a Good Fit If You Have significant software engineering experience, with a strong background... 
    Suggested
    Visa sponsorship

    United States Digital Space LLC

    San Francisco, CA
    6 days ago
  • $150k - $230k

     ...Model Performance team. The role involves designing and operating Model APIs to enhance AI model performance focusing on advanced inference capabilities. The ideal candidate should have over 3 years of experience in distributed systems or APIs and strong communication skills... 
    Suggested

    Dormont Manufacturing Co

    San Francisco, CA
    2 days ago
  • $300k

     ...growing group of committed researchers, engineers, policy experts, and business leaders working...  ...AI systems. About the role Our Inference team is responsible for building and maintaining...  ...customers. Qualifications Significant software engineering experience, particularly... 
    Work at office
    Worldwide
    Visa sponsorship

    United States Digital Space LLC

    San Francisco, CA
    2 days ago
  • Software Engineer (AI Infrastructure / Training / Inference) About the Role We are hiring Software Engineers focused on AI Infrastructure to build the systems that enable frontier multimodal AI to operate reliably at production scale. This role exists because modern generative... 

    SpreeAI

    San Francisco, CA
    7 days ago
  • $320k

     ...growing group of committed researchers, engineers, policy experts, and business leaders working...  ...About the Role Our mandate is to make inference deployment boring and unattended. the...  ...continuous and unattended. As a Software Engineer on the Launch Engineering team,... 
    Visa sponsorship
    Shift work

    United States Digital Space LLC

    San Francisco, CA
    2 days ago
  •  ...San Francisco is looking for a Developer Platform Engineer to build and maintain their API platform for inference. This role involves defining user-facing APIs and...  ...providers. Ideal candidates have 5+ years of software engineering experience, are collaborative, and possess... 

    TypeSafe AI

    San Francisco, CA
    3 days ago
  •  ...At Inductive Bio, our goal is to build software that can dramatically improve how molecules...  .... We are seeking a full-stack software engineer to join our talented, ambitious, and...  ...infrastructure for model management and low-latency inference, including security features,... 

    Inductive Bio, Inc.

    San Francisco, CA
    1 day ago
  •  ...enable enterprises to implement AI workloads effectively. The role involves designing large-scale deployment architectures, solving AI inference challenges, and collaborating closely with customers' DevOps teams. Ideal candidates will have 3+ years in cloud infrastructure or... 
    Flexible hours

    FriendliAI

    San Francisco, CA
    2 days ago
  • $320k

    United States Digital Space LLC is seeking a backend engineer for the Cloud Inference team. This role involves designing and building infrastructure...  ...and cost. The ideal candidate will have significant software engineering experience with a major cloud platform. We offer... 

    United States Digital Space LLC

    San Francisco, CA
    4 days ago
  • Qualifications CUDA + GPU inference optimization vLLM, SGLang, or TensorRT-LLM experience KV caching, paged attention, batching, token streaming, etc. Distributed compute (with GPUs is a super plus) No degree required Company Luminal (YC S25) builds an AI compiler and serving... 

    SupportFinity™

    San Francisco, CA
    12 days ago
  • $405k

     ...growing group of committed researchers, engineers, policy experts, and business leaders working...  ...About the role We are seeking a Staff Software Engineer to build and operate the safety...  ...Safeguards organization and the Cloud Inference team: taking classifiers, detection... 
    Visa sponsorship

    United States Digital Space LLC

    San Francisco, CA
    6 days ago
  •  ...BASETEN Baseten powers mission‑critical inference for the world's most dynamic AI companies...  .... Join us and help build the platform engineers turn to to ship AI products. THE ROLE As...  ...scale and who enjoy working across product, software development, performance engineering,... 
    Work experience placement
    Flexible hours

    Baseten

    San Francisco, CA
    12 days ago
  • $167.2k - $209k

    A leading cloud service provider is seeking a Senior Engineer 2 for their AI Inference Data Plane team. This remote role focuses on designing and developing high-scale, resilient data plane services that enhance AI-driven applications. The ideal candidate will have strong... 
    Remote job

    DigitalOcean

    San Francisco, CA
    21 days ago
  • $180k - $220k

     ...and get work done in the AI era. We’re looking for a backend engineer to join our small team and help lead the effort to prepare the...  ...RESTful and GraphQL APIs, with emphasis on integrating AI/ML inference endpoints and ensuring predictable SLAs ~ Familiarity with... 
    Work at office
    Local area
    Immediate start
    Flexible hours

    Glu Mobile Inc.

    San Francisco, CA
    5 days ago
  • United States Digital Space LLC is looking for a Software Engineer to join the Launch Engineering team in San Francisco. You’ll design and...  ...build deployment infrastructure for continuous and unattended inference deployment. The ideal candidate will have at least 5 years... 

    United States Digital Space LLC

    San Francisco, CA
    5 days ago
  • $405k

    About the role Anthropic's Inference organization serves Claude to millions of users and enterprise...  ...we add. We're looking for a Staff Engineer to be a technical lead for Inference...  ...agnostic across all of them Have significant software engineering experience, with a strong... 
    Work at office
    Visa sponsorship
    Flexible hours

    jobr.pro

    San Francisco, CA
    6 days ago
  • $320k

     ...growing group of committed researchers, engineers, policy experts, and business leaders working...  ...AI systems. About the role Our Inference team is responsible for building and maintaining...  ...Minimum qualifications Significant software engineering experience, particularly... 
    Worldwide
    Visa sponsorship

    United States Digital Space LLC

    San Francisco, CA
    6 days ago
  •  ...company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates should... 

    WAFER INC

    San Francisco, CA
    5 days ago
  • $200k - $250k

     ...procure, manage, and forecast hyperscaler, inference, and GPU infrastructure. With the explosive...  ...Our customers are the strategic finance and engineering leaders at AI labs, inference providers, neoclouds, and AI-native software companies who are making seven, eight, nine... 
    Contract work

    Duckbill

    San Francisco, CA
    3 days ago
  • ABOUT BASETEN Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence,...  ...Partners, and Spark Capital. Join us and help build the platform engineers turn to to ship AI products. THE ROLE As an early member of... 
    Flexible hours

    The Consensus

    San Francisco, CA
    2 days ago
  • $220k

    Perplexity is looking for an engineer to join their team in San Francisco. You will work on building and operating the inference engine, supporting new models, migrating GPU kernels...  ...candidate has 3+ years of experience in software engineering with a focus on ML inference... 

    Perplexity

    San Francisco, CA
    6 days ago
  • A leading cloud infrastructure company is seeking a Senior Engineer 2 to join their AI Inference Optimization team. The role involves leading the technical strategy for performance architecture and addressing complex performance issues ensuring industry-leading service.... 
    Remote job

    DigitalOcean

    San Francisco, CA
    2 days ago
  •  ...Montreal Employment Type Full time Location Type Hybrid Department Inference Model Serving Who are we? Our mission is to scale intelligence...  ...’s best for our customers. Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft.... 
    Full time
    Work experience placement
    Work at office
    Remote work
    Flexible hours

    Jaide Health

    San Francisco, CA
    5 days ago
  •  ...embeddings, and fine-tuning. We also operate inference infrastructure at scale. There's a lot...  ...us than unfettered growth. The Fraud Engineering team works within our Applied...  ...on our platform. We are looking for a software engineer with anti fraud & abuse experience... 
    Immediate start

    OpenAI

    San Francisco, CA
    13 hours ago
  • $180k - $230k

     ...possible, but practical: a unified platform where high-performance inference, orchestration, and observability come together to unlock new...  ...ambitious teams build on. About this role You are a versatile engineer who thrives on building and deploying seamless user... 
    Currently hiring
    Relocation package

    Dormont Manufacturing Company

    San Francisco, CA
    13 hours ago
  • Fluency Digital, Inc. is seeking a data engineer to join their team in San Francisco, California. This role focuses on building the data and inference infrastructure for B2B marketing, ensuring hyper-precise targeting through data. Candidates should have 6 to 12 years... 

    Fluency Digital, Inc.

    San Francisco, CA
    2 days ago
  • $140k - $170k

     ...CaseMark Software Engineer San Francisco, CA · Full time Ship open-source legal AI tools—full-stack engineer working with Next.js, TypeScript...  ...Compute: Modal, OpenAI, Anthropic, OpenRouter, and other AI inference providers. Infrastructure: AWS (S3, compute), Vercel.... 
    Full time
    Immediate start
    Shift work

    Alumni Ventures

    San Francisco, CA
    13 hours ago
