Staff Software Engineer, Inference Infrastructure

Jaide Health

Location: San Francisco, Toronto, London, New York, Montreal
Employment Type: Full time
Location Type: Hybrid
Department: Inference Model Serving

Who are we?

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

Why this role?

Are you energized by building high-performance, scalable and reliable machine learning systems? Do you want to help define and build the next generation of AI platforms powering advanced NLP applications? We are looking for Members of Technical Staff to join the Model Serving team at Cohere. The team is responsible for developing, deploying, and operating the AI platform delivering Cohere's large language models through easy to use API endpoints. In this role, you will work closely with many teams to deploy optimized NLP models to production in low latency, high throughput, and high availability environments. You will also get the opportunity to interface with customers and create customized deployments to meet their specific needs.
You may be a good fit if you have:

- 5+ years of engineering experience running production infrastructure at a large scale
- Experience designing large, highly available distributed systems with Kubernetes, and GPU workloads on those clusters
- Experience with Kubernetes dev and production coding and support
- Experience with GCP, Azure, AWS, OCI, multi-cloud on-prem / hybrid serving
- Experience in designing, deploying, supporting, and troubleshooting in complex Linux-based computing environments
- Experience in compute/storage/network resource and cost management
- Excellent collaboration and troubleshooting skills to build mission-critical systems, and ensure smooth operations and efficient teamwork
- The grit and adaptability to solve complex technical challenges that evolve day to day
- Familiarity with computational characteristics of accelerators (GPUs, TPUs, and/or custom accelerators), especially how they influence latency and throughput of inference
- Strong understanding or working experience with distributed systems
- Experience in Golang, C++ or other languages designed for high-performance scalable servers

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply!

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit the Accommodations Request Form, and we will work together to meet your needs.
Full-Time Employees at Cohere enjoy these Perks:

- An open and inclusive culture and work environment
- Work closely with a team on the cutting edge of AI research
- Weekly lunch stipend, in-office lunches & snacks
- Full health and dental benefits, including a separate budget to take care of your mental health
- 100% Parental Leave top-up for up to 6 months
- Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
- Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
- 6 weeks of vacation (30 working days!)

Vacancy posted 3 days ago

