Edge Transformer Inference Tech Lead

OpenAI

A leading AI research firm in San Francisco is seeking a Technical Lead to join its Future of Computing Research team. The role involves evaluating silicon platforms and optimizing model architectures, and follows a hybrid work arrangement. Ideal candidates have expertise in evaluating workloads on accelerators, understanding transformer models, and leading teams focused on performance-critical software. The position offers relocation assistance and is centered on deploying cutting-edge AI technology responsibly and effectively.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the Edge Transformer Inference Tech Lead in San Francisco, CA vacancy
  •  ...About the Role As a Technical Lead on the Future of Computing...  ...) for on-device and edge deployment of OpenAI models....  ...ensure efficient execution of transformer workloads. Build and lead...  ...for implementing the low-level inference stack, including kernel development... 
    Transformer
    Work at office
    Relocation package

    OpenAI

    San Francisco, CA
    4 days ago
  •  ...About the Job We are seeking a highly technical Inference Engine Engineer to optimize the performance and efficiency of our core...  ...Design and optimize custom GPU kernels for AI (e.g., transformer and diffusion) workloads Contribute to the development of FriendliAI... 
    Transformer
    Worldwide
    Flexible hours

    FriendliAI Corp

    San Francisco, CA
    1 day ago
  •  ...team to build and ship cutting edge models and experiences. We're funded by leading investors at Index Ventures and...  ...the Role We're hiring an Inference Engineer to advance our mission...  ...cutting edge foundation models using Transformers, SSMs and hybrid models.... 
    Transformer
    Work at office
    Visa sponsorship
    Flexible hours

    Cartesia, Inc.

    San Francisco, CA
    4 days ago
  •  ...San Francisco We're looking for an ML Inference Engineer with deep expertise in high-performance...  ..., and shaping Reactor’s competitive edge in ultra‑low‑latency, high‑throughput...  ...hardware (NVIDIA) Strong understanding of transformer architectures and modern ML optimization... 
    Transformer
    Visa sponsorship
    Relocation package

    Reactor

    San Francisco, CA
    4 days ago
  •  ...Staff Technical Lead for Inference & ML Performance San Francisco fal is the generative media ecosystem powering the next generation of...  ...work directly impacts our ability to rapidly deliver cutting-edge creative solutions to users, from individual creators to global... 
    Suggested

    Fal

    San Francisco, CA
    8 hours ago
  • $220k

    We build and run the inference engine behind every Perplexity query and deploy dozens of model architectures at scale with tight latency...  ...Of Real Work The Team Does New models support. Support transformer-based retrieval, text-generation, and multimodal models in our... 
    Transformer

    Perplexity

    San Francisco, CA
    4 days ago
  • $250k

    Edge AI is a production requirement across automotive, robotics...  ...models are doing in the field. Inference latency, memory pressure,...  ...the resources needed to build transformational companies. We've launched 17...  ...Ability to attract, hire and lead world‑class teams. Demonstrated... 

    Forum Ventures

    San Francisco, CA
    1 day ago
  •  ...'re looking for a Founding Engineer, ML Inference with deep expertise in high-performance...  ...performance, and shaping the competitive edge in ultra-low-latency, high-throughput environments...  ...as needed Strong understanding of transformer architectures and modern ML model... 
    Transformer
    Relocation
    Visa sponsorship
    Relocation package

    Reactor

    San Francisco, CA
    5 days ago
  • $252k - $315k

     ...enable our next generation LLM training, inference and data curation. If you are excited...  ...and tools such as CUDA, Pytorch, transformers, flash attention, etc. Strong written...  ...stack technologies that power the world's leading models, and help enterprises and governments... 
    Transformer
    Full time

    Scale AI, Inc.

    San Francisco, CA
    4 days ago
  •  ...Tech Lead, Data & Inference Engineer Georgia, Georgia, United States About the Job Tech Lead, Data & Inference Engineer Our client...  ...They have raised twelve million dollars in funding and are transforming how business to business marketers reach their ideal customers... 
    Full time

    Catalyst Labs, LLC

    San Francisco, CA
    8 hours ago
  •  ...foundational data infrastructure for an edge-first world, a world where intelligence...  ...intelligence. Why This Role Matters As Lead Edge AI Engineer, you will own Source's...  ..., from federated learning and on-device inference to adaptive compute pipelines running on... 
    Local area

    Source, Inc.

    San Francisco, CA
    1 day ago
  • $165k

     ...intelligence, join us in building what's next. About the Role Inference is now the defining cost and latency bottleneck for frontier...  ..., and cost‑per‑token across diverse model families (dense transformers, mixture-of-experts, multi-modal) and customer workload patterns... 
    Transformer
    Local area

    Fluidstack

    San Francisco, CA
    5 days ago
  •  ...custom hardware systems to accelerate AI inference. These inference systems offer...  ...combination of FPGAs and x86 CPUs to accelerate transformer-based models. The software stack is written...  ...models. Why Join Us? Work on a cutting-edge ML inference platform that redefines... 
    Transformer

    GrabJobs

    San Francisco, CA
    4 days ago
  •  ...exceptional people to help us get there. The Opportunity Our Edge Inference team compiles Liquid Foundation Models into optimized machine...  ...device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures... 

    Liquid AI

    San Francisco, CA
    1 day ago
  •  ...to carry out our mission from industry‑leading investors. We are obsessed with rapid...  ...Deep debug failure modes in transformer and diffusion policy field deployments...  ...Optimize policies for real‑time (~10hz) inference on edge hardware What you bring Experience deploying... 
    Transformer
    Temporary work

    Kovari

    San Francisco, CA
    2 days ago
  •  ...including GPU orchestration, large-scale inference systems, performance optimization, and developer...  ...and brand at the forefront of fashion-tech innovation. Your design work will...  ...love of design, luxury fashion, and cutting-edge tech, you'll have the freedom to do it here... 
    Internship
    Immediate start

    SpreeAI

    San Francisco, CA
    4 days ago
  • $425k

     ...architecture aims for efficient training, fast inference, and high spatial resolution. The...  ...with just their imagination. This is a lead-by-doing role at the intersection of ML,...  ...experience developing and training large-scale transformer-based models, ideally multimodal or... 
    Transformer

    Strativ Group

    San Francisco, CA
    4 days ago
  • $160k - $230k

     ...LLM Inference Frameworks and Optimization Engineer San Francisco, Singapore, Amsterdam...  ...Techniques: Deep understanding of Transformer architectures and LLM/VLM/Diffusion model...  ...algorithms, and models. We have contributed to leading open-source research, models, and... 
    Transformer
    Full time

    Together AI

    San Francisco, CA
    1 day ago
  • $380k

     ...are reliable, user-friendly, and aligned with our mission of broad societal benefit. About the Role We're looking for a GPU Inference Engineer to contribute to improvements in model serving efficiency for Sora. This is a high-impact role where you'll drive... 
    Work at office
    Relocation package

    OpenAI

    San Francisco, CA
    1 day ago
  • $175k - $225k

     ...with participation from other leading venture capital firms. The...  ...We're looking for an AI Inference Engineer who lives at the boundary...  ...sophisticated models and transforming them into lightning-fast, production...  ...-ready engines running on edge devices in homes across the... 
    Local area
    Remote work

    Sauron

    San Francisco, CA
    8 hours ago
  •  ...and low-level systems engineering. Beyond inference, you'll profile and optimize the entire...  ...with ML model architectures (transformers, CNNs) and the ability to reason about computational...  ...autonomous vehicles, robotics, or IoT/edge devices Deep knowledge of CUDA, TensorRT... 
    Transformer
    Local area

    Humble Robotics

    San Francisco, CA
    5 days ago
  •  ...cloud deployment, ensuring our cutting-edge computer vision and multi-modal AI systems...  ...optimize model serving for low-latency inference at scale. You'll work closely with our...  ...learning models such as auto-regressive transformers and familiarity with inference optimization... 
    Transformer
    Work at office
    3 days per week

    Claryo

    San Francisco, CA
    5 days ago
  •  ...machine learning systems that power real-time perception and inference across our edge-cloud platform. This role owns the training, deployment,...  ...architectures and their deployment tradeoffs (YOLO, transformers, CNNs, real-time detection/tracking). Hands-on experience... 
    Transformer

    Specter

    San Francisco, CA
    2 days ago
  • $342k

     ...infrastructure execution, translating cutting‑edge compute roadmaps into scalable,...  ...We are seeking a CPU & Storage Technical Lead to define and drive the server compute and...  ...storage systems are optimized for training, inference, and supporting services. You will work... 
    Local area

    OpenAI

    San Francisco, CA
    3 days ago
  • Quadric in San Francisco is looking for an experienced AI Kernel Engineer to develop and optimize AI kernels for their innovative neural processing platform. This role involves enhancing performance for various hardware configurations and providing technical support to ...

    Quadric

    San Francisco, CA
    4 days ago
  •  ...Montreal Employment Type Full time Location Type Hybrid Department Inference Model Serving Who are we? Our mission is to scale intelligence...  ...and work environment Work closely with a team on the cutting edge of AI research Weekly lunch stipend, in-office lunches & snacks... 
    Full time
    Work experience placement
    Work at office
    Remote work
    Flexible hours

    Jaide Health

    San Francisco, CA
    2 days ago
  • $248.8k - $311k

     ...automation. Role Overview As the Technical Lead Manager (TLM) for the Physical AI team of...  ...you will bridge the gap between cutting‑edge Machine Learning research and physical...  ...proficiency in PyTorch, with deep knowledge of Transformer architectures, Attention mechanisms,... 
    Transformer
    Full time

    Scale AI, Inc.

    San Francisco, CA
    1 day ago
  •  ...(RAG) pipelines to fine-tuning compact transformer models and classic ML solutions where they...  ...with DeepSpeed or vLLM for efficient inference serving. Familiarity with LangChain or LlamaIndex...  .... Interest in decentralised or edge deployments (e.g., WASM at the edge) for... 
    Transformer

    Synagi

    San Francisco, CA
    5 days ago
  • A leading AI technology company in San Francisco is seeking a Tech Lead Manager focused on machine learning performance. In this role, you will manage and mentor a team while driving optimization projects. Ideal candidates have over 5 years of software engineering experience... 

    Baseten

    San Francisco, CA
    3 days ago
  • A tech company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates... 

    Wafer

    San Francisco, CA
    2 days ago
