Edge Transformer Inference Tech Lead
OpenAI
A leading AI research firm in San Francisco is seeking a Technical Lead to join its Future of Computing Research team. This role involves evaluating silicon platforms and optimizing model architectures while working in a hybrid model. Ideal candidates have expertise in evaluating workloads on accelerators, understanding transformer models, and leading teams focused on performance-critical software. The position offers relocation assistance and is centered on deploying cutting-edge AI technology responsibly and effectively.
- ...About the Role As a Technical Lead on the Future of Computing... ...) for on-device and edge deployment of OpenAI models.... ...ensure efficient execution of transformer workloads. Build and lead... ...for implementing the low-level inference stack, including kernel development... [Transformer, Work at office, Relocation package]
- ...About the Job We are seeking a highly technical Inference Engine Engineer to optimize the performance and efficiency of our core... ...Design and optimize custom GPU kernels for AI (e.g., transformer and diffusion) workloads Contribute to the development of FriendliAI... [Transformer, Worldwide, Flexible hours]
- ...team to build and ship cutting edge models and experiences. We're funded by leading investors at Index Ventures and... ...the Role We're hiring an Inference Engineer to advance our mission... ...cutting edge foundation models using Transformers, SSMs and hybrid models.... [Transformer, Work at office, Visa sponsorship, Flexible hours]
- ...San Francisco We're looking for an ML Inference Engineer with deep expertise in high-performance... ..., and shaping Reactor’s competitive edge in ultra‑low‑latency, high‑throughput... ...hardware (NVIDIA) Strong understanding of transformer architectures and modern ML optimization... [Transformer, Visa sponsorship, Relocation package]
- ...Staff Technical Lead for Inference & ML Performance San Francisco fal is the generative media ecosystem powering the next generation of... ...work directly impacts our ability to rapidly deliver cutting-edge creative solutions to users, from individual creators to global... [Suggested]
$220k
We build and run the inference engine behind every Perplexity query and deploy dozens of model architectures at scale with tight latency... ...Of Real Work The Team Does New models support. Support transformer-based retrieval, text-generation, and multimodal models in our... [Transformer]
$250k
Edge AI is a production requirement across automotive, robotics... ...models are doing in the field. Inference latency, memory pressure,... ...the resources needed to build transformational companies. We've launched 17... ...Ability to attract, hire and lead world‑class teams. Demonstrated...
- ...'re looking for a Founding Engineer, ML Inference with deep expertise in high-performance... ...performance, and shaping the competitive edge in ultra-low-latency, high-throughput environments... ...as needed Strong understanding of transformer architectures and modern ML model... [Transformer, Relocation, Visa sponsorship, Relocation package]
$252k - $315k
...enable our next generation LLM training, inference and data curation. If you are excited... ...and tools such as CUDA, Pytorch, transformers, flash attention, etc. Strong written... ...stack technologies that power the world's leading models, and help enterprises and governments... [Transformer, Full time]
- ...Tech Lead, Data & Inference Engineer Georgia, Georgia, United States About the Job Tech Lead, Data & Inference Engineer Our client... ...They have raised twelve million dollars in funding and are transforming how business to business marketers reach their ideal customers... [Full time]
- ...foundational data infrastructure for an edge-first world — a world where intelligence... ...intelligence. Why This Role Matters As Lead Edge AI Engineer , you will own Source's... ...— from federated learning and on-device inference to adaptive compute pipelines running on... [Local area]
$165k
...intelligence, join us in building what's next. About the Role Inference is now the defining cost and latency bottleneck for frontier... ..., and cost‑per‑token across diverse model families (dense transformers, mixture-of-experts, multi-modal) and customer workload patterns... [Transformer, Local area]
- ...custom hardware systems to accelerate AI inference. These inference systems offer... ...combination of FPGAs and x86 CPUs to accelerate transformer-based models. The software stack is written... ...models. Why Join Us? Work on a cutting-edge ML inference platform that redefines... [Transformer]
- ...exceptional people to help us get there. The Opportunity Our Edge Inference team compiles Liquid Foundation Models into optimized machine... ...device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures...
- ...to carry out our mission from industry‑leading investors. We are obsessed with rapid... ...Deep debug failure modes in transformer and diffusion policy field deployments... ...Optimize policies for real‑time (~10hz) inference on edge hardware What you bring Experience deploying... [Transformer, Temporary work]
- ...including GPU orchestration, large-scale inference systems, performance optimization, and developer... ...and brand at the forefront of fashion-tech innovation. Your design work will... ...love of design, luxury fashion, and cutting-edge tech, you'll have the freedom to do it here... [Internship, Immediate start]
$425k
...architecture aims for efficient training, fast inference, and high spatial resolution. The... ...with just their imagination. This is a lead-by-doing role at the intersection of ML,... ...experience developing and training large-scale transformer-based models, ideally multimodal or... [Transformer]
$160k - $230k
...LLM Inference Frameworks and Optimization Engineer San Francisco, Singapore, Amsterdam... ...Techniques: Deep understanding of Transformer architectures and LLM/VLM/Diffusion model... ...algorithms, and models. We have contributed to leading open-source research, models, and... [Transformer, Full time]
$380k
...are reliable, user-friendly, and aligned with our mission of broad societal benefit. About the Role We're looking for a GPU Inference Engineer to contribute to improvements in model serving efficiency for Sora. This is a high-impact role where you'll drive... [Work at office, Relocation package]
$175k - $225k
...with participation from other leading venture capital firms. The... ...We're looking for an AI Inference Engineer who lives at the boundary... ...sophisticated models and transforming them into lightning-fast, production... ...-ready engines running on edge devices in homes across the... [Local area, Remote work]
- ...and low-level systems engineering. Beyond inference, you'll profile and optimize the entire... ...with ML model architectures (transformers, CNNs) and the ability to reason about computational... ...autonomous vehicles, robotics, or IoT/edge devices Deep knowledge of CUDA, TensorRT... [Transformer, Local area]
- ...cloud deployment — ensuring our cutting-edge computer vision and multi-modal AI systems... ...optimize model serving for low-latency inference at scale. You'll work closely with our... ...learning models such as auto-regressive transformers and familiarity with inference optimization... [Transformer, Work at office, 3 days per week]
- ...machine learning systems that power real-time perception and inference across our edge-cloud platform. This role owns the training, deployment,... ...architectures and their deployment tradeoffs (YOLO, transformers, CNNs, real-time detection/tracking). Hands-on experience... [Transformer]
$342k
...infrastructure execution-translating cutting‑edge compute roadmaps into scalable,... ...We are seeking a CPU & Storage Technical Lead to define and drive the server compute and... ...storage systems are optimized for training, inference, and supporting services. You will work... [Local area]
- Quadric in San Francisco is looking for an experienced AI Kernel Engineer to develop and optimize AI kernels for their innovative neural processing platform. This role involves enhancing performance for various hardware configurations and providing technical support to ...
- ...Montreal Employment Type Full time Location Type Hybrid Department Inference Model Serving Who are we? Our mission is to scale intelligence... ...and work environment Work closely with a team on the cutting edge of AI research Weekly lunch stipend, in-office lunches & snacks... [Full time, Work experience placement, Work at office, Remote work, Flexible hours]
$248.8k - $311k
...automation. Role Overview As the Technical Lead Manager (TLM) for the Physical AI team of... ...you will bridge the gap between cutting‑edge Machine Learning research and physical... ...proficiency in PyTorch, with deep knowledge of Transformer architectures, Attention mechanisms,... [Transformer, Full time]
- ...(RAG) pipelines to fine-tuning compact transformer models and classic ML solutions where they... ...with DeepSpeed or vLLM for efficient inference serving. Familiarity with LangChain or LlamaIndex... .... Interest in decentralised or edge deployments (e.g., WASM at the edge) for... [Transformer]
- A leading AI technology company in San Francisco is seeking a Tech Lead Manager focused on machine learning performance. In this role, you will manage and mentor a team while driving optimization projects. Ideal candidates have over 5 years of software engineering experience...
- A tech company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates...

