Inference Software Engineer: High-Performance Transformers

Etched.ai, Inc.

Etched.ai, Inc. is looking for candidates to support porting AI models to its innovative architecture in San Jose, California. Responsibilities include enhancing runtime environments and optimizing performance layers. Ideal applicants are proficient in C++ or Rust and have experience with distributed software systems. Benefits include a housing subsidy, relocation support, and comprehensive health packages. Join a fully in-person team that values engineering and research collaboration.

Vacancy posted 1 day ago

Similar jobs that could be interesting for you
Based on the Inference Software Engineer: High-Performance Transformers in San Jose, CA vacancy

Inference Software Engineer
$2,000 per month
...the world’s first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically... ...staffed by leading engineers, Etched is redefining... ...distributed software systems like Linux... ...TPUs), Compilers, or high-speed interconnects...
Transformer
Performance
Work at office
Relocation package
Etched.ai, Inc.
San Jose, CA
1 day ago
High-Performance AI Inference Engineer (TensorRT)
$124k - $195.5k
...NVIDIA Gruppe is looking for a passionate Software Engineer to join its TensorRT team in Santa Clara, California. This role involves designing and developing high-performance AI inference solutions while contributing to performance optimizations and collaborating with...
Performance
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI Inference Engineer - High-Performance LLM Serving
$152k - $241.5k
...NVIDIA Gruppe is seeking a Senior Software Engineer – AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building...
Performance
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior Software Engineer - TensorRT Edge-LLM
...large language model inference? Join NVIDIA’s TensorRT... .... We build the software stack that enables Large... ...optimizations tailored for transformer‑based models running... ...robotics to deliver high‑performance, production‑ready... ...Electrical/Computer Engineering, or a closely related...
Transformer
Performance
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Kernel Driver Software Engineer
$2,000 per month
...the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically... ...staffed by leading engineers, Etched is redefining... ...drivers ensuring high reliability, informative... ...Collaborate with software and hardware teams...
Transformer
Performance
Work at office
Relocation package
ETCHED LLC
San Jose, CA
6 days ago
Accelerator Software Engineer
$2,000 per month
...the world’s first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically... ...staffed by leading engineers, Etched is redefining... ...the accelerator software that makes it usable... ...memory management, high-throughput low-latency...
Transformer
Performance
Work at office
Relocation package
Etched.ai, Inc.
San Jose, CA
2 days ago
Chip Simulation Software Engineer
$2,000 per month
...the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically... ...staffed by leading engineers, Etched is redefining... ...We are seeking highly motivated and detail-oriented software engineers to join our...
Transformer
Performance
Work at office
Relocation package
ETCHED LLC
San Jose, CA
5 days ago
Senior Software Engineer, AI Networking
$152k - $241.5k
...NVIDIA seeks a senior software engineer to join the AI... ...within LLM training and inference stacks. A strong passion... ...ML) for comprehensive performance analysis and... ...includes a focus on high-performance networking... ...ability to apply GNNs/transformers-based optimization to...
Transformer
Performance
NVIDIA
Santa Clara, CA
3 days ago
Distinguished Software Engineer - IFX
$269.1k - $307.2k
...Distinguished Software Engineer - IFX As a Distinguished Engineer... ...system that delivers high‑scale developer and... ...available, high‑performance systems will drive our... ...ML/DL model training, inference, feature pipelines, pre... ...and fine‑tuning of transformer‑based models, and generative...
Transformer
Performance
Local area
Capital One National Association
San Jose, CA
1 day ago
Software Engineer - Performance Profiling
$2,000 per month
...building the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost... ...and staffed by leading engineers, Etched is redefining the infrastructure... .... We are seeking a highly skilled engineer to design...
Transformer
Performance
Work at office
Relocation package
ETCHED LLC
San Jose, CA
3 days ago
Software Development Engineer
$160k - $200k
...leakage. Build a highly scalable pipeline... ...features. Mentor junior engineers on secure backend... ...of high-quality software features while... ...model inference pipelines, fine‑tuning... ...data caching, and performance optimization. Knowledge... ...as Hugging Face Transformers or LangChain for...
Transformer
Performance
Full time
Fortinet
Sunnyvale, CA
5 days ago
Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026
$124k - $195.5k
...looking for a Deep Learning Software Engineer, TensorRT Performance! NVIDIA is seeking an... ...performance of NVIDIA’s inference ecosystem! NVIDIA is rapidly... ...and workloads (e.g. Transformers, Recommenders, ASR, TTS,... ...opportunity employer. As we highly value diversity in our current...
Transformer
Performance
Remote work
NVIDIA
Santa Clara, CA
6 days ago
Distinguished Software Engineer - IFX
$269.1k - $307.2k
...Distinguished Software Engineer - IFX As a Distinguished... ...system delivering the high-scale developer and runtime... ...available and high performance systems to develop... ...training, model inference and feature generation... ...training and fine tuning Transformer-based models as well...
Transformer
Performance
Full time
Part time
Local area
Capital One
San Jose, CA
5 days ago
Principal AI Inference Systems Engineer
...Senior Staff AI Infra Engineer who is passionate about improving the performance of key applications and... ...of hardware and software to optimize performance... ...accelerate LLM training and inference on AMD GPUs, improving... ...understanding of transformer-based architectures and...
Transformer
Performance
Advanced Micro Devices, Inc.
Santa Clara, CA
6 days ago
Senior Staff Field Applications Engineer - High Voltage
...Senior Staff Field Applications Engineer - High Voltage Job Description... ...V to 48V DCX (LLC-based DC transformers) ~800V to 12V direct... ...bring-up and validation ~ Performance optimization and characterization... ...explore our hardware and software capabilities and try new...
Transformer
Performance
Temporary work
Local area
Remote work
Flexible hours
Shift work
Renesas
San Jose, CA
6 days ago
Senior Deep Learning Software Engineer, Inference
$184k - $287.5k
...NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build,... ...team is responsible for developing and maintaining high-performance deep learning frameworks, including SGLang and...
Performance
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior Software Engineer - AI Inference
$152k - $241.5k
...application is built. We are seeking a Senior Software Engineer – AI Inference to advance open‑source LLM serving... ...the underlying stack that enables high‑throughput, low‑latency inference at... ...an engineer who enjoys digging into performance bottlenecks, designing pragmatic...
Performance
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior Software Engineer, Deep Learning Inference - TensorRT
$152k - $241.5k
...We are now looking for a Senior Software Engineer for Deep Learning Inference! Would you like to make a big impact in Deep Learning by helping build... ...performance Develop components of TensorRT, NVIDIA’s SDK for high-performance deep learning inference. Closely follow academic...
Performance
NVIDIA
Santa Clara, CA
1 day ago
Senior Software Development Engineer - SGLang and Inference Stack
...enhancing GPU kernel performance, accelerating deep learning... ...LLM and Multimodal inference at scale across multi-... ...across internal GPU software teams and engage with... .... THE PERSON: Skilled engineer with strong technical... ...efforts, and deliver high‑quality solutions. Strong...
Performance
Advanced Micro Devices, Inc.
Santa Clara, CA
1 day ago
Senior Software Engineer, Machine Learning Inference
$152k - $241.5k
...seeking talented and motivated engineers to join our TensorRT team... ...-leading deep learning inference software for NVIDIA AI accelerators.... ...Knowledge of close-to-metal performance analysis, optimization techniques... ...employer. As we highly value diversity in our current...
Performance
NVIDIA
Santa Clara, CA
4 days ago
Senior Software Development Engineer - LLM Inference Framework
...senior member of the LLM inference framework team, you... ...runtime layer, driving performance, scalability, and... ...intersection of inference engines, distributed systems,... ...PyTorch, TensorFlow) for high-throughput and... ...kernel development Software Engineering ~ Expertise...
Performance
Advanced Micro Devices, Inc.
Santa Clara, CA
3 days ago
Senior Software Engineer II, Inference
$165k - $242k
...Senior Software Engineer II, Inference Sunnyvale, CA / Bellevue, WA CoreWeave is The Essential Cloud... ...CoreWeave combines superior infrastructure performance with deep technical expertise to... ...leveraging custom accelerators for high-efficiency workloads ~ Hands-on...
Performance
Permanent employment
Temporary work
Casual work
Work at office
Remote work
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
4 days ago
Senior Performance Engineer, Inference
...deliver industry‑leading training and inference speeds and empowers machine... ...to deploy 750 megawatts of scale, transforming key workloads with ultra high‑speed inference. About The Role We are hiring a Senior Performance Engineer to join our Product team. You are an...
Transformer
Performance
Contract work
Shift work
Cerebras
Sunnyvale, CA
3 days ago
Software Engineer Graduate (Inference Infrastructure) - 2026 Start (PhD)
$156k - $316.8k
...Responsibilities About the Team The Inference Infrastructure team is the... ...infrastructure that is highly performant, massively scalable, cost-... ..., and are looking for engineers passionate about cloud-native... ...completed a PhD degree in Software Development, Computer...
Performance
Temporary work
Local area
ByteDance
San Jose, CA
3 days ago
Software Engineer, LLM Compilation
$2,000 per month
...(Sohu) only supports transformers, but has an order of... ...of-thought reasoning. Software, LLM Compilation Software... ...issues that hurt performance. You will work with the... ...we will build a few highly-optimized fused... ...3+ years of software engineering experience Have experience...
Transformer
Performance
Work at office
Relocation package
OpenReq
Cupertino, CA
1 day ago
Software Engineer, Simulation
$120k - $150k
...-based virtual driver software for factory-built autonomous... ...building a simulation engine that simulates the... ...its reliability and performance to unblock all technical... ...and experience in transformer and diffusion architectures... ..., learn and grow in a highly future‑oriented,...
Transformer
Performance
Immediate start
Medium
Santa Clara, CA
1 day ago
Senior Software Engineer, AI Inference Systems
$184k - $287.5k
...Position Overview We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-... ...and implement high-performance inference stacks, optimize... ...intelligence (AI) will fundamentally transform how people live and work....
Performance
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Software Engineer, Infrastructure Performance
$2,000 per month
...first product (Sohu) only supports transformers, but has an order of magnitude... ...deep chain-of-thought reasoning. Software Engineer, Infrastructure Performance Designing and writing software for... ...You may be a good fit if you Are highly technical Possess expert-level scripting...
Transformer
Performance
Work at office
Relocation package
OpenReq
Cupertino, CA
1 day ago
Principal AI Compiler Engineer
...AI Compiler Engineer Locations available... ...focus on hardware-software co-design, you'll... ...to translate high-level AI models into... ...Pioneer new graph transformations, lowering,... ...Diagnose and crush performance bottlenecks with... ...understanding of AI inference workloads (CNNs,...
Transformer
Performance
NXP Semiconductors
San Jose, CA
2 days ago
ML Engineer - Inference & Model Deployment
...00x better job search engine: fast, comprehensive,... ...deploying models, optimizing inference latency and throughput... ...the details of model performance, GPU utilization,... ...multi-GPU inference, or high-throughput inference... ...attention mechanisms, transformer optimization, or modern...
Transformer
Performance
Full time
Relocation package
HiringCafe
Cupertino, CA
4 days ago