Inference Software Engineer
$2,000 per month
ETCHED LLC
About Etched
Etched is building the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest growing industry in history.
Key responsibilities
- Support porting state-of-the-art models to our architecture. Help build programming abstractions and testing capabilities to rapidly iterate on model porting.
- Build, enhance, and scale Sohu's runtime, including multi-node inference, intra-node execution, state management, and robust error handling.
- Optimize routing and communication layers using Sohu's collectives.
- Utilize performance profiling and debugging tools to identify bottlenecks and correctness issues.
Requirements
- Proficiency in C++ or Rust.
- Understanding of performance-sensitive or complex distributed software systems, such as Linux internals, accelerator architectures (e.g. GPUs, TPUs), compilers, or high-speed interconnects (e.g. NVLink, InfiniBand).
- Familiarity with PyTorch or JAX.
Preferred experience
- Ported applications to non-standard accelerator hardware or hardware platforms.
- Developed low-latency, high-performance applications using both kernel-level and user-space networking stacks.
- Deep understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns.
- Solid grasp of Transformer architectures, particularly Mixture-of-Experts (MoE).
- Built applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths.
Benefits
- Medical, dental, and vision packages with generous premium coverage
- $500 per month credit for waiving medical benefits
- Housing subsidy of $2k per month for those living within walking distance of the office
- Relocation support for those moving to San Jose (Santana Row)
- Various wellness benefits covering fitness, mental health, and more
- Daily lunch + dinner in our office
- Unlimited compute budget subject to ROI justification
Vacancy posted 5 days ago


