Inference Software Engineer
$2,000 per month
ETCHED LLC
About Etched
Etched is building the world's first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep and parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest-growing industry in history.
Key responsibilities
- Support porting state-of-the-art models to our architecture. Help build programming abstractions and testing capabilities to rapidly iterate on model porting.
- Build, enhance, and scale Sohu's runtime, including multi-node inference, intra-node execution, state management, and robust error handling.
- Optimize routing and communication layers using Sohu's collectives.
- Utilize performance profiling and debugging tools to identify bottlenecks and correctness issues.
Qualifications
- Proficiency in C++ or Rust.
- Understanding of performance-sensitive or complex distributed software systems, such as Linux internals, accelerator architectures (e.g. GPUs, TPUs), compilers, or high-speed interconnects (e.g. NVLink, InfiniBand).
- Familiarity with PyTorch or JAX.
- Ported applications to non-standard accelerator hardware or hardware platforms.
- Developed low-latency, high-performance applications using both kernel-level and user-space networking stacks.
- Deep understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns.
- Solid grasp of Transformer architectures, particularly Mixture-of-Experts (MoE).
- Built applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths.
Benefits
- Medical, dental, and vision packages with generous premium coverage
- $500 per month credit for waiving medical benefits
- Housing subsidy of $2k per month for those living within walking distance of the office
- Relocation support for those moving to San Jose (Santana Row)
- Various wellness benefits covering fitness, mental health, and more
- Daily lunch + dinner in our office
Vacancy posted 4 days ago


