Inference Software Engineer: High-Performance Transformers

Etched.ai, Inc.

Etched.ai, Inc. is looking for candidates to support porting AI models to its innovative architecture in San Jose, California. Responsibilities include enhancing runtime environments and optimizing performance layers. Ideal applicants are proficient in C++ or Rust and have experience with distributed software systems. Benefits include a housing subsidy, relocation support, and comprehensive health packages. Join a fully in-person team that values engineering and research collaboration.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Inference Software Engineer: High-Performance Transformers vacancy in San Jose, CA
  • $2,000 per month

     ...the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically...  ...staffed by leading engineers, Etched is redefining...  ...distributed software systems like Linux...  ...TPUs), Compilers, or high-speed interconnects... 
    Transformer
    Performance
    Work at office
    Relocation package

    ETCHED LLC

    San Jose, CA
    5 days ago
  • $151.8k

     ...AI Inference Engineer We are looking for an AI Inference Engineer with...  ...infrastructure teams, to deliver high-impact projects from the...  ...Optimizing model inference performance by diving deep into the...  ...Demonstrate deep understanding of transformer encoder-decoder frameworks... 
    Transformer
    Performance

    Zoom Video Communications

    San Jose, CA
    4 days ago
  •  ...ML Systems Engineer — Training & Inference Optimization (MBMB) We are building...  ...robot foundation models, high-performance training infrastructure,...  ...boundaries across hardware, software, and model design — where...  ...Improve performance of transformer and diffusion-based architectures... 
    Transformer
    Performance

    Seer

    Sunnyvale, CA
    1 day ago
  •  ...leader to join the AI Software group. As a...  ...industry-leading performance for our top-tier...  ...engagement, and software engineering, ensuring that...  ...and compilers to high-level AI...  ...optimizing distributed inference and training at scale...  ...architectures (Transformer, Attention, KV... 
    Transformer
    Performance

    Advanced Micro Devices, Inc.

    San Jose, CA
    2 days ago
  • $152k - $241.5k

     ...large language model inference? Join NVIDIA's TensorRT...  .... We build the software stack that enables Large...  ...optimizations tailored for transformer-based models running...  ...robotics to deliver high-performance, production-ready...  ...Electrical/Computer Engineering, or a closely related... 
    Transformer
    Performance
    Remote work

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $2,000 per month

     ...the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically...  ...staffed by leading engineers, Etched is redefining...  ...drivers ensuring high reliability, informative...  ...Collaborate with software and hardware teams... 
    Transformer
    Performance
    Work at office
    Relocation package

    ETCHED LLC

    San Jose, CA
    5 days ago
  • $2,000 per month

     ...the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically...  ...staffed by leading engineers, Etched is redefining...  ...We are seeking highly motivated and detail-oriented software engineers to join our... 
    Transformer
    Performance
    Work at office
    Relocation package

    ETCHED LLC

    San Jose, CA
    3 days ago
  • $152k - $241.5k

     ...NVIDIA seeks a senior software engineer to join the AI...  ...within LLM training and inference stacks. A strong passion...  ...ML) for comprehensive performance analysis and...  ...includes a focus on high-performance networking...  ...ability to apply GNNs/transformers-based optimization to... 
    Transformer
    Performance

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $2,000 per month

     ...the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically...  ...staffed by leading engineers, Etched is redefining...  ...Join our team as a Software Engineer -...  .... We are seeking a highly skilled engineer to... 
    Transformer
    Performance
    Work at office
    Relocation package

    ETCHED LLC

    San Jose, CA
    2 days ago
  • $152k - $241.5k

     ...for a Senior Deep Learning Software Engineer, TensorRT Performance! NVIDIA is seeking an...  ...performance of NVIDIA's inference ecosystem! NVIDIA is rapidly...  ...models and workloads (e.g. Transformers, Recommenders, ASR, TTS,...  ...employer. As we highly value diversity in our current... 
    Transformer
    Performance
    Remote work

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $2,000 per month

     ...the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically...  ...staffed by leading engineers, Etched is redefining...  ...infrastructure as software - and we engineer it...  ...scaling our hybrid high-performance compute... 
    Transformer
    Performance
    Work at office
    Relocation package

    ETCHED LLC

    San Jose, CA
    2 days ago
  • $269.1k - $307.2k

     ...Distinguished Software Engineer - IFX As a Distinguished...  ...system delivering the high-scale developer and runtime...  ...available and high performance systems to develop...  ...training, model inference and feature generation...  ...training and fine tuning Transformer-based models as well... 
    Transformer
    Performance
    Full time
    Part time
    Local area

    Capital One

    San Jose, CA
    4 days ago
  • $2,000 per month

     ...the world’s first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically...  ...staffed by leading engineers, Etched is redefining...  ...the accelerator software that makes it usable...  ...memory management, high-throughput low-latency... 
    Transformer
    Performance
    Work at office
    Relocation package

    Etched.ai, Inc.

    San Jose, CA
    4 days ago
  • $269.1k - $307.2k

    Distinguished Software Engineer - IFX As a Distinguished Engineer...  ...system that delivers high‑scale developer and...  ...available, high‑performance systems will drive our...  ...ML/DL model training, inference, feature pipelines, pre...  ...and fine‑tuning of transformer‑based models, and generative... 
    Transformer
    Performance
    Local area

    Capital One National Association

    San Jose, CA
    4 days ago
  • $160k - $200k

     ...leakage. Build a highly scalable pipeline...  .... Mentor junior engineers on secure backend...  ...of high-quality software features while adhering...  ...model inference pipelines, fine‑tuning...  ...data caching, and performance optimization. Knowledge...  ...as Hugging Face Transformers or LangChain for... 
    Transformer
    Performance
    Full time

    Fortinet, Inc.

    Sunnyvale, CA
    4 days ago
  •  ...Senior Staff Field Applications Engineer - High Voltage Job Description...  ...V to 48V DCX (LLC-based DC transformers) ~800V to 12V direct...  ...bring-up and validation ~ Performance optimization and characterization...  ...explore our hardware and software capabilities and try new... 
    Transformer
    Performance
    Temporary work
    Local area
    Remote work
    Flexible hours
    Shift work

    Renesas

    San Jose, CA
    5 days ago
  • $152k - $241.5k

     ...seeking talented and motivated engineers to join our TensorRT team...  ...-leading deep learning inference software for NVIDIA AI accelerators....  ...Knowledge of close-to-metal performance analysis, optimization techniques...  ...employer. As we highly value diversity in our current... 
    Performance

    NVIDIA

    Santa Clara, CA
    3 days ago
  •  ...enhancing GPU kernel performance, accelerating deep learning...  ...LLM and Multimodal inference at scale across multi-...  ...across internal GPU software teams and engage with...  ...PERSON: Skilled engineer with strong technical...  ...efforts, and deliver high-quality solutions.... 
    Performance

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    2 days ago
  • $152k - $241.5k

     ...application is built. We are seeking a Senior Software Engineer – AI Inference to advance open‑source LLM serving...  ...the underlying stack that enables high‑throughput, low‑latency inference at...  ...an engineer who enjoys digging into performance bottlenecks, designing pragmatic... 
    Performance

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $152k - $241.5k

     ...We are now looking for a Senior Software Engineer for Deep Learning Inference! Would you like to make a big impact in Deep...  ...multiple platforms for functionality and performance Develop components of TensorRT, NVIDIA’s SDK for high-performance deep learning inference.... 
    Performance

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $184k - $287.5k

     ...NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build,...  ...team is responsible for developing and maintaining high-performance deep learning frameworks, including SGLang and... 
    Performance
    Remote work

    NVIDIA

    Santa Clara, CA
    6 days ago
  • $193.3k - $261.5k

     ...AWS Neuron is the software stack powering AWS Inferentia and Trainium machine...  ...accelerators, designed to deliver high-performance, low-cost inference at scale. The Neuron Serving team develops...  ...are seeking a Software Development Engineer to lead and architect our next-... 
    Performance
    Internship
    Local area
    Flexible hours

    Amazon

    Cupertino, CA
    5 days ago
  • $165k - $242k

     ...Senior Software Engineer II, Inference Sunnyvale, CA / Bellevue, WA CoreWeave is The Essential Cloud...  ...CoreWeave combines superior infrastructure performance with deep technical expertise to...  ...leveraging custom accelerators for high-efficiency workloads ~ Hands-on... 
    Performance
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours
    Shift work

    CoreWeave

    Sunnyvale, CA
    3 days ago
  • $156k - $316.8k

     ...Responsibilities About the Team The Inference Infrastructure team is the...  ...infrastructure that is highly performant, massively scalable, cost-...  ..., and are looking for engineers passionate about cloud-native...  ...completed a PhD degree in Software Development, Computer... 
    Performance
    Temporary work
    Local area

    ByteDance

    San Jose, CA
    2 days ago
  • $193.3k - $261.5k

     ...AWS) builds AWS Neuron, the software development kit used to...  ...enabling unparalleled ML inference and training performance. The Inference Enablement...  ...hardware-software boundary, our engineers build systematic...  ...innovate new methods and create high-performance kernels for ML... 
    Performance
    Work experience placement
    Internship
    Local area
    Flexible hours

    Amazon

    Cupertino, CA
    5 days ago
  •  ...deliver industry‑leading training and inference speeds and empowers machine...  ...to deploy 750 megawatts of scale, transforming key workloads with ultra high‑speed inference. About The Role We are hiring a Senior Performance Engineer to join our Product team. You are an... 
    Transformer
    Performance
    Contract work
    Shift work

    Cerebras

    Sunnyvale, CA
    2 days ago
  • $184k - $287.5k

     ...We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models...  ...architect and implement high-performance inference stacks, optimize...  ...intelligence (AI) will fundamentally transform how people live and work.... 
    Performance

    NVIDIA

    Santa Clara, CA
    3 days ago
  •  ...building the world’s first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower...  ...and staffed by leading engineers, Etched is redefining...  ...model performance Implement high-performance software components for the Model... 
    Transformer
    Performance
    Internship
    Work at office
    Relocation

    Etched

    San Jose, CA
    3 days ago
  •  ...AI Compiler Engineer Locations available...  ...focus on hardware-software co-design, you'll...  ...to translate high-level AI models into...  ...Pioneer new graph transformations, lowering,...  ...Diagnose and crush performance bottlenecks with...  ...understanding of AI inference workloads (CNNs,... 
    Transformer
    Performance

    NXP Semiconductors

    San Jose, CA
    1 day ago
  •  ...technology firm based in San Jose is seeking an experienced RTL Engineer who will focus on design verification of cutting-edge AI chips....  ...least 5 years of experience in RTL development and be proficient in high-speed digital logic. This role requires a quick learner willing... 
    Performance
    Relocation package

    Etched

    San Jose, CA
    1 day ago
