Infrastructure Software Engineer

Etched

Job Description

About Etched

Etched is building the world’s first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest growing industry in history.

Job Summary

Building cutting-edge model-specific ASICs requires crafting custom infrastructure and toolchains to support ultra-fast, reliable, and scalable development across the stack - from simulation to silicon. We build this infrastructure as software, engineering it with the same best practices, rigor, design discipline, quality standards, and testing that we apply to our ASIC, software, and platform.

You will lead the development and adoption of next-generation infrastructure tooling, enabling Etched ASIC, Software, and Platform engineers to iterate faster, build more reliably, and push the boundaries of AI performance. This includes building and scaling our hybrid high-performance compute (HPC) cluster, optimized for massively parallel CI, EDA workflows, Emulation, and hardware-aware job execution.

You’ll also architect and implement a state-of-the-art observability stack with LLM integration and a strong emphasis on streaming health and performance telemetry, log aggregation, distributed tracing, insight generation, synthetic testing, and smart alerting - across CI pipelines, simulation clusters, and service endpoints.

This role demands a strong software engineering mindset, quality instincts, and deep understanding of systems. It’s not just about writing scripts - it’s about writing code that builds and manages infrastructure with precision, repeatability, and intent.

Key responsibilities

  • Design and build the orchestration layers that drive our hybrid high-performance clusters—enabling simulation, synthesis, and continuous integration of AI ASICs at unprecedented scale.

  • Develop and maintain a fully programmable infrastructure control plane to ensure reproducibility, auditability, and rapid iteration across the entire stack.

  • Create tools and abstractions that empower engineers to harness massive parallelism without worrying about the underlying complexity.

  • Prototype and execute workload orchestration and migration strategies between on-premise and cloud environments, balancing performance, storage availability and replication, uptime, and cost across heterogeneous hardware and compute backends.

  • Implement real-time telemetry and tracing systems that surface insights from millions of metrics, enabling proactive debugging and system optimization.

  • Build a full observability stack that includes dashboards, alerting, automated responses, and a synthetic testing framework that proactively tests infrastructure performance and reliability across application and data flows, so we catch issues before they impact development and productivity workflows.

Representative projects

  • Design and deploy a fully automated, scalable hybrid HPC cluster, combining bare-metal servers and switches with cloud instances, provisioned through MaaS and orchestrated via SLURM and Kubernetes, optimized for mixed EDA workloads and parallel CI pipelines.

  • Develop a real-time observability system for ASIC toolchain jobs and distributed builds, integrating Prometheus, Grafana, and VictoriaMetrics with streaming telemetry, tracing, and alerting to detect performance regressions before they hit silicon.

  • Architect and implement a programmable infrastructure-as-code control plane, using Terraform, Ansible, and Puppet, to version, audit, and redeploy every layer of Etched's development stack with deterministic reproducibility.

  • Create a zero-downtime interactive development environment that provisions and connects Jupyter and VS Code sessions to GPUs and high-memory nodes via a secure zero-trust network, abstracting away cluster state and machine failures.

  • Prototype and evaluate dynamic workload migration strategies between on-premise and cloud environments to optimize for latency, reliability, and cost across simulation and synthesis pipelines.

  • Design a synthetic testing and fault injection framework to validate the behavior of infrastructure under high load, degraded hardware, and intermittent network partitions - before these conditions occur in production.

You may be a good fit if you

  • Are a systems-minded software engineer who loves building foundational platforms, working close to the metal and cloud, and solving high-leverage problems at scale.

  • Are a deeply technical engineer who treats infrastructure as a software problem - prioritizing clean abstractions, version control, small change lists, easy rollbacks, testing, and long-term maintainability over ad hoc configuration.

  • Have strong programming skills in languages such as Python, Go, Rust, and C++, and are comfortable building production-grade tooling.

  • Possess expert-level knowledge of Linux, virtualization, containerization, and CI/CD pipelines, with a deep understanding of how to debug, optimize, and scale complex systems.

  • Are familiar with Infrastructure as Code tools like OpenTofu, Ansible, or Puppet, and enjoy designing declarative, reproducible infrastructure systems.

  • Understand and use PromQL and other telemetry/query languages, have used LLMs to extract insight from real-time metrics, and know how to architect and tune observability stacks.

  • Have a track record of debugging and resolving difficult hardware-software integration problems across bare-metal systems, networks, and distributed workloads.

  • Can lead and mentor technical teams, guiding design decisions and helping others develop sound engineering instincts.

  • Have 8+ years of experience in infrastructure engineering, systems programming, or backend software development - ideally in environments where performance, scale, or hardware interaction mattered.

  • Are driven by curiosity, take initiative, and have an innate sense of ownership — you thrive in uncharted territory, design for edge cases, and love making systems more powerful, reliable, and elegant.

Strong candidates may also have experience with

  • Familiarity with the Bazel build system.

  • Deep understanding of ASIC development flows, especially those involving Synopsys, Cadence, and Verilator, including how EDA tools interact with infrastructure for simulation, synthesis, and verification.

  • Hands-on experience architecting systems with AWS, GCP, or Azure, including hybrid on-prem/cloud deployments, workload migration strategies, and cloud-native orchestration tooling.

  • Experience monitoring, provisioning, and debugging bare-metal servers, network hardware, and high-performance storage systems in rack-scale environments.

  • Comfort with profiling and optimizing compute environments for single-threaded latency, memory-bound workloads, or I/O throughput, especially in the context of simulation or CI performance.

  • Proficiency building or operating telemetry systems at scale using Prometheus, Grafana, Loki, VictoriaMetrics, and tools for distributed tracing, log aggregation, and real-time alerting across heterogeneous channels (SMS, email, push alerts, etc.).

Benefits

  • Medical, dental, and vision packages with generous premium coverage

    • $500 per month credit for waiving medical benefits

  • Housing subsidy of $2k per month for those living within walking distance of the office

  • Relocation support for those moving to San Jose (Santana Row)

  • Various wellness benefits covering fitness, mental health, and more

  • Daily lunch + dinner in our office

  • Unlimited compute budget subject to ROI justification

How we’re different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in San Jose (Santana Row), and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Compensation Range: $150K - $250K

Vacancy posted 28 days ago
