Software Engineer - Performance Profiling

$2,000 per month

Delos™

ML Performance Characterization Engineer

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents.

Job Summary

Join our team as a Senior ML Performance Characterization Engineer and take the lead in illuminating the performance landscape of our cutting‑edge ML accelerator. We are seeking a highly skilled engineer to design and develop a sophisticated performance analysis tool, tailored specifically for Sohu. You will be instrumental in creating the essential tooling that enables our ML engineers and customers to understand workload behavior, identify performance bottlenecks, and unlock the full potential of Sohu in accelerating the most demanding ML applications in the world. This is a unique opportunity to shape performance analysis for novel hardware from the ground up.

Key responsibilities

  • Tool Architecture & Design: Lead the design and architecture of a comprehensive performance analysis suite, including data collection mechanisms, data processing pipelines, analysis engines, and user interfaces (CLI and/or GUI).
  • Low‑Level Data Collection: Develop robust methods to capture performance data directly from our custom ML accelerator hardware (e.g., hardware performance counters, execution unit status, memory access patterns) via driver interfaces or other mechanisms.
  • Host & System Tracing: Implement tracing for host‑side API calls (runtime libraries, driver interactions) and system‑level events (CPU activity, PCIe traffic, memory usage, network contention) related to Sohu workloads.
  • Data Correlation & Synchronization: Design and implement techniques to accurately correlate performance events across the host CPU, device driver, PCIe bus, and multiple accelerators, ensuring precise time synchronization.
  • Performance Analysis Engine: Build analysis modules to automatically interpret collected trace and counter data, identifying key performance limiters (e.g., compute‑bound, memory bandwidth‑bound, latency‑bound, PCIe‑bound, specific hardware bottlenecks).
  • Visualization & Reporting: Develop intuitive visualizations (timelines, dependency graphs, resource utilization charts, statistical summaries) to clearly communicate performance characteristics and bottlenecks to users.
  • Collaboration & Support: Work closely with hardware architects, firmware engineers, driver developers, compiler engineers, and ML application engineers to understand their needs, define tool requirements, and provide expert guidance on performance analysis and optimization using the tool.

Representative projects

  • Architect and implement the core data collection framework for hardware performance counters on a custom PCIe-based accelerator.
  • Develop a kernel driver module or user‑space service for low‑overhead tracing of accelerator activity.
  • Design and build a correlated timeline view visualizing CPU API calls, driver submissions, PCIe transfers, and accelerator execution units.
  • Create an analysis pass to detect and quantify memory access inefficiencies or PCIe bandwidth saturation while transacting on a PCIe‑attached accelerator.

You may be a good fit if you have

  • Strong proficiency in C/C++ and Python.
  • Deep understanding of computer architecture (CPU, GPU, accelerators), memory hierarchies (caches, DRAM), and interconnects (especially PCIe).
  • Proven experience in low‑level performance analysis, profiling, and bottleneck identification on complex hardware systems (GPUs, CPUs, FPGAs, or custom ASICs).
  • Experience with performance analysis tools (e.g., NVIDIA Nsight, AMD uProf, Intel VTune, perf, Tracy, ETW).
  • Solid understanding of operating system internals (Linux preferred), including scheduling, memory management, and driver interaction.
  • Experience working close to hardware, potentially reading performance counters or interacting directly with device drivers.

Strong candidates may also have

  • Direct experience developing performance analysis or debugging tools.
  • Experience with ML accelerator architectures (GPUs, TPUs, etc.).
  • Experience with kernel‑mode driver development (Linux or Windows).
  • Understanding of compiler internals, code generation, and optimization.
  • In‑depth knowledge of the PCIe protocol and analysis tools (PCIe analyzers).
  • Experience with firmware or embedded systems development.
  • Experience with hardware description languages (Verilog, VHDL) or hardware verification.

Benefits

  • Full medical, dental, and vision packages, with generous premium coverage
  • Housing subsidy of $2,000/month for those living within walking distance of the office
  • Daily lunch and dinner in our office
  • Relocation support for those moving to West San Jose

Compensation Range

$150,000 - $275,000

How we’re different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model‑specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single‑model ASICs.

We are a fully in‑person team in West San Jose, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Vacancy posted 1 day ago
