Software Engineer - Performance Profiling
ML Performance Characterization Engineer

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents.

Job Summary

Join our team as a Senior ML Performance Characterization Engineer and take the lead in illuminating the performance landscape of our cutting-edge ML accelerator. We are seeking a highly skilled engineer to design and develop a sophisticated performance analysis tool, tailored specifically for Sohu. You will be instrumental in creating the essential tooling that enables our ML engineers and customers to understand workload behavior, identify performance bottlenecks, and unlock the full potential of Sohu, accelerating the most demanding ML applications in the world. This is a unique opportunity to shape performance analysis for novel hardware from the ground up.

Key responsibilities

- Tool Architecture & Design: Lead the design and architecture of a comprehensive performance analysis suite, including data collection mechanisms, data processing pipelines, analysis engines, and user interfaces (CLI and/or GUI).
- Low-Level Data Collection: Develop robust methods to capture performance data directly from our custom ML accelerator hardware (e.g., hardware performance counters, execution unit status, memory access patterns) via driver interfaces or other mechanisms.
- Host & System Tracing: Implement tracing for host-side API calls (runtime libraries, driver interactions) and system-level events (CPU activity, PCIe traffic, memory usage, network contention) related to Sohu workloads.
- Data Correlation & Synchronization: Design and implement techniques to accurately correlate performance events across the host CPU, device driver, PCIe bus, and multiple accelerators, ensuring precise time synchronization.
- Performance Analysis Engine: Build analysis modules to automatically interpret collected trace and counter data, identifying key performance limiters (e.g., compute-bound, memory bandwidth-bound, latency-bound, PCIe-bound, specific hardware bottlenecks).
- Visualization & Reporting: Develop intuitive visualizations (timelines, dependency graphs, resource utilization charts, statistical summaries) to clearly communicate performance characteristics and bottlenecks to users.
- Collaboration & Support: Work closely with hardware architects, firmware engineers, driver developers, compiler engineers, and ML application engineers to understand their needs, define tool requirements, and provide expert guidance on performance analysis and optimization using the tool.

Representative projects

- Architect and implement the core data collection framework for hardware performance counters on a custom PCIe-based accelerator.
- Develop a kernel driver module or user-space service for low-overhead tracing of accelerator activity.
- Design and build a correlated timeline view visualizing CPU API calls, driver submissions, PCIe transfers, and accelerator execution units.
- Create an analysis pass to detect and quantify memory access inefficiencies or PCIe bandwidth saturation while transacting on a PCIe-attached accelerator.

You may be a good fit if you have

- Strong proficiency in C/C++ and Python.
- Deep understanding of computer architecture (CPU, GPU, accelerators), memory hierarchies (caches, DRAM), and interconnects (especially PCIe).
- Proven experience in low-level performance analysis, profiling, and bottleneck identification on complex hardware systems (GPUs, CPUs, FPGAs, or custom ASICs).
- Experience with performance analysis tools (e.g., NVIDIA Nsight, AMD uProf, Intel VTune, perf, Tracy, ETW).
- Solid understanding of operating system internals (Linux preferred), including scheduling, memory management, and driver interaction.
- Experience working close to hardware, potentially reading performance counters or interacting directly with device drivers.

Strong candidates may also have

- Direct experience developing performance analysis or debugging tools.
- Experience with ML accelerator architectures (GPUs, TPUs, etc.).
- Experience with kernel-mode driver development (Linux or Windows).
- Understanding of compiler internals, code generation, and optimization.
- In-depth knowledge of the PCIe protocol and analysis tools (PCIe analyzers).
- Experience with firmware or embedded systems development.
- Experience with hardware description languages (Verilog, VHDL) or hardware verification.

Benefits

- Full medical, dental, and vision packages, with generous premium coverage
- Housing subsidy of $2,000/month for those living within walking distance of the office
- Daily lunch and dinner in our office
- Relocation support for those moving to West San Jose

Compensation Range

$150,000 - $275,000

How we're different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in West San Jose, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

