Software Engineer - Performance Profiling
$2,000 per month
ETCHED LLC
About Etched
Etched is building the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest growing industry in history.
Job Summary
Join our team as a Software Engineer - Performance Tools and take the lead in illuminating the performance landscape of our cutting-edge ML accelerator. We are seeking a highly skilled engineer to design and develop a sophisticated performance analysis tool, tailored specifically for Sohu. You will be instrumental in creating the essential tooling that enables our ML engineers and customers to understand workload behavior, identify performance bottlenecks, and unlock the full potential of Sohu, accelerating the most demanding ML applications in the world. This is a unique opportunity to shape performance analysis for novel hardware from the ground up.
Key responsibilities
- Lead the design and architecture of a comprehensive performance analysis suite, including data collection mechanisms, data processing pipelines, analysis engines, and user interfaces (CLI and/or GUI).
- Develop robust methods to capture performance data directly from our custom ML accelerator hardware (e.g., hardware performance counters, execution unit status, memory access patterns) via driver interfaces or other mechanisms.
- Implement tracing for host-side API calls (runtime libraries, driver interactions) and system-level events (CPU activity, PCIe traffic, memory usage, network contention) related to Sohu workloads.
- Design and implement techniques to accurately correlate performance events across the host CPU, device driver, PCIe bus, multiple accelerators, and multiple hosts, ensuring precise time synchronization.
- Build analysis modules to automatically interpret collected trace and counter data, identifying key performance limiters (e.g., compute-bound, memory bandwidth-bound, latency-bound, PCIe-bound, specific hardware bottlenecks).
- Develop intuitive visualizations (timelines, dependency graphs, resource utilization charts, statistical summaries) to clearly communicate performance characteristics and bottlenecks to users.
- Work closely with hardware architects, firmware engineers, driver developers, compiler engineers, and ML application engineers to understand their needs, define tool requirements, and provide expert guidance on performance analysis and optimization using the tool.
- Architect and implement the core data collection framework for hardware performance counters on a custom PCIe-based accelerator.
- Develop a kernel driver module or user-space service for low-overhead tracing of accelerator activity.
- Design and build a correlated timeline view visualizing CPU API calls, driver submissions, PCIe transfers, and accelerator execution units.
- Create an analysis pass to detect and quantify memory access inefficiencies or PCIe bandwidth saturation while transacting on a PCIe-attached accelerator.
Qualifications
- Strong proficiency in C++ or Rust.
- Proficiency in Python is a plus.
- Deep understanding of computer architecture (CPU, GPU, accelerators), memory hierarchies (caches, DRAM), and interconnects (especially PCIe).
- Proven experience in low-level performance analysis, profiling, and bottleneck identification on complex hardware systems (GPUs, CPUs, FPGAs, or custom ASICs).
- Experience with performance analysis tools (e.g., NVIDIA Nsight, AMD uProf, Intel VTune, perf, Tracy, ETW).
- Experience working close to hardware, potentially reading performance counters or interacting directly with device drivers.
- Direct experience developing performance analysis or debugging tools.
- Experience with ML accelerator architectures (GPUs, TPUs, etc.).
- Experience with kernel-mode driver development (Linux or Windows).
- Understanding of compiler internals, code generation, and optimization.
- In-depth knowledge of the PCIe protocol and analysis tools (PCIe analyzers).
- Experience with multi-chip or multi-host accelerator systems (e.g., TPU pods or NVIDIA DGX clusters).
- Experience with firmware or embedded systems development.
- Experience with hardware description languages (Verilog, VHDL) or hardware verification.
Benefits
- Medical, dental, and vision packages with generous premium coverage
- $500 per month credit for waiving medical benefits
- Housing subsidy of $2k per month for those living within walking distance of the office
- Relocation support for those moving to San Jose (Santana Row)
- Various wellness benefits covering fitness, mental health, and more
- Daily lunch + dinner in our office
- Unlimited compute budget subject to ROI justification
Vacancy posted 2 days ago

