Performance Engineer

RadixArk

RadixArk is hiring a Performance Engineer in Palo Alto, CA — someone who can push LLM inference and training systems to the limit across real production workloads.

You'll work on the performance-critical path of SGLang, Miles, and the RadixArk infrastructure stack: latency, throughput, GPU utilization, memory efficiency, scheduling, batching, kernel behavior, distributed execution, and cost-per-token. This is not a generic benchmarking role. You'll be working on the systems that determine whether frontier-scale AI workloads are actually usable, affordable, and reliable in production.

Our customers care about real numbers: P99 latency, TTFT, tokens/sec/GPU, throughput under long-context workloads, cost-per-million tokens, RL rollout efficiency, and training-inference consistency. You'll help us measure, debug, and improve these systems across NVIDIA, AMD, Google TPU, and cloud partner environments.

This role is for someone who loves performance debugging, understands that small systems details can create massive product impact, and wants to work at the frontier of AI infrastructure.

What You'll Do

Analyze and improve performance across SGLang, Miles, and RadixArk production deployments
Benchmark LLM inference and training workloads across GPUs, TPUs, and cloud environments
Optimize latency, throughput, memory usage, batching, scheduling, routing, and GPU utilization
Investigate performance regressions in real customer environments
Work closely with kernel, runtime, distributed systems, and product engineers
Build internal tooling for profiling, tracing, benchmarking, and regression detection
Translate customer workload characteristics into concrete performance tuning strategies
Help define performance metrics that matter commercially, including cost-per-token and serving efficiency
Partner with customers and cloud partners on deep technical evaluations
Contribute performance insights back to open-source SGLang and Miles

What We're Looking For

Strong systems engineering background, especially in performance-critical software
Experience with GPU systems, distributed systems, inference serving, ML runtimes, or high-performance computing
Familiarity with profiling tools, performance debugging, tracing, and benchmark methodology
Comfort working with Python and C++
Experience with CUDA, Triton, Pallas, ROCm, XLA, or kernel-level optimization is a strong plus
Understanding of LLM inference concepts such as batching, KV cache, prefill/decode, speculative decoding, MoE, long context, and P99 latency
Ability to debug messy real-world performance issues across software, hardware, and infrastructure layers
Strong communication skills — you should be able to explain performance tradeoffs to both engineers and customers
Prior experience with production AI infrastructure, cloud GPU environments, or open-source ML systems is a plus

About RadixArk

RadixArk is an infrastructure-first company built by engineers who've shipped production AI systems, created SGLang, and developed Miles, our large-scale RL framework.

We're on a mission to democratize frontier-level AI infrastructure by building world-class open systems for inference and training.

Our team has optimized kernels serving billions of tokens daily, designed distributed training systems coordinating 10,000+ GPUs, and contributed to infrastructure that powers leading AI companies and research labs.

We're backed by well-known infrastructure investors and partner with NVIDIA, Google, AWS, and frontier AI labs.

Join us in building infrastructure that gives real leverage back to the AI community.

Compensation

We offer competitive compensation with meaningful equity, comprehensive health benefits, and flexible work arrangements. Compensation is determined by location, level, and experience.

Equal Opportunity

RadixArk is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Vacancy posted 14 hours ago
