AI Performance Optimization Engineer

$100k - $150k

Bright Vision Technologies

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, we're looking for a skilled AI Performance Optimization Engineer to join our dynamic team and contribute to our mission of transforming business processes through technology. This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Location: 100% Remote (Continental United States)

Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)

Salary: $100K - $150K

Experience: 6+ years

Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.

Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)

Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap

Compensation: Competitive base salary commensurate with experience, plus benefits.

Employment Terms & Visa Policy: This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies. The role is part of Bright Vision Technologies' in-house Statement of Work (SOW) engagement: Bright Vision Technologies is the client, end customer, and employer, and no third-party client, vendor, or implementation partner is involved. We do not accept C2C, 1099, or third-party arrangements for this role, and we do not work with third-party brokers; all of our roles are W2. Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables. No new H1B sponsorship is available for this role, but candidates who currently hold a valid H1B visa and require a transfer are welcome to apply, and we will support H1B transfers for qualified candidates. A technical coding assessment is mandatory for every role. Please apply only if you are confident in your technical abilities and hands-on experience.

Job Summary

We are seeking an AI Performance Optimization Engineer to focus on extracting maximum throughput, minimizing latency, and reducing cost across training and inference workloads for large neural network systems. The role spans the full stack from low-level kernel optimization to distributed system tuning, requiring deep understanding of GPU architecture, model parallelism, memory management, and compiler-level optimization. The ideal candidate has demonstrated impact on production AI workloads, with strong instrumentation and measurement discipline that enables rigorous, data-driven optimization decisions.

In this role you will work closely with cross-functional partners (product, design, engineering, operations, and business stakeholders) to translate ambiguous requirements into well-engineered solutions, and will be expected to raise the bar through code review, design review, and mentorship of more junior engineers. The successful candidate brings strong engineering discipline, a clear communication style, and a track record of shipping meaningful work that holds up well in production.

Key Responsibilities

Profile and optimize end-to-end AI training and inference pipelines for throughput, latency, and cost.
Identify and eliminate bottlenecks across data loading, model compute, communication, and memory.
Implement and tune quantization, sparsity, and pruning strategies to reduce model footprint and accelerate inference.
Optimize distributed training using tensor parallelism, pipeline parallelism, FSDP, and ZeRO-style sharding.
Tune attention implementations using FlashAttention, paged attention, and related techniques.
Implement KV cache optimization, continuous batching, and speculative decoding for LLM serving.
Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working with the broader ML framework community to land improvements that translate into measurable end-to-end performance gains.
Optimize data pipelines, sharding strategies, and storage access patterns for high-throughput training.
Build and maintain rigorous benchmark suites and regression frameworks across workloads.
Collaborate with ML and platform engineering teams to embed best practices in standard pipelines.
Drive cost-efficiency improvements through model architecture, hardware selection, and scheduling strategies.
Evaluate new hardware and software offerings, and advise on adoption.
Document performance tuning playbooks and share findings broadly across engineering teams.
Stay current with AI systems research and translate advances into production improvements.

Required Qualifications

Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field.
Six or more years of experience in performance engineering, ML systems, or HPC.
Strong proficiency in Python and C++.
Hands-on experience optimizing deep learning workloads on modern GPUs.
Deep understanding of distributed training and inference techniques.
Experience with profiling tools across CPU, GPU, and distributed systems.
Familiarity with model compression techniques and their accuracy implications.
Strong grasp of memory hierarchies, communication primitives, and parallelism strategies.
Excellent measurement, debugging, and analytical reasoning skills.
Strong communication and collaboration skills.

Preferred Qualifications

Experience optimizing LLM inference at production scale.
Contributions to vLLM, TensorRT-LLM, DeepSpeed, or similar projects.
Familiarity with custom kernel authoring in Triton or CUTLASS.
Experience with FinOps for AI workloads.
Publications or talks on AI systems performance.

Vacancy posted 1 day ago
