AI Performance Optimization Engineer
$100k - $150k
Bright Vision Technologies
Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, we're looking for a skilled AI Performance Optimization Engineer to join our dynamic team and contribute to our mission of transforming business processes through technology. This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.
Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)
Salary: $100K - $150K
Experience: 6+ years
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits.
Employment Terms & Visa Policy: This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies, and it is part of the company's in-house Statement of Work (SOW) engagement. Bright Vision Technologies is the client, end customer, and employer for this position; no third-party client, vendor, or implementation partner is involved. We do not engage in C2C, 1099, or third-party arrangements for this role, and we do not accept third-party brokering: all our roles are W2. Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables. No new H1B sponsorship is available for this role. However, candidates who currently hold a valid H1B visa and require a transfer are welcome to apply, and we will support H1B transfers for qualified candidates. A technical coding assessment is mandatory for every role. Please apply only if you are confident in your technical abilities and hands-on experience.
Job Summary
We are seeking an AI Performance Optimization Engineer to focus on extracting maximum throughput, minimizing latency, and reducing cost across training and inference workloads for large neural network systems. The role spans the full stack from low-level kernel optimization to distributed system tuning, requiring deep understanding of GPU architecture, model parallelism, memory management, and compiler-level optimization. The ideal candidate has demonstrated impact on production AI workloads, with strong instrumentation and measurement discipline that enables rigorous, data-driven optimization decisions. In this role you will work closely with cross-functional partners — product, design, engineering, operations, and business stakeholders — to translate ambiguous requirements into well-engineered solutions, and will be expected to raise the bar through code review, design review, and mentorship of more junior engineers. The successful candidate brings strong engineering discipline, a clear communication style, and a track record of shipping meaningful work that holds up well in production.
Key Responsibilities
- Profile and optimize end-to-end AI training and inference pipelines for throughput, latency, and cost.
- Identify and eliminate bottlenecks across data loading, model compute, communication, and memory.
- Implement and tune quantization, sparsity, and pruning strategies to reduce model footprint and accelerate inference.
- Optimize distributed training using tensor parallelism, pipeline parallelism, FSDP, and ZeRO-style sharding.
- Tune attention implementations using FlashAttention, paged attention, and related techniques.
- Implement KV cache optimization, continuous batching, and speculative decoding for LLM serving.
- Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working with the broader ML framework community to land improvements that translate into measurable end-to-end performance gains.
- Optimize data pipelines, sharding strategies, and storage access patterns for high-throughput training.
- Build and maintain rigorous benchmark suites and regression frameworks across workloads.
- Collaborate with ML and platform engineering teams to embed best practices in standard pipelines.
- Drive cost-efficiency improvements through model architecture, hardware selection, and scheduling strategies.
- Evaluate new hardware and software offerings, and advise on adoption.
- Document performance tuning playbooks and share findings broadly across engineering teams.
- Stay current with AI systems research and translate advances into production improvements.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field.
- Six or more years of experience in performance engineering, ML systems, or HPC.
- Strong proficiency in Python and C++.
- Hands-on experience optimizing deep learning workloads on modern GPUs.
- Deep understanding of distributed training and inference techniques.
- Experience with profiling tools across CPU, GPU, and distributed systems.
- Familiarity with model compression techniques and their accuracy implications.
- Strong grasp of memory hierarchies, communication primitives, and parallelism strategies.
- Excellent measurement, debugging, and analytical reasoning skills.
- Strong communication and collaboration skills.
Preferred Qualifications
- Experience optimizing LLM inference at production scale.
- Contributions to vLLM, TensorRT-LLM, DeepSpeed, or similar projects.
- Familiarity with custom kernel authoring in Triton or CUTLASS.
- Experience with FinOps for AI workloads.
- Publications or talks on AI systems performance.