AI Performance Optimization Engineer

$100k - $150k

Bright Vision Technologies

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, we're looking for a skilled AI Performance Optimization Engineer to join our dynamic team and contribute to our mission of transforming business processes through technology. This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Location: 100% Remote (Continental United States)

Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)

Salary: $100K - $150K

Experience: 6+ years

Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.

Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)

Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap

Compensation: Competitive base salary commensurate with experience, plus benefits.

Employment Terms & Visa Policy: This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies, and it is part of the company's in-house Statement of Work (SOW) engagement. Bright Vision Technologies is the client, end customer, and employer for this position; no third-party client, vendor, or implementation partner is involved. We do not engage in C2C, 1099, or third-party arrangements for this role, and we do not accept third-party brokering: all our roles are W2. Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables. No new H1B sponsorship is available for this role, but candidates currently on a valid H1B visa who require a transfer are welcome to apply, and we will support H1B transfers for qualified candidates. A technical coding assessment is mandatory for every role. Please apply only if you are confident in your technical abilities and hands-on experience.

Job Summary

We are seeking an AI Performance Optimization Engineer to focus on extracting maximum throughput, minimizing latency, and reducing cost across training and inference workloads for large neural network systems. The role spans the full stack from low-level kernel optimization to distributed system tuning, requiring deep understanding of GPU architecture, model parallelism, memory management, and compiler-level optimization. The ideal candidate has demonstrated impact on production AI workloads, with strong instrumentation and measurement discipline that enables rigorous, data-driven optimization decisions. In this role you will work closely with cross-functional partners — product, design, engineering, operations, and business stakeholders — to translate ambiguous requirements into well-engineered solutions, and will be expected to raise the bar through code review, design review, and mentorship of more junior engineers. The successful candidate brings strong engineering discipline, a clear communication style, and a track record of shipping meaningful work that holds up well in production.

Key Responsibilities
  • Profile and optimize end-to-end AI training and inference pipelines for throughput, latency, and cost.
  • Identify and eliminate bottlenecks across data loading, model compute, communication, and memory.
  • Implement and tune quantization, sparsity, and pruning strategies to reduce model footprint and accelerate inference.
  • Optimize distributed training using tensor parallelism, pipeline parallelism, FSDP, and ZeRO-style sharding.
  • Tune attention implementations using FlashAttention, paged attention, and related techniques.
  • Implement KV cache optimization, continuous batching, and speculative decoding for LLM serving.
  • Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working with the broader ML framework community to land improvements that translate into measurable end-to-end performance gains.
  • Optimize data pipelines, sharding strategies, and storage access patterns for high-throughput training.
  • Build and maintain rigorous benchmark suites and regression frameworks across workloads.
  • Collaborate with ML and platform engineering teams to embed best practices in standard pipelines.
  • Drive cost-efficiency improvements through model architecture, hardware selection, and scheduling strategies.
  • Evaluate new hardware and software offerings, and advise on adoption.
  • Document performance tuning playbooks and share findings broadly across engineering teams.
  • Stay current with AI systems research and translate advances into production improvements.

Required Qualifications
  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field.
  • Six or more years of experience in performance engineering, ML systems, or HPC.
  • Strong proficiency in Python and C++.
  • Hands-on experience optimizing deep learning workloads on modern GPUs.
  • Deep understanding of distributed training and inference techniques.
  • Experience with profiling tools across CPU, GPU, and distributed systems.
  • Familiarity with model compression techniques and their accuracy implications.
  • Strong grasp of memory hierarchies, communication primitives, and parallelism strategies.
  • Excellent measurement, debugging, and analytical reasoning skills.
  • Strong communication and collaboration skills.

Preferred Qualifications
  • Experience optimizing LLM inference at production scale.
  • Contributions to vLLM, TensorRT-LLM, DeepSpeed, or similar projects.
  • Familiarity with custom kernel authoring in Triton or CUTLASS.
  • Experience with FinOps for AI workloads.
  • Publications or talks on AI systems performance.
Vacancy posted 1 day ago
